

Site Reliability Engineer [Multiple Positions Available]

DESCRIPTION:

Duties: Collaborate with business partners to understand their systems and processes.

Identify toil and gaps in the existing Telemetry solutions.

Propose detailed automation solutions to fill the gaps and to reduce the toil.

Drive the development and delivery of SRE solutions: develop and maintain advanced telemetry solutions, design visualization tools for telemetry and observability, and engineer machine learning solutions to automate telemetry systems.

Stay informed about the latest trends in Artificial Intelligence, Machine Learning, Large Language Models, and Generative Artificial Intelligence, and operate with a continuous improvement mindset.

Collaborate with stakeholders and vendors to finalize the SLOs (Service Level Objectives) and SLIs (Service Level Indicators) for any given product.

QUALIFICATIONS:

Minimum education and experience required: Master's degree in Computer Science, Computer Engineering, Information Technology, or related field of study plus 3 years of experience in the job offered or as Site Reliability Engineer, Java Developer, Software Engineer, Support Engineer, Programmer Analyst, Application Developer, or related occupation.

The employer will alternatively accept a Bachelor's degree in Computer Science, Computer Engineering, Information Technology, or related field of study plus 5 years of experience in the job offered or as Site Reliability Engineer, Java Developer, Software Engineer, Support Engineer, Programmer Analyst, Application Developer, or related occupation.

Skills Required: This position requires three (3) years of experience with the following: utilizing DevOps technologies on Linux and Windows platforms to build CI/CD (Continuous Integration and Continuous Delivery) pipelines; VOIP (Voice Over Internet Protocol) technologies and associated tools, including Network Analyzers, to implement Telemetry and Observability solutions; utilizing Java or Python and shell scripting to develop applications capable of handling high data loads using advanced programming techniques including multithreading, asynchronous processing, and optimization for performance and scalability; working on a cloud platform (AWS) to migrate on-premise solutions to the cloud.

This position requires any amount of experience with the following: utilizing DevOps and SRE (Site Reliability Engineering) operations on AWS Cloud Technologies to achieve enhanced system reliability, continuous integration, delivery, and efficient infrastructure management; developing and optimizing microservices using Lambda Functions with Micronaut and integrating NoSQL databases to achieve scalable, efficient, and resilient application architectures that enhance data accessibility and system performance; VOIP analysis (signaling-level messages) using Wireshark to troubleshoot network issues; using Containers including Docker and Kubernetes for deployments of Telemetry solutions; developing observability solutions with Grafana d...



