Medicare Technology Operations - Observability & Alerting Lead Analyst

Job Description

Overview

The Observability & Alerting Lead Analyst is responsible for ensuring the reliability, availability, and performance of applications within the Medicare Technology Operations domain and the systems they interact with.

They work closely with the development and operations teams to build and maintain scalable and robust monitoring capabilities and dashboards that support the company's business goals.

They will be responsible for monitoring, troubleshooting, and resolving any issues that arise with these tools, as well as implementing automation and improvement initiatives to optimize system performance.

Responsibilities

The Observability & Alerting Lead Analyst will play a critical role in driving guiding principles across the organization while partnering with development teams to improve services and their reliability.

They will be responsible for supporting and augmenting existing application monitoring solutions, gathering and analyzing metrics to assist in performance tuning and fault finding, and participating in system design consulting, platform management, and capacity planning where appropriate.

Job Duties


* Design and implement observability and alerting solutions across a broad array of technology platforms, including real-time and synthetic user monitoring of customer-facing applications, API health and availability, and microservice responsiveness.


* Collaborate with cross-functional teams to define and establish service level objectives (SLOs) and service level agreements (SLAs) for critical systems.


* Monitor systems and applications, proactively identifying and resolving any performance bottlenecks or availability issues.


* Develop and maintain monitoring tools, alerts, and dashboards to provide visibility into system health and performance.


* Create and maintain documentation for system architecture, configuration, and troubleshooting procedures.


* Assist with capacity planning and resource allocation to ensure optimal system performance and scalability.


* Collaborate with development teams to implement and deploy new features and enhancements, ensuring they meet reliability and performance standards.


* Stay up to date with industry best practices, new technologies, and emerging trends in observability engineering.

Qualifications


* Strong knowledge of Linux/Unix systems and command-line tools.


* Familiarity with cloud platforms like AWS or Azure.


* Understanding of networking principles and protocols (TCP/IP, HTTP, DNS, etc.).


* Knowledge of containerization technologies (Docker, Kubernetes) and orchestration tools.


* Experience with monitoring and logging tools such as Dynatrace, Splunk, Prometheus, or Grafana.


* Strong problem-solving and troubleshooting skills, with the ability to analyze and resolve complex technical issues.


* Excellent communi...



