

Sr Lead Infrastructure Engineer - Infrastructure Monitoring

We have an opportunity to impact your career and provide an adventure where you can push the limits of what's possible.

As a Sr Lead Infrastructure Engineer - Infrastructure Monitoring at JPMorgan Chase within the Corporate Technology Enterprise Observability Platforms team, you will lead the modernization of infrastructure monitoring into a strategic, secure, scalable, and automation-enabled observability platform, strengthening firmwide resilience and delivering trusted operational insights.

You will be a hands-on technical contributor who drives adoption and partners across infrastructure, application, and SRE teams to improve telemetry collection and signal quality, modernize event-to-incident workflows, and enable AIOps-driven reliability improvements aligned to business objectives.

Job responsibilities


* Lead the modernization of the infrastructure monitoring platform, defining target-state architecture and roadmap while balancing near-term delivery with long-term resiliency, scalability, security, and usability goals


* Engineer, operate, and continuously improve enterprise monitoring platforms to meet availability, performance, scale, and security requirements


* Own platform design and architecture for telemetry collection and integration across metrics, logs, events, and traces, including OpenTelemetry patterns where applicable


* Drive large-scale enterprise onboarding across Linux, Windows, and complex network estates, including lifecycle management, versioning/upgrade strategies, and governance controls


* Standardize onboarding patterns (agents/collectors, configuration baselines, dashboards, alerting, metadata, and runbooks) to enable safe, repeatable adoption


* Improve signal quality and actionability through baselining, threshold strategy, noise reduction, enrichment, and topology/context alignment to reduce MTTR and operational overhead


* Develop and maintain production-grade automation, services, and configuration-as-code; establish engineering standards and conduct rigorous reviews for reliability, security, and maintainability


* Reduce operational toil through automation and CI/CD-driven configuration management, including infrastructure-as-code patterns (e.g., Terraform)


* Lead production health and operational excellence for the monitoring platform, including incident triage, root-cause analysis, and corrective/preventative actions


* Partner with infrastructure, application, and SRE teams to align platform capabilities to SLIs/SLOs, operational readiness, and continuous improvement objectives


* Advance AIOps capabilities (e.g., correlation, anomaly detection, guided remediation) through experimentation, proofs of concept, and governed rollouts, while mentoring junior engineers and fostering a strong engineering culture

Required qualifications, capabilities, and skills


* Formal training or certification on infrastructure engineering concepts and 5+ years applied ...



