

Lead Infrastructure Engineer - Infrastructure Monitoring

We have an exciting opportunity for you to collaborate with passionate professionals, solve complex problems, and grow your career in a supportive, innovative environment.

As a Lead Infrastructure Engineer at JPMorgan Chase within Corporate Technology's Enterprise Observability Platforms, you will help build and operate a strategic, market-leading Infrastructure Monitoring platform that strengthens critical service resilience and delivers trusted operational insights.

You will be a hands-on technical contributor on a high-performing agile team, building secure, stable, and scalable observability solutions: turning telemetry into actionable insights, modernizing event-to-incident workflows, and enabling automation and AIOps-driven reliability improvements aligned to the firm's business objectives.

Job responsibilities


* Engineer, operate, and continuously improve the firm's Infrastructure Monitoring platforms, ensuring availability, performance, scalability, and security.


* Build and run enterprise-grade Infrastructure Monitoring capabilities across Linux, Windows, and complex Network estates, including platform-level onboarding and lifecycle management.


* Design and implement platform services, integrations, and telemetry collection across metrics, logs, and events, including OpenTelemetry collection patterns where applicable.


* Develop and maintain standardized onboarding patterns (agents/collectors, configurations, dashboards, alert policies) to accelerate safe adoption at scale.


* Improve monitoring signal quality and usability through baselining, threshold strategy, noise reduction, enrichment, and topology/context alignment.


* Develop secure, high-quality automation and production code; review, debug, and improve code/configuration written by others.


* Automate platform operations and reduce toil through scripting and CI/CD-driven configuration management; implement infrastructure-as-code deployment patterns.


* Manage and maintain production health for the monitoring platform: lead triage, perform root cause analysis (RCA), and deliver preventative engineering and resilience improvements.


* Partner with infrastructure, application, and SRE teams to align platform capabilities to SLIs/SLOs, operational readiness, and continuous improvement goals.


* Contribute to a culture of diversity, opportunity, inclusion, and respect.

Required qualifications, capabilities, and skills


* Formal training or certification in infrastructure engineering concepts and 5+ years of applied experience.


* Proficiency with enterprise operating systems (Linux and/or Windows), including administration, troubleshooting, performance analysis, and operational best practices within regulated production environments.


* Proven hands-on experience delivering and operating enterprise-scale Infrastructure Monitoring solutions across Linux, Windows, and/or Network estates.


* Solid understanding and hands-on implementation of observabili...



