Senior Lead Site Reliability Engineer

Elevate your engineering prowess to unprecedented levels by joining a team of exceptionally gifted professionals and position yourself among the top echelon in site reliability.

As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Corporate Technology, Compliance Technology team, you work with your fellow stakeholders to define non-functional requirements (NFRs) and availability targets for the services in your application and product lines.

You will ensure those NFRs are accounted for in your products' design and test phases, that your service level indicators are effectively measuring customer experience, and that service level objectives are defined with stakeholders and implemented in production.

Job Responsibilities



* Creates and delivers high-quality designs, roadmaps, and program charters, while designing and developing robust software solutions, CI/CD pipelines, and infrastructure automation to optimize system reliability, scalability, and performance


* Acts as a key resource and mentor for technologists, fostering a culture of site reliability, inclusion, and engineering excellence while guiding teams on best practices across cloud infrastructure, automation, and operational readiness


* Collaborates with stakeholders to design and implement observability, alerting, and reliability solutions, including SLOs/SLIs, monitoring frameworks, and incident response processes that ensure stable, scalable, and high-performing systems


* Uses enterprise-authorized AI capabilities within the work environment to accelerate reliability design and operational decision-making (e.g., incident/post-incident analysis and requirements traceability), validating outputs and handling operational data according to sensitivity and security requirements, while also leveraging modern tooling to optimize CI/CD and operational workflows


* Drives evolution, debugging, and performance optimization of critical systems by managing cloud-native infrastructure (AWS), container platforms (Docker/Kubernetes/EKS/ECS), and understanding application dependencies and system limitations


* Provides ongoing guidance, tools, and automated solutions including infrastructure as code (Terraform/CloudFormation/CDK), environment standardization, configuration management, patching, backups, and cost optimization strategies


* Makes significant contributions to JPMorganChase's SRE community while supporting release management, change control, on-call rotations, and continuous improvement through post-incident reviews and operational excellence practices


* Leads reuse-first adoption of AI-assisted reliability workflows across SDLC/toolchain practices (e.g., testing/validation automation and production readiness), ensuring traceability/auditability, resiliency, and security controls, while enforcing governance, security best practices (IAM, secrets management), and reliability-focused automation

Required qualificatio...



