Lead Software Engineer - Resiliency
Be an integral part of an agile team that's constantly pushing the envelope to enhance, build, and deliver top-notch technology products.
As a Lead Site Reliability Engineer at JPMorgan Chase within the Employee Compute Branch Team, you will play a pivotal role in designing, implementing, and overseeing automation for observability and notification across a diverse set of systems in a global Microsoft Windows environment.
You will lead by example, bringing hands-on expertise in PowerShell and C#, and infusing best practices into a team of highly experienced system engineers.
Your work will directly impact the reliability, scalability, and efficiency of our platforms, with a strong focus on cloud (Azure and AWS) integration.
Job Responsibilities
* Champion site reliability engineering culture and practices, exerting technical influence across the team.
* Lead the design and hands-on implementation of automated observability and notification solutions using PowerShell and C#.
* Drive initiatives to improve reliability and stability of applications and platforms through data-driven analytics and automation.
* Collaborate with team members to define and implement service level indicators, objectives, and error budgets.
* Architect and implement monitoring, alerting, and telemetry solutions using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.
* Act as the primary technical lead during major incidents, quickly identifying and resolving issues to minimize impact.
* Mentor and upskill system engineers, fostering a programming mindset and best practices in automation and reliability.
* Facilitate cross-team and cross-region collaboration, ensuring alignment and knowledge sharing.
* Document and share technical solutions and best practices within internal forums and communities of practice.
* Engage with stakeholders to understand business needs and translate them into technical solutions, with increasing responsibility over time.
* Break down complex problems into actionable work for the team, ensuring clear direction and accountability.
Required qualifications, capabilities, and skills
* Formal training or certification in Site Reliability Engineering concepts and 5+ years of applied experience.
* Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, and toil reduction, with proven ability to implement these practices.
* Expert-level fluency in PowerShell and C# in a Microsoft Windows environment.
* Hands-on experience with cloud platforms, specifically Azure and AWS.
* Demonstrated experience in automated software testing (unit, integration, end-to-end).
* Deep knowledge of software applications and technical processes, with emerging depth in one or more technical disciplines.
* Proficiency and experience in observability, including white and black box monitoring, SLO alerting, and telemetry colle...
- Rate: Not Specified
- Location: Columbus, US-OH
- Type: Permanent
- Industry: Finance
- Recruiter: JPMorgan Chase Bank, N.A.
- Contact: Not Specified
- Reference: 210688544
- Posted: 2025-12-02 07:53:45