Lead Site Reliability Engineer

Take on a critical role in shaping the future of a globally recognized firm, with direct and significant impact in an environment built for top achievers in site reliability.

As a Manager in Site Reliability Engineering at JPMorgan Chase within Technology Infrastructure Services, you will lead a team of talented engineers to drive innovation and modernize the world's most complex and mission-critical systems.

You will be responsible for managing a variety of production-related issues, ensuring the reliability and scalability of applications, and fostering a culture of continuous improvement and collaboration.

Job responsibilities


* Lead and manage a team of Site Reliability Engineers, providing guidance, mentorship, and support to drive team success.


* Develop and implement strategies to enhance the reliability, availability, and scalability of applications and platforms.


* Collaborate with cross-functional teams, including software engineers, product managers, and stakeholders, to align SRE practices with business objectives.


* Oversee the design and implementation of automated continuous integration and continuous delivery pipelines.


* Manage and resolve complex production-related issues, ensuring minimal impact on customers and business operations.


* Foster a culture of site reliability engineering best practices, promoting proactive issue resolution and continuous improvement.


* Communicate effectively with stakeholders, providing updates on system performance, incidents, and improvement initiatives.


* Identify and implement new technologies and solutions to enhance the SRE function and support business growth.

Required qualifications, capabilities, and skills


* 5+ years of experience in site reliability engineering or a related field, with a proven track record of managing and leading technical teams.


* Strong understanding of site reliability engineering principles and practices, with the ability to implement them within an organization.


* Excellent communication and interpersonal skills, with the ability to engage and influence stakeholders at all levels.


* Proficient in at least one programming language such as Python, Java/Spring Boot, or .NET.


* Experience with observability tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, and others.


* Familiarity with continuous integration, continuous delivery, and infrastructure-as-code tools like Jenkins, GitLab, or Terraform.


* Experience with container and container orchestration technologies such as ECS, Kubernetes, and Docker.


* Strong problem-solving skills, with the ability to manage and resolve complex production-related issues.


* Ability to identify and implement new technologies and solutions to support business objectives and drive innovation.


* Ability to expand and collaborate across different levels and stakeholder groups.

Preferred qualifications, capabilities, and skills


* Formal training o...



