
Lead Site Reliability Engineer

Take on a critical role in shaping the future of a globally recognized firm, with direct and significant impact in an environment built for top performers in site reliability.

As a Lead Site Reliability Engineer at JPMorgan Chase within the Corporate Sector Enterprise Technology, AI/ML & Data Platforms, you will hold a pivotal role in your team.

You will apply your extensive technical knowledge to solve both technical and business challenges.

Your duties will include leading resiliency design reviews, breaking complex issues down into manageable tasks for other engineers, acting as a technical lead for medium- to large-sized projects, and providing guidance and mentorship to your team members.

Job Responsibilities:


* Champion a culture of site reliability, exerting technical influence throughout your team and the organization.


* Lead initiatives to improve service levels using data-driven analytics, enhancing the reliability and stability of applications and platforms.


* Collaborate with team members to identify comprehensive service level indicators and work with stakeholders to establish service level objectives and error budgets.


* Demonstrate high-level expertise in AWS, distributed systems, and data warehouse domains, proactively resolving technology-related bottlenecks.


* Act as the primary point of contact during major incidents, showcasing the ability to quickly identify and resolve issues to prevent financial losses.


* Document and share knowledge within the organization through internal forums and communities of practice.

Required Qualifications, Capabilities, and Skills:


* Formal training or certification in site reliability engineering concepts with 5+ years of applied experience.


* Deep proficiency in reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices.


* Proficiency in at least one programming language, such as Python, Java, C, or .NET.


* Extensive knowledge of software applications and technical processes, with emerging expertise in one or more technical disciplines.


* Proficiency in observability, including white-box and black-box monitoring, SLO alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, and Splunk.


* Experience with continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform).


* Experience with cloud computing using AWS (EC2, EMR, Athena, Glue, Redshift, etc.) and container orchestration (e.g., ECS, Kubernetes, Docker).


* Experience troubleshooting common networking technologies and issues.

Preferred Qualifications, Capabilities, and Skills:


* Ability to identify and solve problems related to complex data structures and algorithms.


* Self-motivated and a lifelong learner, eager to embrace and master emerging technologies.


* Ability to expand and collaborate across ...



