
   

Senior Cloud Reliability Engineer

Company

Federal Reserve Bank of Richmond

The Richmond Fed is the proud home of the Federal Reserve’s National IT organization—a nationwide team delivering technology solutions and support across the Federal Reserve System.

Many National IT employees are located in Richmond, while others are based across the U.S. at other Federal locations.

When you join our team, you’ll become part of a culture that welcomes differences, cares about our communities, and empowers each other to lead from where we are to make things better.

Bring your passion and we’ll provide challenging and purposeful careers in a variety of fields, opportunities to grow and a wide range of benefits and perks that support your health and wealth.

It’s all part of what makes #MyRichmondFed a great place to work!

About the Opportunity

As a Senior Cloud Reliability Engineer in the SRE chapter, you will be accountable for implementing reliability practices, using software as the means, for the cloud foundational product line in the Federal Reserve.

The SRE chapter is part of the Cloud Solutions & Services department and has overall responsibility for the reliability of the numerous cloud foundational environments in the Federal Reserve System (FRS).
 

What Will Be Expected of You


* Works as part of cloud foundational platform squads to demonstrate and champion site reliability culture and practices, and exerts technical influence throughout the team


* Solves reliability problems on cloud platforms using software engineering principles


* Develops and maintains automations, scripts and code that automate manual work and improve the reliability and stability of the cloud platform


* Develops, integrates and maintains synthetics (canaries) code to establish the health of the platform


* Leads SLI, SLO and error budget efforts in collaboration with the product team, instrumenting and visualizing them to proactively manage the stability of cloud platforms


* Implements observability (logs, metrics, traces) and monitoring for cloud foundational platforms


* Defines chaos experiments in collaboration with product owners and conducts experiments


* Develops reusable artifacts and software utilities to industrialize SRE practices across the FRS


* Provides consulting services across the System to help teams implement SRE


* Develops and mentors junior engineers on the team


* Performs other duties as assigned

Qualifications


* 5-7 years of extensive experience across the end-to-end enterprise software development life cycle, including maintenance and support


* 3+ years of experience in observability and SRE practices


* Bachelor’s degree in Computer Science, Information Systems, or an equivalent background or experience


* Extensive knowledge of, and experience working in, AWS environments


* Strong software development experience in cloud environments with one of the following languages: Python or GoLang


* Experience with observability, OpenTelemetry, and one or more tools like ...



