Systems Engineer - Cloud Ops
Job Description
As a Systems Engineer on the Cloud Operations team, you will be responsible for deploying, managing, and optimizing our cloud-based infrastructure on Google Cloud Platform (GCP).
You will work with technologies such as Terraform, Kubernetes (GKE), GitOps/ArgoCD, CI/CD pipelines, and observability tools to ensure reliable, secure, and scalable platform operations.
You will also contribute to our AI/ML platform initiatives, supporting infrastructure for LLM-based applications and AI-powered automation tools that enhance developer productivity and operational efficiency.
You will collaborate with development teams, SREs, and platform architects to ensure seamless deployment and delivery of applications while maintaining the highest standards of reliability, security, and performance.
Responsibilities
Cloud Infrastructure, Automation & Operations:
* Design, build, and maintain cloud infrastructure using Terraform to automate provisioning, scaling, and lifecycle management of resources on GCP
* Develop and maintain CI/CD pipelines using GitLab CI to automate build, test, and deployment workflows
* Implement and maintain GitOps practices using ArgoCD for declarative, version-controlled application deployment
* Monitor system performance using observability tools (Dynatrace, Cloud Monitoring, Prometheus/Grafana) and troubleshoot production issues
* Participate in on-call rotation to provide 24/7 support for critical infrastructure incidents
* Perform root cause analysis on incidents and implement preventive measures
* Document runbooks, architecture decisions, and operational procedures
Kubernetes Platform Management:
* Deploy, configure, and manage containerized applications on Google Kubernetes Engine (GKE), including GKE Autopilot and Standard clusters
* Manage cluster lifecycle, including upgrades, node pool configurations, and capacity planning
* Troubleshoot pod failures, CrashLoopBackOff, OOMKilled events, and container resource issues
* Configure and optimize resource requests/limits, Horizontal Pod Autoscaler (HPA), and Vertical Pod Autoscaler (VPA)
* Manage Kubernetes networking, including Services, Ingress controllers, Network Policies, and DNS configurations
* Implement and manage service mesh (Istio) for traffic management, observability, and security
* Manage secrets and configurations using Kubernetes Secrets, ConfigMaps, and external secret management tools
* Implement pod security standards, RBAC policies, and workload identity configurations
AI/ML Platform & Automation:
* Support infrastructure for AI/ML workloads including LLM-based applications and model serving platforms
* Deploy and manage AI-powered developer tools such as coding assistants (Claude Code, GitHub Copilot) and agentic AI systems
* Explore and implement AI-assisted incident response and automated remediation workflows
* Build and maintain infrastructure for Retrieval-Augmented Gen...
- Rate: Not Specified
- Location: Memphis, US-TN
- Type: Permanent
- Industry: Finance
- Recruiter: Autozone
- Contact: Not Specified
- Reference: 105932
- Posted: 2026-06-02 08:02:20