Systems Analyst (/Site Reliability Engineer)
This role has been designed as ‘Onsite’ with an expectation that you will primarily work from an HPE partner/customer office.
Who We Are:
Hewlett Packard Enterprise is the global edge-to-cloud company advancing the way people live and work.
We help companies connect, protect, analyze, and act on their data and applications wherever they live, from edge to cloud, so they can turn insights into outcomes at the speed required to thrive in today’s complex world. Our culture thrives on finding new and better ways to accelerate what’s next. We know varied backgrounds are valued and succeed here.
We have the flexibility to manage our work and personal needs. We make bold moves, together, and are a force for good.
If you are looking to stretch and grow your career, our culture will embrace you. Open up opportunities with HPE.
Job Description:
We are seeking a skilled Systems Analyst (/Site Reliability Engineer) at HPE to support Oak Ridge National Laboratory (ORNL).
This is a unique, on-site, customer-facing opportunity to work with some of the world's most advanced high-performance computing (HPC) systems, including Frontier, the world’s first exascale supercomputer.
As part of our team, you will play a critical role in the deployment, maintenance, and optimization of large-scale computing software infrastructure and hardware, ensuring system reliability for cutting-edge scientific research.
Responsibilities:
* Maintain and optimize compute infrastructure across multiple large-scale HPC systems.
* Participate in the deployment, testing, and validation of live high-performance computing clusters.
* Troubleshoot node failures by analyzing OS internals, compiler behavior, and system logs, coordinating with internal subject-matter experts as needed.
* Conduct routine and on-demand maintenance, troubleshooting, and performance tuning for large-scale HPC environments.
* Collaborate with researchers, engineers, and technical staff to open, maintain, and close Jira tickets, ensuring system reliability and efficiency for high-stakes, high-performance scientific research.
* Investigate and document complex software and system-level issues, acting as a bridge between users and HPE internal teams.
* Develop and implement automation tools, scripts, and monitoring solutions to streamline system management.
* Stay up to date with advancements in HPC technologies, including GPU acceleration (e.g., ROCm), parallel computation (Cray PE, MPI/OpenMP), and performance tuning.
Requirements:
* Due to the nature of the work, this position requires either U.S. Citizenship or U.S. Lawful Permanent Resident (LPR) status.
* System Experience: Experience using SLURM-based HPC systems, both as a user and preferably as a system administrator.
* Technical Skills: Proficient in Linux, Python, and Bash scripting.
Familiarity with C++/Fortran-based H...
- Rate: Not Specified
- Location: Clinton, US-TN
- Type: Permanent
- Industry: IT
- Recruiter: Hewlett Packard Enterprise Company
- Contact: Not Specified
- Reference: 1191043
- Posted: 2025-11-05 07:51:10