We are looking for a Lead HPC Engineer to manage the daily operations and engineering tasks of our HPC environment.
The ideal candidate is a skilled engineer with extensive experience in setting up and optimizing HPC infrastructure. The role involves working with our Level-3 HPC infrastructure engineering team to support our scientific research team's use of an HPC cluster. Candidates based in India are preferred, but the role is open to candidates in any location.
Responsibilities
- Maintenance and support of the HPC infrastructure
- Employing Infrastructure as Code (IaC) for infrastructure automation
- Incident resolution and involvement in software and hardware upgrades
- Administration of job scheduling and resource management using HPC job schedulers
- Installation and configuration of Bright Cluster Manager
- Optimization and maintenance of GPFS/Lustre file systems
- Supervision of configurations for InfiniBand/OmniPath network interconnects
Requirements
- Minimum of 7 years as an HPC technical expert
- Knowledge of engineering or HPC system development
- Expertise in supporting and setting up HPC infrastructure
- Proficiency in Linux (any RPM-based distribution), including compiling kernel modules and using debugging tools such as strace, tcpdump, and core dump analysis
- Background in managing HPC job schedulers such as IBM LSF and Slurm
- Experience in configuring and implementing Bright Cluster Manager
- Understanding of both GPFS and Lustre file systems
- Familiarity with InfiniBand and OmniPath network interconnect technologies
- Proficiency in hardware diagnostics, upgrades, and tuning, including InfiniBand HCAs and disk arrays for Lustre, VAST, and IBM storage
- Capability to use infrastructure monitoring tools like Zabbix, Splunk, or Grafana
- Understanding of EasyBuild
- Background in working within a GxP environment
- Familiarity with Jira and ServiceNow