We are looking for a Senior Software Engineer based in Santa Clara, CA, with experience in building highly scalable and robust enterprise software, to join the Shoreline team. We are building and improving a powerful platform that automates the diagnosis and repair of clusters of GPUs or CPUs across public clouds, private clouds, and virtual and physical hardware. The Shoreline team within NVIDIA enables operating GPU clusters at scale. The Shoreline distributed platform helps diagnose problems in GPU clusters, repair them, and validate them. Our mission is to deliver a best-in-class automation platform for data center operations.
What you'll be doing:
- Designing and implementing scalable and reliable software components that enable the core platform to maintain an inventory of resources, including hosts, GPUs, and switches, and to automate actions that diagnose and repair failures
- Working on-site in Santa Clara, where you are expected to be on call two to three times a month
- Running benchmarks and improving performance of various subsystems
- Delivering high-impact projects with high quality, performance and stability with the lowest resource consumption
- Developing a robust feedback control system that analyzes signals about system health and automatically runs commands to fix discovered issues
- Programming in modern languages like Go and Rust
What we need to see:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field (or equivalent experience)
- 8 years of equivalent experience
- Demonstrated ability in building scalable and robust distributed systems
- Proven record of product rollouts and collaborating with early adopters
- Proficiency in C/C++, Java, Rust, or Go
Ways to stand out from the crowd:
- Deep understanding of multi-threading and distributed systems concepts
- Excellent track record of delivering projects
- Expertise in optimizing SQL queries
- Expert-level knowledge of Rust programming
NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High-Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions from artificial intelligence to autonomous cars. NVIDIA is looking for great people like you to help us accelerate the next wave of artificial intelligence.
NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and dedicated people in the world working for us. If you're creative and passionate about developing services to manage clusters of GPUs and CPUs, we want to hear from you!
#LI-Hybrid
The base salary range is 180,000 USD - 276,000 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.
You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.
NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.