Senior Architect, Large Scale Distributed Training

AT NVIDIA

Yokneam, Israel

As NVIDIANs, we see problems that other people have never encountered. We also have capabilities and access to technology that few people have. Our group leads, invents, evangelizes, creates architecture, and educates on using novel technologies. Do you want to lead the industry, too? We are seeking an Architect, or a talented software engineer who wishes to move into an architect position: someone with a good understanding of systems and strong hands-on skills who wants to be part of the technology seeding phase. In this position, you will invent, run proofs of concept, and write specifications for the engineering groups. Come and help us lead the next generation of data center technology!

What you'll be doing:

  • Learn our architecture with a focus on the technology that we drive
  • Optimize AI/ML model training time at large scale
  • Code and build proof-of-concept prototypes
  • Design and define protocols and APIs for leveraging our technology in a data center
  • Research and evaluate algorithms currently used in related applications
  • Participate in defining hardware and system features, and assist software and hardware groups in enabling new technologies

What we need to see:

  • B.Sc./M.Sc. or equivalent experience in Electrical Engineering or Computer Science from a leading university
  • 3-5 years of proven industry experience, specifically in software engineering and distributed AI system training
  • Familiarity with networking concepts, terms, and software stack
  • Passion for problem-solving and algorithms research and development
  • Background in distributed AI/ML model training on GPU clusters

Ways to stand out from the crowd:

  • Background in data center architecture
  • Experience with a collective communications library such as NCCL
  • Good understanding of the OS, driver, and performance aspects of a system
  • Background in network synchronization protocols such as IEEE 1588 PTP
  • Good command of Python, C/C++

NVIDIA is committed to fostering a diverse work environment and is proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) based on race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request an accommodation.

Client-provided location(s): Yokne'am Illit, Israel
Job ID: NVIDIA-JR1968653
Employment Type: Full Time