Deep Learning Performance Architect

AT NVIDIA

Beijing, China

We are now looking for a Deep Learning Performance Architect!

Are you passionate about exploring computer architectures for AI? Do you like building industry-leading products at the intersection of hardware and software? We are seeking world-class programmers and performance architects who love to squeeze every cycle of performance out of deep learning code. In this role, you will craft and maintain a library that ships our best-performing GPU kernels to NVIDIA's industry-leading AI products. This position offers the opportunity to have real impact in a fast-moving, technology-focused company.

What you'll be doing:

  • Design and develop the architecture, interface and features of the GPU kernel library
  • Continuously improve the quality and performance of the library and its GPU kernels
  • Explore and expand the boundary of innovative technologies like GPU code generation and fusion
  • Contribute to NVIDIA's AI business by collaborating closely with DL product teams as well as kernel development teams

What we need to see:

  • MS, PhD or equivalent in relevant fields (CS, EE, Math)
  • 2+ years of relevant work or research experience
  • Strong programming skills in C, C++, and Python
  • Excellent problem-solving skills and the ability to learn quickly
  • Experience designing software architecture and interfaces and building test infrastructure
  • Good communication skills and a strong team player

Ways to stand out from the crowd:

  • Familiarity with CUDA programming and GPU architecture
  • Familiarity with TensorRT, cuDNN, cuBLAS, etc.
  • Background in DL fundamentals, frameworks, graph compilers, LLVM, MLIR, etc.
  • Hands-on development experience on Linux and Windows platforms, with C++ build tools such as CMake and DevOps tools such as Docker, Jenkins, and Kubernetes
  • Track record of mentoring junior engineers and leading projects and teams

Client-provided location(s): Beijing, China; Shanghai, China
Job ID: NVIDIA-JR1975040
Employment Type: Full Time