NVIDIA is looking for an outstanding engineer to join its Software Infrastructure and Operations team. The position will be part of a fast-paced crew that develops and maintains sophisticated Kubernetes-based development, build, and test environments for a multitude of platforms, including Windows and Linux. Are you passionate about infrastructure and looking for complicated problems, ready to build the next generation of cloud services, craft innovative solutions, and mine through data to uncover real problems and fix them? We would be delighted to have a fun-loving person like you!
What you'll be doing:
- Architect the scaling operation in our data centers. Deploy and support an end-to-end container management solution with Kubernetes, Docker, and containerd.
- Design solutions with service discovery, networking, monitoring, logging, and scheduling in Kubernetes.
- Work on challenging infrastructure problems such as job scheduling, resource management, and automated recovery.
- Use your depth in algorithms and system software background!
- Work in teams to deploy new data center infrastructure.
- Plan and implement critical metrics tracking using various data analytics mining methods and dashboards.
- Use AI techniques to extract useful signals about machines and jobs from the data generated!
- Take part in prototyping, crafting, and developing cloud infrastructure for NVIDIA.
- Develop various device plugins and Operators on Kubernetes.
- Build complete solutions including metrics, alerting, and storage services.
- Dig into more data, analyze it in greater depth, and apply deep learning and machine learning algorithms to improve the performance and predictability of the system.
What we need to see:
- Strong object-oriented programming background in Python/Golang/Java and/or relevant scripting languages
- Background in developing large scale cloud infrastructure applications
- Knowledge of various technologies (Kubernetes, message brokers)
- Experience with relational databases such as MySQL and NoSQL datastores such as Elasticsearch
- Proficient with configuration management tools like Ansible, Chef, Puppet and strong experience with Jenkins and/or other CI systems.
- Ability to collaborate across multiple teams and with people working in different time zones.
- Experience with analytics/visualization tools like Kibana, Grafana, and Splunk. Experience with monitoring systems such as Zabbix and/or Nagios is nice to have.
- BS/MS in Computer Science or Computer Engineering or equivalent experience
- 5+ years of proven experience.
Ways to stand out from the crowd:
- Real-world experience with distributed systems, containers, and the Kubernetes API.
- Previous experience with DevOps teams
- You have worked on computer algorithms and have demonstrated the ability to choose the best possible algorithms to nail sophisticated problems
- Able to divide sophisticated problems into simple subproblems and then reuse available solutions to solve them.
- Experience designing, implementing, and deploying major infrastructure features across multiple servers through incremental rollouts
NVIDIA is widely considered to be one of the technology world's most desirable employers. We have some of the most forward-thinking and dedicated individuals in the world working for us. If you're creative and passionate about developing cloud services, we want to hear from you!
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, sex, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status. We will ensure that individuals with disabilities are provided reasonable accommodation to participate in the job application or interview process, to perform essential job functions, and to receive other benefits and privileges of employment. Please contact us to request accommodation.