Systems Architect (DevOps & MLOps)

at EPAM Systems

Hyderabad, India

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are seeking a Systems Architect (DevOps & MLOps) to join our team. In this role, you will be an integral part of our solution architecture team, focusing on designing and implementing data infrastructure and DevOps solutions for our projects. If you have a passion for data architecture and DevOps, and a strong foundation in systems architecture, MLOps, MLflow, Apache Airflow, Kubeflow, CI/CD, infrastructure, Prometheus, and Grafana, we encourage you to apply.

Responsibilities
  • Shorten development cycles for our software and AI/ML systems
  • Build and maintain tools and infrastructure for efficient software and AI/ML development
  • Help us build and automate our AI/ML workstream, from data analysis, experimentation, and operationalization through model training and tuning to visualization
  • Build and maintain data pipelines for analytics, model evaluation and training (includes versioning, compliance, and validation)
  • Train and retrain systems when necessary
  • Improve and maintain the automated CI/CD pipeline
  • Increase the deployment velocity, including the process for deploying models and data pipelines into production
  • Build and maintain cloud infrastructure as code (IaC) that can scale when needed
  • Collaborate with the engineering team to develop, deploy, and maintain our products with ease
  • Take on substantial responsibilities from the first day, with the opportunity to directly shape our full CI/CD+ infrastructure
Requirements
  • 10+ years of strong experience in MLOps
  • Solid experience in designing and building DevOps pipelines for ML models/apps and the different environments required (Dev/Test/UAT/Prod)
  • Experience providing quick solutions/fixes to production issues with ML models
  • Experience with monitoring/alerting systems for model failures, data failures, real-time scoring failures, etc.
  • Experience in selecting appropriate datasets and data representation methods
  • Ability to perform statistical analysis and fine-tuning using the results of machine learning tests and experiments
  • Proven programming skills in multiple programming languages: Python, Java, or similar
  • Shell scripting and Unix OS skills are necessary
  • Solid experience with good software engineering practices, especially DevOps practices
  • Excellent problem solving and debugging skills
  • Strong experience with cloud infrastructure on any of the GCP/AWS/Azure cloud platforms. Experience in GCP is preferable
  • Familiarity with tools such as Apache Airflow, DVC, MLflow, etc.
  • Certifications Preferred: Certification from any of the three major cloud platforms (AWS / Azure / GCP) in Cloud DevOps / Architecture / Engineering
Nice to have
  • Familiarity with Data Science and ML concepts
  • Computer Science graduate (BTech/BE or higher)
We offer
  • Opportunity to work on technical challenges that may have an impact across geographies
  • Vast opportunities for self-development: online university, knowledge sharing opportunities globally, learning opportunities through external certifications
  • Opportunity to share your ideas on international platforms
  • Sponsored Tech Talks & Hackathons
  • Unlimited access to LinkedIn Learning solutions
  • Possibility to relocate to any EPAM office for short and long-term projects
  • Focused individual development
  • Benefit package:
    • Health benefits
    • Retirement benefits
    • Paid time off
    • Flexible benefits
  • Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)

Client-provided location(s): Hyderabad, Telangana, India
Job ID: EPAM-epamgdo_blt0697539068161b13_en-us_Hyderabad_India
Employment Type: Other