EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are seeking an experienced DevOps/AIOps Architect to design, architect, and implement an AI-driven operations solution that integrates cloud-native services across AWS, Azure, and cloud-agnostic environments. The AIOps platform will be used for end-to-end machine learning lifecycle management, automated incident detection, and root cause analysis (RCA). The architect will lead the development of a scalable solution built on data lakes, event streaming pipelines, ChatOps integration, and model deployment services. This platform will enable real-time intelligent operations in hybrid-cloud and multi-cloud setups.
Responsibilities
- Assist in the implementation and maintenance of cloud infrastructure and services
- Contribute to the development and deployment of automation tools for cloud operations
- Participate in monitoring and optimizing cloud resources using AIOps and MLOps techniques
- Collaborate with cross-functional teams to troubleshoot and resolve cloud infrastructure issues
- Support the design and implementation of scalable and reliable cloud architectures
- Conduct research and evaluation of new cloud technologies and tools
- Work on continuous improvement initiatives to enhance cloud operations efficiency and performance
- Document cloud infrastructure configurations, processes, and procedures
- Adhere to security best practices and compliance requirements in cloud operations
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field
- 12+ years of experience in DevOps, AIOps, or Cloud Architecture roles
- Hands-on experience with AWS services such as SageMaker, S3, Glue, Kinesis, ECS, EKS
- Strong experience with Azure services such as Azure Machine Learning, Blob Storage, Azure Event Hubs, and Azure Kubernetes Service (AKS)
- Strong experience with Infrastructure as Code (IaC) tools such as Terraform and AWS CloudFormation
- Proficiency in container orchestration (e.g., Kubernetes) and experience with multi-cloud environments
- Experience with machine learning model training, deployment, and data management across cloud-native and cloud-agnostic environments
- Expertise in implementing ChatOps solutions on platforms such as Microsoft Teams and Slack, and in integrating them with AIOps automation
- Familiarity with data lake architectures, data pipelines, and inference pipelines using event-driven architectures
- Strong programming skills in Python for rule management, automation, and integration with cloud services
Nice to have
- Certifications in the AI/ML/GenAI space
We offer
- Opportunity to work on technical challenges with impact across geographies
- Vast opportunities for self-development: online university, knowledge sharing opportunities globally, learning opportunities through external certifications
- Opportunity to share your ideas on international platforms
- Sponsored Tech Talks & Hackathons
- Unlimited access to LinkedIn Learning solutions
- Possibility to relocate to any EPAM office for short and long-term projects
- Focused individual development
- Benefit package:
  - Health benefits
  - Retirement benefits
  - Paid time off
  - Flexible benefits
- Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)