EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are seeking a talented and motivated Senior Site Reliability Engineer (SRE) to join our organization. The experienced Senior SRE will play a crucial role in ensuring the reliability, scalability, capacity planning, and performance of our infrastructure and applications. The ideal candidate will have a strong background in software engineering, system administration, containerization, and cloud technologies.
Responsibilities
- Design and build scalable and reliable cloud infrastructure and services on platforms like AWS or Azure
- Automate manual work using programming/scripting languages including Python, Bash, or PowerShell
- Implement automation tools such as Jenkins, GitLab, and Ansible/Chef to streamline deployment, monitoring, and management of systems and applications
- Monitor system performance proactively and troubleshoot issues to ensure high availability and performance
- Utilize Observability tools such as Grafana, New Relic, Splunk, and Dynatrace for monitoring, alerting, and logging solutions
- Work hands-on with containerization and orchestration technologies, including Docker and Kubernetes
- Analyze and implement SLI, SLO, SLA, and Error Budget concepts
- Provide on-call support and participate in incident management & response activities
Requirements
- 5+ years of experience in a similar role
- Knowledge of cloud platforms including AWS or Azure
- Proficiency in scripting languages such as Python, Bash, or PowerShell
- Background in automation tools like Jenkins, GitLab, or Ansible/Chef
- Familiarity with Observability tools such as Grafana, New Relic, Splunk, or Dynatrace
- Understanding of containerization and orchestration technologies like Docker and Kubernetes
- Capability to analyze and implement SLI, SLO, SLA, and Error Budget concepts
We offer
- Opportunity to work on technical challenges with impact across geographies
- Vast opportunities for self-development: online university, knowledge sharing opportunities globally, learning opportunities through external certifications
- Opportunity to share your ideas on international platforms
- Sponsored Tech Talks & Hackathons
- Unlimited access to LinkedIn Learning solutions
- Possibility to relocate to any EPAM office for short and long-term projects
- Focused individual development
- Benefit package:
  - Health benefits
  - Retirement benefits
  - Paid time off
  - Flexible benefits
- Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)