
Senior Site Reliability Engineer

EPAM Systems

Hyderabad, India

EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
We are seeking a talented and motivated Senior Site Reliability Engineer (SRE) to join our organization. In this role, you will help ensure the reliability, scalability, and performance of our infrastructure and applications, and contribute to capacity planning. The ideal candidate has a strong background in software engineering, system administration, containerization, and cloud technologies.


Responsibilities
  • Design and build scalable and reliable cloud infrastructure and services on platforms like AWS or Azure
  • Automate manual work using programming/scripting languages including Python, Bash, or PowerShell
  • Implement automation tools such as Jenkins, GitLab, and Ansible/Chef to streamline deployment, monitoring, and management of systems and applications
  • Monitor system performance proactively and troubleshoot issues to ensure high availability and performance
  • Use observability tools such as Grafana, New Relic, Splunk, and Dynatrace for monitoring, alerting, and logging
  • Work hands-on with containerization and orchestration technologies, including Docker and Kubernetes
  • Analyze and implement SLI, SLO, SLA, and Error Budget concepts
  • Provide on-call support and participate in incident management & response activities
Requirements
  • 5+ years of experience in a similar role
  • Knowledge of cloud platforms including AWS or Azure
  • Proficiency in scripting languages such as Python, Bash, or PowerShell
  • Background in automation tools like Jenkins, GitLab, or Ansible/Chef
  • Familiarity with observability tools such as Grafana, New Relic, Splunk, or Dynatrace
  • Understanding of containerization and orchestration technologies like Docker and Kubernetes
  • Capability to analyze and implement SLI, SLO, SLA, and Error Budget concepts
We offer
  • Opportunity to work on technical challenges that may have impact across geographies
  • Vast opportunities for self-development: online university, global knowledge sharing, and learning through external certifications
  • Opportunity to share your ideas on international platforms
  • Sponsored Tech Talks & Hackathons
  • Unlimited access to LinkedIn Learning solutions
  • Possibility to relocate to any EPAM office for short and long-term projects
  • Focused individual development
  • Benefits package:
    • Health benefits
    • Retirement benefits
    • Paid time off
    • Flexible benefits
  • Forums to explore passions beyond work (CSR, photography, painting, sports, etc.)

Client-provided location(s): Hyderabad, Telangana, India
Job ID: EPAM-epamgdo_blte8e9a5fdb65b43a4_en-us_Hyderabad_India
Employment Type: Other