
Senior Site Reliability Engineer

at EPAM Systems

Bahía Blanca, Argentina

We are seeking a Senior Site Reliability Engineer to join our team in a remote capacity. This role offers the opportunity to work on challenging projects and contribute to the development and maintenance of highly scalable and available systems. As a Senior SRE, you will play a crucial role in ensuring the reliability, performance, and security of our infrastructure. If you are passionate about automation, observability, and troubleshooting in distributed systems, this is the perfect role for you.
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multinational teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.


Responsibilities
  • Design and implement efficient and reliable automation workflows
  • Collaborate with development and operations teams to enhance CI/CD processes
  • Implement and maintain monitoring solutions for infrastructure and applications
  • Conduct performance and reliability assessments of production systems
  • Troubleshoot and resolve complex issues in distributed systems
  • Ensure high availability and scalability of production systems
  • Lead incident response and post-mortem analysis
  • Provide guidance and support to junior SRE team members
Requirements
  • 3+ years of proven experience as a Site Reliability Engineer or in a similar role
  • Expertise in automation using scripting and programming languages
  • Proficiency in CI/CD, Kubernetes, Datadog, and Terraform
  • Strong understanding of observability and troubleshooting in distributed systems
  • Solid programming skills in Python
  • Experience with Azure Data Factory and Azure Databricks
  • Ability to write efficient scripts for automation and monitoring
  • English B2+
We offer
  • Connectivity Bonus (15,000 ARS paid with the salary receipt at the end of each month as a non-wage item)
  • Prepaid health plan (Medicina Prepaga; covers the employee and their immediate family)
  • Paternity Leave (two days added to the leave established by law, for a total of 4 days)
  • Discount card
  • English Training (English lessons twice per week)
  • Training Program (access to multiple customized training plans according to the needs of each role within the company)
  • Marriage bonus (the company doubles the statutory allowance provided by ANSES)
  • Referral Program (a bonus is paid when a candidate referred by an employee joins the company)
  • External Agreements and Discounts
  • Vacation: 14 calendar days per year
By applying to our role, you agree that your personal data may be used as set out in EPAM's Privacy Notice and Policy.

Client-provided location(s): Argentina
Job ID: EPAM-epamgdo_blta79f91087faab621_en-us_Other_Argentina
Employment Type: Other