We are seeking a Senior Site Reliability Engineer (SRE) to join our team. The ideal candidate will bring hands-on experience with automation, infrastructure management, and cloud technologies. The Senior SRE will be responsible for the reliability, performance, and scalability of our systems and will work closely with engineering teams to build and maintain cloud-based infrastructure that is efficient, cost-effective, and automated.
EPAM is a leading global provider of digital platform engineering and development services. We are committed to having a positive impact on our customers, our employees, and our communities. We embrace a dynamic and inclusive culture. Here you will collaborate with multi-national teams, contribute to a myriad of innovative projects that deliver the most creative and cutting-edge solutions, and have an opportunity to continuously learn and grow. No matter where you are located, you will join a dedicated, creative, and diverse community that will help you discover your fullest potential.
Responsibilities
- Automate infrastructure provisioning and management using Terraform and other configuration management tools
- Collaborate with cross-functional teams to design, implement, and optimize scalable systems
- Support the operational stability of the infrastructure, including participating in on-call rotations
- Ensure high availability and reliability of cloud-based services and applications
- Continuously monitor and improve system performance using tools like CloudWatch and New Relic
- Troubleshoot and resolve infrastructure and system issues in a timely manner
- Maintain and improve CI/CD pipelines using Jenkins, CircleCI, and GitHub Actions
- Provide support for infrastructure-as-code practices, including Terraform and CloudFormation
Requirements
- 3+ years of relevant working experience
- Strong hands-on experience with Terraform and configuration management tools
- Proficiency in Python for scripting and automation
- Experience with cloud platforms, especially AWS
- Familiarity with containers and container orchestration technologies such as Docker, ECS, EKS, and Kubernetes
- Solid understanding of RDBMS (MySQL, PostgreSQL, Oracle) and NoSQL databases (Couchbase, DynamoDB)
- Experience with telemetry tools like New Relic and CloudWatch
- Strong knowledge of CI/CD tools such as Jenkins, CircleCI, and GitHub Actions
- B2+ English level (effective communication, both written and verbal)
- Ability to participate in on-call rotations for operational duties
- Strong self-motivation and the ability to work independently
- Excellent problem-solving and troubleshooting skills
- Background in software development (Java, PHP, Node.js, Go)
Benefits
- Connectivity Bonus (ARS 15,000 paid at the end of each month and itemized on the payslip as a non-remunerative benefit)
- Private health insurance (Medicina Prepaga), covering the employee and their immediate family
- Paternity Leave (two days added to the statutory leave, for a total of 4 days)
- Discount card
- English Training (English lessons twice per week)
- Training Program (access to multiple training plans customized to the needs of each role within the company)
- Marriage Bonus (the company doubles the statutory allowance provided by ANSES)
- Referral Program (a referral bonus is paid when a candidate referred by an employee joins the company)
- External Agreements and Discounts
- Vacations: 14 calendar days a year