About the Team:

Senior Site Reliability Engineers at UKG are team members who have a breadth of knowledge encompassing all aspects of service delivery. They develop software solutions to enhance, harden, and support our service delivery processes. This can include building and managing CI/CD deployment pipelines, automated testing, capacity planning, performance analysis, monitoring, alerting, chaos engineering, and auto-remediation.

About the Role:

Senior Site Reliability Engineers must have a passion for learning and evolving with current technology trends. They strive to innovate and are relentless in their pursuit of a flawless customer experience. They have an automate-everything mindset, helping us bring value to our customers by deploying services with incredible speed, consistency, and availability.

• Engage in and improve the lifecycle of services from conception to EOL, including system design consulting and capacity planning

• Define and implement standards and best practices related to system architecture, service delivery, metrics, and the automation of operational tasks

• Support services, product, and engineering teams by providing common tooling and frameworks that increase availability and improve incident response

• Improve system performance, application delivery and efficiency through automation, process refinement, postmortem reviews, and in-depth configuration analysis

• Collaborate closely with engineering professionals within the organization to deliver reliable services

• Identify and eliminate operational toil by treating operational challenges as software engineering problems

• Actively participate in incident response; on-call duty is a requirement of the role

About You:

Basic Qualifications:

• 3-5+ years of hands-on experience working in Engineering or Cloud

• 3-5+ years of experience with public cloud platforms (e.g. GCP, AWS, Azure)

• Degree in engineering or a related technical discipline, or equivalent work experience

• Experience coding in higher-level languages (e.g., Python, JavaScript, C++, or Java)

• Demonstrated understanding of best practices in metric generation and collection, log aggregation pipelines, time-series databases, and distributed tracing

• Demonstrable fundamentals in two of the following: computer science, cloud architecture, security, or network design

• Working experience with industry-standard tools such as Terraform, Ansible, Kubernetes, and Datadog

• Experience working with automation

Preferred Qualifications:

• Experience with distributed system design and architecture

• Experience with containerization technologies

• Experience in configuration and maintenance of applications and/or systems infrastructure for a large-scale, customer-facing company

Senior Site Reliability Engineer
