
Staff Observability Operations Engineer

AT CVS Health

Hartford, CT

Bring your heart to CVS Health. Every one of us at CVS Health shares a single, clear purpose: Bringing our heart to every moment of your health. This purpose guides our commitment to deliver enhanced human-centric health care for a rapidly changing world. Anchored in our brand - with heart at its center - our purpose sends a personal message that how we deliver our services is just as important as what we deliver.

Our Heart At Work Behaviors™ support this purpose. We want everyone who works at CVS Health to feel empowered by the role they play in transforming our culture and accelerating our ability to innovate and deliver solutions to make health care more personal, convenient and affordable.

Company Overview:

CVS Health is a premier health innovation company helping people on their path to better health. We are pioneering a new approach to total health by making quality care more affordable, accessible, simple, and seamless. CVS Health is driven by a clear purpose: helping people on their path to better health. We are transforming health care by expanding access to innovative health solutions and leading the way with a patient-centric approach.

Position Summary:

We are currently seeking several experienced and highly skilled Staff Observability Operations Engineers with a strong background in Site Reliability Engineering (SRE), modern observability practices, and the management and implementation of modern observability and event management platforms. These roles are crucial in overseeing and optimizing our observability platform to ensure seamless and efficient operations. Responsibilities include deploying observability solutions; managing and administering observability and event management platforms; handling release management, system upgrades, patching, and integrations; managing customer issues and requests; and troubleshooting incidents. Additionally, the roles involve continuous planning to enhance platform performance to support scalability and complexity. Successful candidates will play a key role in ensuring our observability infrastructure meets the current and future needs of CVS Health's dynamic environment.

Key Responsibilities:
  • Deployment and Implementation: Deploy and implement modern observability solutions to meet organizational needs. Ensure successful integration of observability, event management, and notification tools and technologies within the existing environment. Work with partners to migrate legacy monitoring to modern solutions. Work with the observability engineering team to address new requirements as they arise, by leveraging existing solutions or developing new ones.
  • Platform Management: Manage and administer observability and event management platforms. Lead system upgrades, patching, and maintenance activities to ensure optimal performance and security.
  • Release Management: Coordinate and manage release cycles for observability platforms. Ensure smooth and timely releases with minimal disruption to services.
  • Incident/Request Management: Troubleshoot and resolve incidents related to observability platforms. Manage escalated customer issues and requests, ensuring timely and effective resolution. Document incident remediation activities to enable resolution by L1/L2 MSP partners; automate remediation activities where possible.
  • Performance Optimization: Continuously monitor and enhance platform performance to support scalability and complexity. Utilize telemetry data to automate performance optimization and capacity planning.
  • Collaboration and Communication: Collaborate with cross-functional infrastructure, application, and business stakeholders to ensure observability solutions align with the broader IT strategy and infrastructure requirements. Communicate effectively with team members, management, and other stakeholders.
  • Continuous Improvement: Identify opportunities for process optimization and efficiency gains. Stay current with industry trends and best practices to continuously improve observability operations.
  • Customer Focus: Ensure high levels of customer satisfaction by effectively managing customer relationships. Provide excellent customer service and support for observability solutions.
  • Compliance and Security: Ensure observability platforms comply with organizational policies and security standards. Implement tools and processes to detect and remediate configuration drift and security risks.
  • Documentation and Reporting: Maintain comprehensive documentation of observability platform configurations, processes, and procedures. Generate and analyze reports on platform performance and capacity.
  • Training and Mentoring: Provide training and mentoring to junior engineers, team members, and our MSPs. Share knowledge and best practices to enhance the overall capability of the team.

Required Skills and Qualifications:
Technical Expertise:
  • 7+ Years of experience in IT operations, with significant responsibilities in system monitoring, performance tuning, and troubleshooting enterprise applications.
  • 5+ Years in a Site Reliability Engineering (SRE) role deploying and managing modern observability solutions.
  • 5+ Years managing and implementing observability and event management platforms (e.g., AppDynamics, Splunk, Prometheus, Grafana).
  • Experience developing and administering ServiceNow ITOM event management solutions, ensuring seamless integration with observability tools.
  • Experience deploying and managing service reliability platforms (e.g., xMatters, OpsGenie, PagerDuty), configuring incident notifications, incident command workflows, and automating incident remediation workflows.
  • Experience with and deep knowledge of cloud environments, cloud monitoring platforms, and container orchestration tools (e.g., AWS/CloudTrail, Azure/Monitor, GCP/GCM, Kubernetes, OpenShift).
  • Proficiency in Python and other scripting languages such as Ansible, PowerShell, and Bash for automation and configuration. Experience with and passion for deploying things "as code".
Solution Implementation and Platform Management:
  • Hands-on experience deploying, managing, and administering observability platforms.
  • Hands-on experience leading, coordinating, and performing migration of application, platform, and infrastructure observability solutions (e.g., full-stack APM, RUM, Session Replay, Server, Storage, Network, Database, NLB, etc.) from legacy tools to modern platforms.
  • Hands-on experience performing system upgrades, patching, and integrations to ensure platform stability and security.
  • Experience developing and implementing monitoring and logging standards for infrastructure, platforms, and applications.
  • Experience building and instrumenting dashboards to deliver technical and business process insights leveraging standard observability/BI platforms (e.g., AppDynamics, Grafana, Tableau, PowerBI).
  • Experience establishing and implementing event correlation policies and related rules to enrich event data, increase the signal-to-noise ratio for events, and reduce MTTD and MTTR.
Incident and Problem Resolution:
  • Excellent problem-solving skills, with the ability to handle multiple tasks, prioritize effectively, and work under pressure.
  • Proven ability to troubleshoot and resolve complex technical issues related to observability platforms.
  • Experience managing customer issues and requests, providing timely and effective solutions.
Performance Monitoring and Optimization:
  • Experience monitoring platform performance and implementing enhancements to support scalability and complexity.
  • Experience leveraging telemetry data to automate performance optimization and capacity planning.
  • Proficiency in scripting and programming languages, automation tools, and data formats such as Ansible, PowerShell, Bash, Python, YAML, XML, and JSON to automate deployment, configuration, and instrumentation.
Release and Configuration Management:
  • Experience coordinating and managing release cycles for observability platforms.
  • Knowledge of best practices in release management to ensure smooth and timely deployments.
  • Experience configuring and leveraging source code management tools and workflows to manage and deploy Monitoring as Code.
Collaboration and Communication:
  • Excellent communication skills, both verbal and written.
  • Ability to collaborate effectively with cross-functional teams and stakeholders.
  • Strong interpersonal skills, with the ability to engage effectively with both technical teams and business stakeholders.
Continuous Improvement:
  • Commitment to continuous improvement and staying current with industry trends and best practices.
  • Ability to identify opportunities for process optimization and efficiency gains.
Customer Focus:
  • Strong customer service orientation with the ability to manage customer relationships effectively.
  • Experience in providing excellent customer service and support for observability solutions.
Compliance and Security:
  • Knowledge of compliance and security standards related to observability platforms.
  • Ability to implement tools and processes to detect and remediate configuration drift and security risks.
  • Experience managing operational data and systems access to ensure compliance with internal and external audit and regulatory requirements.
Documentation and Reporting:
  • Proficiency maintaining comprehensive documentation of observability platform configurations, processes, and procedures.
  • Ability to generate and analyze reports on platform performance, incidents, and customer requests.
Preferred Certifications:
  • ITIL 4 Practitioner: Monitoring and Event Management
  • DevOps Institute Observability Foundation
  • DevOps Institute Site Reliability Engineering Foundation or Practitioner
  • ServiceNow CIS-Event Management Implementer
  • ServiceNow Certified Application Developer
  • xMatters Integrator
Education Requirements:
  • Bachelor's degree from an accredited university or equivalent work experience (HS diploma + 4 years of relevant experience)
BUSINESS OVERVIEW

We strive to promote and sustain a culture of diversity, inclusion and belonging every day. CVS Health is an affirmative action employer, and is an equal opportunity employer, as are the physician-owned businesses for which CVS Health provides management services. We do not discriminate in recruiting, hiring, promotion, or any other personnel action based on race, ethnicity, color, national origin, sex/gender, sexual orientation, gender identity or expression, religion, age, disability, protected veteran status, or any other characteristic protected by applicable federal, state, or local law. We proudly support and encourage people with military experience (active, veterans, reservists and National Guard) as well as military spouses to apply for CVS Health job opportunities.

Join CVS Health as a Staff Observability Operations Engineer and contribute to our mission of driving health care innovation and delivering cutting-edge health solutions.

Pay Range

The typical pay range for this role is: $130,295.00 - $260,590.00

This pay range represents the base hourly rate or base annual full-time salary for all positions in the job grade within which this position falls. The actual base salary offer will depend on a variety of factors including experience, education, geography and other relevant factors. This position is eligible for a CVS Health bonus, commission or short-term incentive program in addition to the base pay range listed above. This position also includes an award target in the company's equity award program.

In addition to your compensation, enjoy the rewards of an organization that puts our heart into caring for our colleagues and our communities. The Company offers a full range of medical, dental, and vision benefits. Eligible employees may enroll in the Company's 401(k) retirement savings plan, and an Employee Stock Purchase Plan is also available for eligible employees. The Company provides a fully-paid term life insurance plan to eligible employees, and short-term and long-term disability benefits. CVS Health also offers numerous well-being programs, education assistance, free development courses, a CVS store discount, and discount programs with participating partners. As for time off, Company employees enjoy Paid Time Off ("PTO") or vacation pay, as well as paid holidays throughout the calendar year. Number of paid holidays, sick time and other time off are provided consistent with relevant state law and Company policies.

For more detailed information on available benefits, please visit Benefits | CVS Health

We anticipate the application window for this opening will close on: 01/31/2025

Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state and local laws.

Client-provided location(s): Hartford, CT, USA
Job ID: CVS-R0454836
Employment Type: Other

Perks and Benefits

  • Health and Wellness

    • Health Insurance
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • HSA
    • HSA With Employer Contribution
    • Pet Insurance
    • Mental Health Benefits
  • Parental Benefits

    • Fertility Benefits
    • Adoption Assistance Program
    • Family Support Resources
  • Work Flexibility

    • Flexible Work Hours
    • Remote Work Opportunities
    • Hybrid Work Opportunities
  • Vacation and Time Off

    • Paid Vacation
    • Paid Holidays
    • Personal/Sick Days
  • Financial and Retirement

    • 401(K) With Company Matching
  • Professional Development

    • Tuition Reimbursement
  • Diversity and Inclusion

    • Employee Resource Groups (ERG)
    • Diversity, Equity, and Inclusion Program