
Principal Reliability Engineering/SRE

At The Hartford

Columbus, OH

Principal Reliability Engineering - IE06JE

We're determined to make a difference and are proud to be an insurance company that goes well beyond coverages and policies. Working here means having every opportunity to achieve your goals - and to help others accomplish theirs, too. Join our team as we help shape the future.

The Hartford's Corporate / HIMCO IT team is seeking an experienced and highly motivated Principal Engineer who will be responsible for driving Reliability Engineering for multiple applications in the portfolio as well as implementation of Gen AI and AI platform capabilities. The principal engineer will be responsible for building, optimizing, and maintaining the cloud automation capabilities to enable infrastructure provisioning, application availability, testing, quality, application deployment, resiliency, recovery, and efficiency of IT applications and platforms.

The principal engineer will also ensure the implementation of IT Security and service hardening requirements. Key measures of success will include service reliability (such as availability, latency, quality), as well as technical debt reduction and cost efficiency.

This role will have a Hybrid work schedule, with the expectation of working in an office (Hartford, CT or Charlotte, NC) 3 days a week. Candidate must be authorized to work in the US without company sponsorship. The company will not support the STEM OPT I-983 Training Plan endorsement for this position.

Responsibilities:

  • Set the strategy and advance the use of best-in-class software engineering standards, tools, and design practices to enable highly available and performant customer-facing applications. Lead adoption of metrics of overall application health: availability, performance, monitoring, alerting, quality, currency, and resiliency.
  • Serve as the technical expert for the applications and infrastructure supported, which requires depth and breadth of knowledge in technologies, applications, integration, interfaces, and the business domain.
  • Drive the development and implementation of Gen AI and AI platform capabilities, including evaluating and selecting AI and ML frameworks, platforms, and tools. Leverage cutting-edge technologies and methodologies to optimize business operations, enhance customer experience, and drive competitive advantage.
  • Develop the strategy to ensure effective tooling, alerts, and response mechanisms to identify and address reliability and security risks leveraging automation to support problem prevention, detection, mitigation, and resolution.
  • Develop the strategy to enhance the velocity of the SDLC by engineering the appropriate solutions to increase delivery speed while adhering to technology standards for sustained reliability.
  • Identify, define and implement preventative controls and drive increased automation and self-healing capabilities. Continue to improve cost efficiency baselines.
  • Lead the migration of applications to open source platforms, PaaS, containers, serverless, event-based designs, and other cloud technology standards for cloud-enablement and platform agility.
  • Set a strategy to drive simplification across the stack, responsible for ensuring that all technical designs can be effectively operated in a cost-efficient manner, without adding operational complexity.
  • Lead inner- and open-sourcing practices to accelerate the development of self-service enterprise capabilities.
  • Apply expert experience in setting up scalable SDLC environments using COTS, PaaS, and SaaS products that cater to data, application, and infrastructure-based pipeline needs.
  • Design a migration plan that builds solutions to drive applications to open-source platforms, PaaS, and the use of containers and other cloud technology standards for cloud-enablement and platform agility.
  • Ensure operational excellence. Lead the triaging and service restoration of all high-impact incidents in order to minimize the mean time to service restoration and the impact to the business. Demonstrate end-to-end ownership.
  • Partner with infrastructure teams on strategy to design and implement intelligent automation and orchestration systems, enhanced monitoring/alerting capabilities, and rapid service restoration processes. Take proactive measures to prevent high-impact incidents.

Qualifications:

  • Bachelor's degree in Computer Science or a related discipline
  • 10+ years of work experience in IT systems analysis, design, application development, IT Operations, and tech leadership.
  • 5+ years of experience in a Reliability Engineer, Multi Stack Engineer or Data Engineer role with Manager Accountabilities
  • Proven Experience with FinOps
  • System Thinking end-to-end - Broad understanding/application of enterprise architectures and complex distributed systems
  • 2+ years of experience leading AI and ML engineering organizations, with expertise in building and/or managing large-scale AI, data, and analytics platforms (desirable)
  • Knowledge of the principles and practices of FMOps and LLMOps, and of the tools and technologies used for generative AI model operations (desirable)
  • Understanding of GenAI, machine learning, and related technologies along with business acumen.
  • Proven experience with solution architecture orientation to enable expedient troubleshooting, issue-resolution and root-cause removal in a hybrid cloud environment.
  • Proven experience with continuous integration and DevOps methodologies and tools, including GitHub, Jenkins, Nexus, Rally, SonarQube, Jira, Azure DevOps, and AWS CodePipeline.
  • Proven experience using performance and observability tools such as Dynatrace, CloudWatch, CloudTrail, AWS X-Ray, and related tools.
  • Proven hybrid cloud experience (private and public) across various service delivery models - IaaS, PaaS, SaaS.
  • Proven experience with IaC tools such as Terraform and CloudFormation.
  • Highly collaborative; partners with peers and stakeholders, with a passion for delighting customers.
  • Strong communicator (verbal and written) at all levels of the enterprise, with influence and negotiation skills and experience working in a diverse team across business units.
  • Certified in one or more of the following:
    • AWS Certified Developer
    • AWS Certified Solutions Architect
    • AWS Certified DevOps Engineer
    • Certified Kubernetes Administrator (CKA)
    • Certified Kubernetes Application Developer (CKAD)

Compensation

The listed annualized base pay range is primarily based on analysis of similar positions in the external market. Actual base pay could vary and may be above or below the listed range based on factors including but not limited to performance, proficiency and demonstration of competencies required for the role. The base pay is just one component of The Hartford's total compensation package for employees. Other rewards may include short-term or annual bonuses, long-term incentives, and on-the-spot recognition. The annualized base pay range for this role is:

$151,280 - $226,920

Equal Opportunity Employer/Females/Minorities/Veterans/Disability/Sexual Orientation/Gender Identity or Expression/Religion/Age

Client-provided location(s): Columbus, OH, USA
Job ID: hartford-R2520272_Columbus
Employment Type: Full Time

Perks and Benefits

  • Health and Wellness

    • Health Insurance
    • Health Reimbursement Account
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
    • On-Site Gym
    • Mental Health Benefits
    • Virtual Fitness Classes
    • Fitness Subsidies
    • FSA
    • HSA
  • Parental Benefits

    • Birth Parent or Maternity Leave
    • Non-Birth Parent or Paternity Leave
    • Fertility Benefits
    • Adoption Assistance Program
    • Family Support Resources
    • Adoption Leave
  • Work Flexibility

    • Hybrid Work Opportunities
    • Remote Work Opportunities
    • Flexible Work Hours
  • Office Life and Perks

    • Commuter Benefits Program
    • Casual Dress
    • On-Site Cafeteria
    • Company Outings
    • Holiday Events
  • Vacation and Time Off

    • Paid Vacation
    • Paid Holidays
    • Volunteer Time Off
    • Personal/Sick Days
  • Financial and Retirement

    • 401(K) With Company Matching
    • Stock Purchase Program
    • Performance Bonus
    • Relocation Assistance
    • Financial Counseling
    • Profit Sharing
  • Professional Development

    • Internship Program
    • Leadership Training Program
    • Associate or Rotational Training Program
    • Tuition Reimbursement
    • Promote From Within
    • Mentor Program
    • Shadowing Opportunities
    • Access to Online Courses
    • Lunch and Learns
    • Learning and Development Stipend
  • Diversity and Inclusion

    • Employee Resource Groups (ERG)
    • Diversity, Equity, and Inclusion Program
