Overview
This is a remote role that may only be hired in the following location(s): AZ, NC, NJ or TX
Site Reliability Engineering (SRE) at First Citizens combines software and systems engineering to build and manage First Citizens Bank's most critical applications, ensuring they run reliably, efficiently, and at scale. As the leader of the SRE team, you will be at the forefront of setting the vision and amplifying the culture of SRE to the rest of the organization.
Responsibilities
- Lead a highly talented team that owns the availability, performance and reliability of customer-facing systems
- Build the roadmap, mission, and strategy for Site Reliability Engineering, supported with clear goals and KPIs
- Own SRE processes for SRE Engagement, SLO Management, Observability, Incident Management and other SRE concerns
- Establish strong relationships across engineering, operations, support, and product teams to enable Site Reliability efforts and focus
- Communicate vision and strategy to executive leaders across the Bank to build support, consensus, and advocacy for the practice
Bachelor's Degree and 8 years of experience in Applications development, analysis or engineering OR High School Diploma or GED and 12 years of experience in Applications development, analysis or engineering
Preferred Qualifications
- 6+ years of experience in Software Engineering and/or Site Reliability Engineering
- 4+ years of experience implementing / following SRE practices
- 4+ years of experience leading SRE teams
- Experience with application modernization technologies including Kubernetes, Serverless, Automation, Observability and Developer Operations
- Understand performance and availability requirements and have experience working with Software Engineering teams to define deployment, configuration and monitoring requirements
- Excellent written and verbal communication skills
- Ability to create meaningful metrics and alerting for service health monitoring
- Proficiency driving Root Cause Analyses to meaningful improvements
- Leading troubleshooting efforts with production/non-production systems
- Excellent problem-solving and communication skills and a sense of ownership
- Experience working within a Large Financial Institution (or similarly complex environment)