Lead Site Reliability Engineer

GM Financial

Arlington, TX

Overview

Why GMF Technology?

GM Financial is set to change the auto finance industry and is leading the way in tech modernization. We have a startup mindset and preserve our small-company culture in a public-company environment, with financial stability and intense growth over a decade-plus history. We are data junkies who trust data and insights to advance our business objectives. We take our goal of zero emissions, zero collisions, zero congestion, and zero friction very seriously. We believe that, as an auto finance market leader, we are in the driver's seat to advance the GM EV mission to change the world. We are building global platforms in LATAM, Europe, China, the U.S., and Canada, and we are looking to grow our high-performing team. GMF has more than 10,000 team members globally. Join our fintech culture within a blue-chip company where we are changing the way we use technology to support our customers, dealers, and business.


Flexible hybrid work environment (2 days a week onsite, 3 days remote) at our Arlington, TX (AOC1) office.

Responsibilities

About this role

The Lead Site Reliability Engineer (SRE) will provide strategic leadership and direction for building and running large-scale software systems. This role involves identifying and delivering automation solutions to ensure high availability and resiliency, leveraging expertise in software development, complexity analysis, and scalable system design. The Lead SRE will work closely with other engineering teams to ensure services and systems are highly stable and performant, meeting the expectations of business partners and end users.

JOB DUTIES

• Lead architecture and development teams to ensure applications are highly available, reliable, and performant at a global scale.
• Partner with the architecture team to ensure operability, measurability, and manageability are integrated into business features and enablers.
• Collaborate with product owners and managers to establish service level objectives (SLOs) for applications and define consequences if objectives are not met.
• Work with development team members to identify monitoring gaps, improve application performance, and assist with troubleshooting issues.
• Drive Root Cause Analysis (RCA) of production issues and other failures within the product software, pipeline, or other DevOps support processes or technology.
• Design, build, and advocate for automated solutions to optimize application/service/platform uptime with minimal human intervention.
• Participate in an on-call rotation to support troubleshooting and communication efforts outside of normal business hours.
• Create and implement standards and best practices, driving adoption across development teams and external vendors as applicable.
• Perform other duties as assigned.
• Ensure compliance with all company policies and procedures.

Qualifications

What makes you a dream candidate?

• Proven leadership skills and the ability to guide and mentor a team.
• Strong collaboration and communication skills.
• A proactive approach to problem-solving and continuous improvement.
• Passion for automation and operational excellence.
• Deep expertise in cloud technologies and software development, with a strong technical background.

Knowledge and Skills

• Significant experience in C# or Java (C# preferred).
• Proficiency in SQL and PowerShell.
• Expertise in defining, implementing, and evaluating Service Level Objectives (SLOs) and Service Level Indicators (SLIs), and associated consequences.
• Strong skills in performing Root Cause Analysis (RCA) and Problem Management.
• Extensive experience with cloud-native applications on Azure/AWS (monitoring, networking, containerization, infrastructure).
• Proficiency in containerization technologies such as Azure Kubernetes Service, Kubernetes (open source), and Docker.
• Knowledge of metrics and monitoring tools like Azure Application Insights and Azure Monitor.
• Familiarity with networking technologies relevant to Azure and AWS, including Azure DNS, Virtual Networks, Azure API Management, Azure Application Gateway, Akamai WAF/CDN, AWS Route 53, AWS VPC, AWS API Gateway, and AWS CloudFront.
• Strong experience with Terraform for infrastructure as code.
• Ability to establish and maintain a culture of learning through the development and sharing of skills, knowledge, processes, and tools; combat traditional silos that create "us and them" environments.

Education and Experience

  • 5-7 years of hands-on experience supporting Linux production environments required
  • 5-7 years of hands-on administration experience with Spark required
  • Hands-on experience with cloud technologies on Microsoft Azure required
  • 3-5 years of hands-on scripting experience with Bash, Perl, Ruby, or Python required
  • 3-5 years of experience with Docker Datacenter required
  • 2-4 years of hands-on administration experience with machine learning platforms required
  • Minimum of 1 year of experience with Mesos, Kubernetes, OpenShift, and/or Deis, or another container/platform-as-a-service orchestrator, required
  • Minimum of 1 year of hands-on experience with CI/CD tools and technologies required
  • Minimum of 1 year of experience leading a site reliability engineering team required
  • Bachelor's degree in Computer Science or a related engineering field, and/or commensurate experience
  • Master's degree preferred

What We Offer: Generous benefits package available on day one, including 401(k) matching, bonding leave for new parents (12 weeks, 100% paid), tuition assistance, training, GM employee auto discount, community service pay, and nine company holidays.

Our Culture: Our team members define and shape our culture - an environment that welcomes innovative ideas, fosters integrity, and creates a sense of community and belonging. Here we do more than work - we thrive.

Compensation: Competitive pay and bonus eligibility

Work-Life Balance: Flexible hybrid work environment, 2 days a week in office

Client-provided location(s): Arlington, TX, USA
Job ID: GM_Financial-48640
Employment Type: Full Time

Perks and Benefits

  • Health and Wellness

    • Health Insurance
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
    • FSA
    • FSA With Employer Contribution
    • HSA
    • HSA With Employer Contribution
    • Mental Health Benefits
  • Parental Benefits

    • Birth Parent or Maternity Leave
    • Non-Birth Parent or Paternity Leave
    • Adoption Leave
  • Work Flexibility

    • Remote Work Opportunities
    • Hybrid Work Opportunities
  • Office Life and Perks

    • Happy Hours
    • Company Outings
    • On-Site Cafeteria
    • Holiday Events
  • Vacation and Time Off

    • Paid Vacation
    • Paid Holidays
    • Personal/Sick Days
    • Leave of Absence
    • Volunteer Time Off
  • Financial and Retirement

    • 401(K) With Company Matching
    • Performance Bonus
    • Profit Sharing
  • Professional Development

    • Tuition Reimbursement
    • Promote From Within
    • Mentor Program
    • Shadowing Opportunities
    • Access to Online Courses
    • Lunch and Learns
    • Internship Program
    • Leadership Training Program
  • Diversity and Inclusion

    • Unconscious Bias Training