Siemens Digital Industries Software

Incident Manager (Remote)

Monterrey, Mexico

We are a leading global software company dedicated to the world of computer-aided design, 3D modeling, and simulation, helping innovative global manufacturers design better products, faster! With the resources of a large company and the energy of a software start-up, we have fun together while creating a world-class software portfolio. Our culture encourages creativity, welcomes fresh thinking, and focuses on growth, so our people, our business, and our customers can achieve their full potential.

The DISW SRE organization is dedicated to enhancing service and application availability, optimizing processes by automating manual and repetitive tasks, and addressing complex technical challenges in a dynamic, collaborative, inclusive, and iterative environment. This position plays a crucial role in developing automated solutions and processes that support and sustain best-in-class cloud-based applications.

Position Overview

The candidate will support the Siemens Xcelerator platform and will be responsible for coordinating major incident response, maintaining stakeholder communication during service-impacting events, and facilitating resolution in compliance with service level agreements (SLAs). Strong communication and coordination skills are necessary to support these core objectives. This role's success will be defined by product teams within DISW business units meeting their SLAs.

Responsibilities/Tasks

  • Incident Management: Act as the primary point of contact and leader during major incidents, coordinating the response, communication, and resolution efforts across all involved teams.
  • Incident Response: Quickly assess the severity of incidents, determine the impact, and drive the appropriate response to restore services as quickly as possible.
  • Communication: Ensure clear, concise, and timely communication with stakeholders, including technical teams, management, and customers, throughout the incident lifecycle.
  • Post-Incident Analysis: Lead post-incident reviews to identify root causes, drive improvements, and implement preventive measures to reduce the likelihood of recurrence.
  • Collaboration: Work closely with SRE, DevOps, Development, and other relevant teams to ensure that incident management processes are well-defined and continuously improved.
  • Training & Preparedness: Conduct regular incident response drills, train teams on incident management processes, and ensure readiness for handling high-severity incidents.
  • Documentation: Maintain and update incident management documentation, ensuring that all procedures are up-to-date and accessible to all relevant teams.
  • Monitoring & Alerts: Collaborate with SRE and monitoring teams to define and refine alerting criteria, ensuring that incidents are detected and escalated promptly.
  • Continuous Improvement: Identify opportunities to improve system reliability, scalability, and performance based on lessons learned from incidents.
  • 24x7 On-Call Rotation: Participate in the 24x7 on-call rotation (daytime shifts only, no night shifts).

Required Knowledge/Skills, Education, and Experience

  • Communication: Outstanding English communication skills, both verbal and written, as well as listening and synthesis skills.
  • Incident Response: Ability to quickly assess the severity of incidents, determine the impact, and drive the appropriate response to restore services as quickly as possible.
  • Problem-Solving: Excellent troubleshooting and problem-solving skills, with the ability to quickly analyze complex systems.
  • Calm Under Pressure: Ability to remain calm, focused, and effective in high-pressure situations, and to make quick, confident decisions.
  • Leadership: Demonstrated experience in leading incident response efforts and managing cross-functional teams during critical situations.
  • Technical Skills: Familiar with Jira Service Management (or equivalent, e.g., ServiceNow), Datadog (or equivalent, e.g., Grafana), PagerDuty (or equivalent), and Atlassian Statuspage (or equivalent).
  • Driven Learner: Highly motivated and driven to learn new technologies, skillsets, and methodologies, continuously seeking to expand your knowledge and adapt to evolving industry trends.

Preferred Knowledge/Skills, Education, and Experience

  • Certifications: Relevant certifications (e.g., AWS Certified Solutions Architect, Certified Kubernetes Administrator) are a plus.
  • Experience with Incident Command Systems (ICS): Familiarity with structured incident response frameworks, such as Incident Command Systems, is highly desirable.
  • Automation: Experience with automation tools and scripting languages (e.g., Python, Bash) to streamline incident response and remediation.
  • Culture of Learning: Passion for fostering a culture of learning and continuous improvement within the organization.
  • Experience: Background in an enterprise IT environment with distributed systems.
  • Technical Skills: Familiar with cloud infrastructure (AWS, GCP, Azure) and containerization (Docker, Kubernetes).

Why us?

Working at Siemens Software means flexibility: choosing between working from home and working at the office is the norm here. We offer great benefits and rewards, as you'd expect from a world leader in industrial software.

We are a collection of over 377,000 minds building the future, one day at a time, in over 200 countries. We're dedicated to equality, and we welcome applications that reflect the diversity of the communities we work in. All employment decisions at Siemens are based on qualifications, merit, and business need. Bring your curiosity and creativity and help us shape tomorrow!

Siemens Software. Transform the Everyday

Client-provided location(s): Monterrey, Nuevo Leon, Mexico
Job ID: Siemens_Digital-453697-en-2
Employment Type: Other

Perks and Benefits

  • Health and Wellness

    • Health Insurance
    • Health Reimbursement Account
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
    • FSA
    • FSA With Employer Contribution
    • HSA
    • HSA With Employer Contribution
    • Fitness Subsidies
    • On-Site Gym
    • Pet Insurance
    • Mental Health Benefits
    • Virtual Fitness Classes
  • Parental Benefits

    • Birth Parent or Maternity Leave
    • Non-Birth Parent or Paternity Leave
    • Family Support Resources
    • On-site/Nearby Childcare
    • Adoption Leave
  • Work Flexibility

    • Flexible Work Hours
    • Remote Work Opportunities
    • Hybrid Work Opportunities
    • Work-From-Home Stipend
  • Office Life and Perks

    • Commuter Benefits Program
    • Casual Dress
    • Happy Hours
    • Snacks
    • Some Meals Provided
    • Company Outings
    • On-Site Cafeteria
    • Holiday Events
  • Vacation and Time Off

    • Paid Vacation
    • Unlimited Paid Time Off
    • Paid Holidays
    • Personal/Sick Days
    • Sabbatical
    • Leave of Absence
    • Volunteer Time Off
  • Financial and Retirement

    • 401(K)
    • 401(K) With Company Matching
    • Pension
    • Company Equity
    • Stock Purchase Program
    • Performance Bonus
    • Relocation Assistance
    • Financial Counseling
    • Profit Sharing
  • Professional Development

    • Tuition Reimbursement
    • Learning and Development Stipend
    • Promote From Within
    • Mentor Program
    • Shadowing Opportunities
    • Access to Online Courses
    • Lunch and Learns
    • Internship Program
    • Work Visa Sponsorship
    • Leadership Training Program
    • Associate or Rotational Training Program