IBM

SRE

Bangalore, India

Introduction
At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk.

Your Role and Responsibilities

  • Implement and automate infrastructure solutions that support IBM Cloud products and infrastructure
  • Develop and administer CI/CD systems and tools for development and test teams
  • Keep your assigned site or service up and running, and restore it quickly when a failure occurs
  • Work closely with internal partners and teams to ensure that our infrastructure meets security, SLA, and performance requirements
  • Automate work including infrastructure needs, testing, failover solutions, failure mitigation, and much more
  • Persistently test application and infrastructure resiliency under a variety of error conditions
  • Support the compliance and security integrity of the environment
  • Develop, communicate, and monitor standard processes to promote the long-term health and sustainability of operational development tasks
  • Stand up and maintain pre-production and developer environments to support the entire development organization and improve overall team velocity
  • Use metrics and analytics to determine reliability issues and remove them through automation and tooling
  • Be an advocate for our customers, providing them with self-diagnosing tools to resolve common issues that arise in the field
  • Participate in code reviews of your peers' development work, triage and solve live customer issues, and participate in all scrum activities
  • Monitor, measure, and improve code and data performance for the application you help to develop
  • Be available for on-call shifts during daytime hours and weekends
  • All of this will take place in a strong team environment, which necessitates strong communication

Required Technical and Professional Expertise

  • 4-8 years of experience delivering code for active Cloud Services/Projects
  • Experience debugging complex problems
  • Experience designing, building, and operating large-scale production systems
  • Expertise in Ansible, Bash, core Python development, and deployments in production environments is a must
  • Experience automating infrastructure, configuration management, testing, and deployments using tools like Ansible and Chef, and the ability to explain the Infrastructure as Code paradigm
  • A strong understanding of diverse infrastructure platforms and infrastructure concepts is required
  • Systems management experience in Linux/UNIX systems (RHEL preferred)
  • Experience in Docker and containerization technologies
  • Experience with cloud computing technologies
  • Experience with Kubernetes (k8s) CRDs and k8s controller programming using the watcher/informer model
  • Must have good experience in infrastructure operations automation and IT service management, with hands-on exposure to data center administration, configuration, incident management, and support
  • 5+ years of working knowledge of one or more operating systems: Ubuntu (preferred), RHEL, CentOS Linux, and Windows Server
  • Strong experience with one or more Virtualization technologies: KVM, Xen, Citrix Hypervisor, VMware vSphere, etc.
  • Working knowledge of one or more programming tools: Bash, PowerShell, Python, Ruby, and Go
  • Strong communication skills

Preferred Technical and Professional Expertise

  • Working knowledge of one or more key infrastructure tools/products: Ansible, Chef, etc.
  • Working knowledge of container technologies: Kubernetes, Docker, etc.
  • Working knowledge of monitoring technologies: Zabbix, Splunk, etc.
  • Working knowledge of ServiceNow, JIRA, Confluent, and GitHub
  • Experience with technologies enabling reliable data processing pipelines, such as Kafka, Elasticsearch, and Splunk, and with database and data visualization technologies for operations, such as SQL databases, InfluxDB, Grafana, and Kibana
  • Experience with event monitoring/management ecosystems like Zabbix, Nagios, Sysdig, LogDNA, ServiceNow.

Client-provided location(s): Bengaluru, Karnataka, India
Job ID: IBM-20962153
Employment Type: Full Time
