Introduction
A career in IBM Software means you'll be part of a team that transforms our customers' challenges into solutions.
Seeking new possibilities and always staying curious, we are a team dedicated to creating the world's leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career.
IBM's product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.
We are looking for a skilled Infrastructure Operations Engineer with expertise in Linux systems, networking, automation, Kubernetes, and orchestration tools. The ideal candidate will have hands-on experience managing Linux environments and automating infrastructure tasks using tools such as Ansible, Jenkins, and scripting. This role will be responsible for ensuring system reliability, automating repetitive tasks, and supporting deployment and maintenance of applications running on the platform.
Your Role and Responsibilities
As a Site Reliability Engineer, you will work in an agile, collaborative environment to build, deploy, configure, and maintain systems for the IBM client business. In this role, you will lead the problem resolution process for our clients, from analysis and troubleshooting to deploying the latest software updates and fixes.
We are looking for a dynamic Site Reliability Engineer, responsive to market needs, to join our Cloud IaaS Team in Bengaluru, India, and deliver value to our clients in a fast-changing cloud landscape. The SRE team is dedicated to ensuring that IBM Cloud is at the forefront of cloud technology, from data centre design, storage and network architecture, and compute clusters to flexible infrastructure services. We are building IBM's next-generation cloud platform to deliver performance and predictability for our customers' most demanding workloads, at global scale and with leadership efficiency, resiliency, and security. It is an exciting time, and as a team we are driven by this incredible opportunity to thrill our clients.
Role and Responsibilities:
- Manage and maintain Linux-based systems across multiple environments.
- Automate provisioning, configuration, and deployment tasks using tools like Ansible and Jenkins.
- Design, implement, and manage deployment of containerized applications using Kubernetes and Docker.
- Monitor and troubleshoot system performance, network issues, and applications to ensure optimal uptime and efficiency.
- Harden servers from scratch using baseboard management controllers (BMCs).
- Implement and maintain security best practices, ensuring compliance with company policies.
- Proactively identify potential improvements to processes and systems.
- Analyze and fix network and DNS issues in the environment.
- Upgrade Kubernetes worker nodes and packages without interrupting the cluster.
- Maintain benchmarking standards on systems to ensure continuous compliance.
- Participate in on-call rotation to support critical infrastructure issues.
Required Technical and Professional Expertise
In addition to strong verbal and written communication skills, you'll possess:
- Bachelor's degree in Computer Science, Information Technology, or a related field (or equivalent work experience).
- 8+ years of experience managing Linux systems in a production environment.
- Strong hands-on expertise with automation tools such as Ansible and Jenkins.
- Hands-on experience with Kubernetes and containerization (e.g., Docker).
- Familiarity with CI/CD pipelines and DevOps methodologies.
Preferred Technical and Professional Expertise
- None