Introduction
Site Reliability Engineering (SRE) professionals are engineers who specialize in reliability and resiliency, with the right mix of knowledge and skills in software and systems. They are responsible for analyzing business needs, determining problems, and advising on, designing, building, testing, deploying, changing, and maintaining well-engineered information systems and ecosystems.
We're seeking skilled, automation-focused SREs to maintain and administer the PowerVS Cloud Infrastructure-as-a-Service environment and provide a reliable and secure offering to clients.
Your role and responsibilities
As a Compute Operations Site Reliability Engineer, you will perform the following tasks:
• Remotely administer Power Server hardware environments across numerous data center locations around the world (currently 20 data centers and growing).
• Develop automation to reduce manual toil (repetitive tasks that can be automated) using shell scripts (Bash, etc.), Python, Ansible, and related tools and languages.
• Perform code stack updates on infrastructure systems (VIOS, firmware, PowerVC, HMC, Novalink, NIM servers) as well as cloud supporting systems (jump servers, sobox, network nodes, gateways, TSM servers).
• Upload/maintain stock images.
• Remotely administer AIX and Linux servers.
• Maintain user IDs (add/delete) and passwords.
• Monitor daily/weekly backups to ensure they are working.
• Manage and maintain the Nagios monitoring environment; troubleshoot scripts/plug-ins in case of issues.
• Perform periodic Live Partition migrations, inactive migrations, or remote restarts of customer VMs to perform system maintenance, balance workloads, or free up resources.
• Monitor and report the capacity utilized in each data center.
• Attend scheduled meetings planned by customers for cutover/maintenance windows.
• Verify capacity requirements when customers report provisioning failures.
• Work with customers to resolve any RSCT issues so that LPM activities can be performed without impacting customer workloads.
Required education
Bachelor's Degree
Preferred education
Bachelor's Degree
Required technical and professional expertise
• In-depth knowledge of Power Server hardware.
• Significant scripting/coding experience for automating all aspects of IBM Power systems administration.
• Automation using Python, shell scripting (Bash, etc.), Ansible, and related tools and languages.
• Experience with AIX and Linux administration, commands, and networking.
• Strong experience in one or more of the following: VIO, Novalink, and PowerVC. Familiarity with at least one more (including installation, configuration, and administration).
• In-depth knowledge of PowerVM including installation/configuration and administration.
• High-level knowledge of Power Systems supported operating systems (AIX and IBM i).
• In-depth knowledge of how storage is connected and allocated to Power systems via NPIV connections.
Preferred technical and professional experience
• Experience with configuring and tuning PowerVC.
• Experience accessing PowerVS resources using the IBM Cloud Portal, IBM Cloud CLI, APIs, and Terraform.
• 3+ years' experience supporting customers using ServiceNow or Salesforce.
• Experience training new personnel on tooling and processes.
• Storage & Power RTS, MVS Network for Cisco, Juniper; general support skills
ABOUT BUSINESS UNIT
IBM Systems helps IT leaders think differently about their infrastructure. IBM servers and storage are no longer inanimate - they can understand, reason, and learn so our clients can innovate while avoiding IT issues. Our systems power the world's most important industries and our clients are the architects of the future. Join us to help build our leading-edge technology portfolio designed for cognitive business and optimized for cloud computing.
YOUR LIFE @ IBM
In a world where technology never stands still, we understand that dedication to our clients' success, innovation that matters, and trust and personal responsibility in all our relationships live in what we do as IBMers as we strive to be the catalyst that makes the world work better.
Being an IBMer means you'll be able to learn and develop yourself and your career. You'll be encouraged to be courageous and experiment every day, all whilst having continuous trust and support in an environment where everyone can thrive, whatever their personal or professional background.
Our IBMers are growth-minded, always staying curious, open to feedback, and learning new information and skills to constantly transform themselves and our company. They are trusted to provide ongoing feedback to help other IBMers grow, and to collaborate with colleagues, keeping in mind a team-focused approach that includes different perspectives to drive exceptional outcomes for our customers. The courage our IBMers have to make critical decisions every day is essential to IBM becoming the catalyst for progress, always embracing challenges with the resources they have to hand, a can-do attitude, and an outcome-focused approach in everything that they do.
Are you ready to be an IBMer?
ABOUT IBM
IBM's greatest invention is the IBMer. We believe that through the application of intelligence, reason and science, we can improve business, society and the human condition, bringing the power of an open hybrid cloud and AI strategy to life for our clients and partners around the world.
Restlessly reinventing since 1911, we are not only one of the largest corporate organizations in the world but also one of the biggest technology and consulting employers, with many of the Fortune 50 companies relying on the IBM Cloud to run their business.
At IBM, we pride ourselves on being an early adopter of artificial intelligence, quantum computing and blockchain. Now it's time for you to join us on our journey to being a responsible technology innovator and a force for good in the world.
OTHER RELEVANT JOB DETAILS
For additional information about location requirements, please discuss with the recruiter following submission of your application.