Senior UNIX/Linux Recovery Engineer
Job Summary:
We are seeking an experienced UNIX/Linux Recovery Engineer to join our team and take a lead role in creating, updating, and maintaining disaster recovery (DR) plans and point-in-time recovery (PITR) procedures, documentation, and solutions for our large-scale, complex global environments.
The ideal candidate will have extensive experience with Solaris, AIX, and Red Hat Enterprise Linux (RHEL) systems.
Key Responsibilities:
Disaster Recovery (DR) Planning:
Develop comprehensive DR plans for UNIX/Linux systems
Regularly review and update existing DR plans
Conduct risk assessments and business impact analyses
Define recovery time objectives (RTO) and recovery point objectives (RPO)
Collaborate with cross-functional teams to ensure DR plans align with business needs
Point-in-Time Recovery:
Design and implement point-in-time recovery solutions
Optimize backup and recovery processes for minimal data loss
Develop strategies for rapid system and data restoration
Documentation, Training, and Knowledge Transfer:
Develop and deliver comprehensive training programs for engineering and operations teams
Create and maintain detailed technical documentation for all recovery processes
Develop standard operating procedures (SOPs) for DR and recovery tasks
Conduct knowledge transfer sessions and training for team members
Create and update runbooks for various recovery scenarios
Testing and Validation:
Conduct regular DR and PITR tests in sandbox environments
Analyze test results and implement improvements to recovery processes
Maintain detailed documentation of test procedures, results, and lessons learned
Sandbox Environment:
Design and maintain a sandbox environment for testing DR and recovery procedures
Regularly conduct DR drills and simulations in the sandbox
Use the sandbox to validate new recovery techniques and technologies
System Administration:
Manage and troubleshoot Solaris, AIX, and RHEL systems
Implement and maintain backup and recovery solutions
Monitor system performance and capacity
Apply security patches and updates
Tools and Infrastructure:
Evaluate, recommend, and implement DR and backup tools suitable for Windows environments
Design and maintain sandbox environments for testing and training purposes
Continuously research and propose new technologies to enhance recovery capabilities
Compliance and Reporting:
Ensure DR and PITR plans meet industry standards and regulatory requirements
Prepare reports on recovery readiness, test results, and improvement initiatives for management
Incident Management:
Participate in the incident response team during actual DR scenarios
Provide expert guidance on recovery procedures during critical situations
Conduct post-incident reviews and implement lessons learned
Continuous Improvement:
Stay current with industry best practices and emerging technologies in DR and data recovery
Propose and implement improvements to existing recovery processes
Analyze recovery metrics and suggest optimizations
Required Qualifications:
Bachelor's degree in Computer Science, Information Technology, or related field
7+ years of experience in UNIX/Linux system administration
Strong experience with Solaris, AIX, and RHEL operating systems
In-depth knowledge of disaster recovery principles and methodologies
Expertise in backup and recovery technologies (e.g., Cohesity, IBM Spectrum Protect)
Experience with high-availability solutions and clustering technologies
Familiarity with virtualization platforms (e.g., VMware, KVM)
Strong scripting skills (e.g., Ansible, Bash, Python, Perl)
Excellent problem-solving and analytical skills
Strong communication and documentation abilities
Preferred Qualifications:
Relevant certifications (e.g., RHCE, Solaris Certified System Administrator, IBM Certified System Administrator)
Experience with cloud-based DR solutions (e.g., AWS, Azure, GCP)
Knowledge of storage technologies and SAN/NAS systems
Familiarity with change management and ITIL processes
Experience in a large enterprise or global environment
Experience Level: 8-18 years required
Unix - Implementation and Maintenance