Lead Cloud Engineer

AT UKG

Noida, India

Job Description: Linux, GCP Cloud, GCP IAM, Splunk, and Grafana Specialist
Position Overview:
The role calls for an experienced Linux and cloud infrastructure specialist with expertise in managing GCP cloud services and GCP Identity and Access Management (IAM), monitoring systems with Splunk, and building visualizations in Grafana. The specialist is responsible for deploying, securing, and optimizing cloud environments, monitoring infrastructure health, and automating processes to ensure robust, scalable, and secure operations.
Key Responsibilities:
Linux System Administration:
Administer, configure, and troubleshoot Linux environments (Ubuntu, CentOS, RHEL) both on-premises and in cloud environments.
Perform system upgrades, patch management, security hardening, and automation of administrative tasks using Shell scripting and Ansible.
Monitor system performance, manage disk space, user accounts, and configure backup solutions.
Google Cloud Platform (GCP):
Design, implement, and maintain scalable cloud infrastructure on GCP using Compute Engine, Kubernetes, Cloud SQL, and Cloud Storage.
Architect and deploy resilient cloud solutions, ensuring high availability and disaster recovery plans.
Implement infrastructure as code (IaC) using Terraform or GCP Deployment Manager for automated provisioning and environment consistency.
GCP IAM (Identity and Access Management):
Manage and secure cloud access with GCP IAM, defining roles, policies, and permissions to adhere to the principle of least privilege.
Configure service accounts, OAuth2 authentication, and manage API access for applications.
Regularly audit IAM policies and ensure compliance with organizational security protocols.
Monitoring & Logging (Splunk & Grafana):
Set up, configure, and manage Splunk for log ingestion from multiple cloud and on-prem sources; create custom queries, dashboards, and alerts for real-time monitoring.
Analyze system and application logs in Splunk to troubleshoot issues and generate insights into system health.
Build and maintain Grafana dashboards for real-time visualization of system metrics, including CPU, memory usage, network traffic, and custom KPIs.
Create and manage alerting systems in both Grafana and Splunk to proactively monitor performance thresholds and security events.
Security & Compliance:
Implement security best practices for cloud environments, including IAM policies, firewalls, VPN configurations, and data encryption.
Conduct vulnerability assessments, log analysis, and continuous monitoring to ensure the security of cloud and on-prem systems.
Ensure compliance with regulatory standards and internal governance through proper audit logging and access control mechanisms.
Automation & Scripting:
Automate routine cloud and system operations with Shell scripts, Python, and tools like Terraform and Ansible.
Implement auto-scaling policies and load balancing solutions in GCP to optimize resource utilization.
Deploy and manage CI/CD pipelines for efficient software delivery and infrastructure updates.
Troubleshooting & Incident Management:
Actively monitor systems, troubleshoot performance issues, and resolve outages using logs, monitoring tools, and performance metrics.
Lead root cause analysis for critical incidents and drive system reliability improvements.
Collaborate with cross-functional teams to ensure optimal cloud performance and availability.
Qualification: Bachelor's or Master's degree in Information Systems, Information Security, or a related field.
Requirements:
Experience:
6-7 years of hands-on experience in Linux system administration, GCP cloud services, IAM management, and monitoring tools like Splunk and Grafana.
Technical Skills:
Linux: Extensive experience with RHEL, CentOS, Ubuntu.
GCP: Proficient in Compute Engine, Kubernetes, Cloud Functions, VPC, Cloud SQL, and Cloud Storage.
IAM: Strong expertise in managing GCP IAM roles, policies, and service accounts.
Monitoring: Deep knowledge of Splunk for log management, alerting, and dashboard creation.
Grafana: Experience with creating and maintaining dashboards, integrating data sources, and alert management.
Automation: Scripting with Bash, Python, and experience with IaC tools like Terraform and Ansible.
BRS & SMTP: Knowledge of backup and recovery solutions (BRS) and SMTP would be a plus.
Certifications (Preferred):
Google Cloud Professional or Cloud Security Engineer.
Splunk Core Certified Power User or Admin.
Red Hat Certified Engineer (RHCE).

Client-provided location(s): Noida, Uttar Pradesh, India
Job ID: ukg-893379875586
Employment Type: Other

Perks and Benefits

  • Health and Wellness

    • Health Insurance
    • Health Reimbursement Account
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
    • FSA
    • FSA With Employer Contribution
    • HSA
    • HSA With Employer Contribution
    • Fitness Subsidies
    • On-Site Gym
    • Virtual Fitness Classes
  • Parental Benefits

    • Birth Parent or Maternity Leave
    • Non-Birth Parent or Paternity Leave
    • Adoption Assistance Program
    • Family Support Resources
    • Adoption Leave
  • Work Flexibility

    • Flexible Work Hours
    • Remote Work Opportunities
    • Hybrid Work Opportunities
  • Office Life and Perks

    • Casual Dress
    • Happy Hours
    • Company Outings
    • Holiday Events
  • Vacation and Time Off

    • Paid Vacation
    • Unlimited Paid Time Off
    • Paid Holidays
    • Personal/Sick Days
    • Volunteer Time Off
  • Financial and Retirement

    • 401(K) With Company Matching
    • Company Equity
    • Performance Bonus
    • Profit Sharing
  • Professional Development

    • Tuition Reimbursement
    • Mentor Program
    • Shadowing Opportunities
    • Access to Online Courses
    • Internship Program