Splunk Engineer

AT Leidos

Remote

Description

Leidos is hiring an energetic, motivated, innovative individual to join our team supporting the Centers for Medicare & Medicaid Services (CMS) in Baltimore, MD. The Cloud SRE works closely with the Program team to manage, maintain, and optimize the application data and infrastructure that support CMS and the public. You will deliver solutions that ensure the functions of Medicare, Medicaid, and Marketplace are carried out for US citizens and that contribute to efforts to reduce healthcare costs.

With a "no downtime, zero outages" vision and mantra, we support data center and cloud-based application needs ranging from self-service to white-glove services, all based on our customer's required level of support.

As a Site Reliability Engineer, you will develop highly innovative solutions through research and the integration of best practices. You will influence the development of solutions that affect strategic project and program goals and business results, while also leading the work of other technical staff. You will resolve highly complex problems through significant application of technical knowledge, conceptualization, reasoning, and interpretation. You will interact daily with technical resources from different vendors that fulfill technical requirements for the customer.

Your goal will be to work with all stakeholders to help Leidos deliver high-quality, robust, and scalable solutions with minimal business impact. Lastly and most importantly, you will represent our program when meeting with the Application Development Organizations (ADOs) and identify opportunities for support, modernization, and innovation in their applications.

The current work environment is remote, using tools such as Slack, Microsoft Teams, and Zoom.

Primary Responsibilities

  • The successful candidate will be a member of a cross-functional team composed of well-rounded engineers who can learn new skills rapidly and work across multiple functional domains to carry out end-to-end delivery of infrastructure services.
  • Support the full system life cycle of Splunk across geographically dispersed enterprise data centers.
  • Customize queries, reports and dashboards.
  • Participate in architecture and ongoing design meetings to ensure adequate logging while enabling business value and outcomes.
  • Monitor system stability and performance and ensure system availability, reliability, and usability.
  • Troubleshoot complex problems, including resolving operational issues, diagnosing software faults, and interacting with vendors.
  • Work closely with Leidos Engineering and Operations staff, as well as the customer's application owners, to solve technical problems at the network, system, and application levels.
  • Lead the team in all areas of telemetry and observability.
  • Be responsible and accountable for managing and following up on incidents, changes, and application release problems through the management channels.
  • Participate in on-call rotation and respond to incident alerts.
  • Focus on proactivity and enablement of self-healing systems.
  • Serve as the expert in creating KPIs and alerting thresholds for meaningful metrics on the health and performance of the applications the team manages.
  • Must be a team player, but able to work independently on large, complex projects and assignments in a fast-paced environment.
  • Provide leadership in problem determination/analysis, isolating system problems utilizing diagnostic and system management tools.
  • Always provide professional and courteous service with excellent verbal and written communication skills.
  • Model inclusive leadership to teammates by building diversity into activities and meetings.

Basic Qualifications:

  • BS degree in computer science or an equivalent, highly technical discipline. Experience may be substituted in lieu of a degree.
  • 5+ years in technical engineering relative to the responsibilities of Cloud Engineering and Site Reliability Engineering.
  • Strong background designing, deploying, and maintaining Splunk in a large, distributed environment.
  • Experience with Splunk Search Processing Language (SPL).
  • Experience creating Splunk dashboards.
  • Experience with IT Service Intelligence (ITSI).
  • General understanding of Splunk knowledge objects (e.g. fields, lookups, macros, etc.).
  • Thorough understanding of coding best practices, including knowing how to code in a variety of languages in both a structured and an object-oriented way (e.g., Python, Golang, Ruby, C/C++).
  • Proficient in programming languages for automation (e.g., Python) and shell scripting (e.g., Bash).
  • Deep knowledge of version control (e.g., Git) and ability to create GitOps practices.
  • Extensive experience configuring and maintaining monitoring and alerting tools such as Nagios, CloudWatch, Grafana, Prometheus, and Splunk ITSI.
  • Proficient in incident management tools (e.g., Splunk On-Call, PagerDuty).
  • Experience with a variety of relational and non-relational databases/RDS (e.g., DynamoDB, MongoDB, Cosmos DB, PostgreSQL).
  • Strong and relevant experience in cloud technologies, cloud services, IaC, cloud storage, cloud networking and cloud security.
  • Strong knowledge and experience with Cloud IaaS, PaaS, and SaaS offerings.
  • Strong experience with automation and CI/CD tools (e.g., Argo, Jenkins, Travis, Ansible).
  • Knowledge of cloud-based security tools, best practices and policies including demonstrated experience protecting all layers of the application stack.
  • Proficient in DevOps tools, processes, and practices.
  • Well versed in the development and implementation of automation scripts and processes.
  • Knowledge of the Software Delivery Life Cycle (SDLC).
  • Excellent writing and verbal communication skills.
  • Ability to manage conflict effectively.
  • Ability to adapt and be productive in a fast-paced dynamic environment.
  • Excellent communication and collaboration skills supporting multiple stakeholders and business operations.
  • Self-starter, self-managed, and a team player.

Preferred Qualifications

  • Cloud certification (e.g., AWS Solutions Architect Associate, Azure Administrator).
  • Certification as a Splunk Certified Architect or Splunk Certified Admin.
  • Experience with setting up self-healing components within an application's infrastructure.
  • Agile-based knowledge and skill, including experience with Scrum ceremonies and work management tools (e.g., Jira, Confluence).
  • Security skills: knowledge of information assurance compliance and information security basics within CMS.

Required Clearance

  • Ability to obtain and maintain a Public Trust clearance.

All candidates supporting the CMS programs must have lived in the United States for at least three (3) of the last five (5) years to be considered.

Original Posting Date:

2024-07-24
While subject to change based on business needs, Leidos reasonably anticipates that this job requisition will remain open for at least 3 days with an anticipated close date of no earlier than 3 days after the original posting date as listed above.

Pay Range:

$81,250.00 - $146,875.00

The Leidos pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.

Job ID: Leidos-R-00140141
Employment Type: Full Time

Perks and Benefits

  • Health and Wellness
    • Health Insurance
    • Health Reimbursement Account
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
    • FSA
    • HSA
    • Pet Insurance
    • Mental Health Benefits
  • Parental Benefits
    • Birth Parent or Maternity Leave
    • Fertility Benefits
    • Adoption Assistance Program
    • Family Support Resources
  • Work Flexibility
    • Flexible Work Hours
    • Remote Work Opportunities
    • Hybrid Work Opportunities
  • Office Life and Perks
    • Company Outings
    • On-Site Cafeteria
    • Holiday Events
  • Vacation and Time Off
    • Paid Vacation
    • Paid Holidays
    • Personal/Sick Days
    • Volunteer Time Off
  • Financial and Retirement
    • 401(K) With Company Matching
    • Stock Purchase Program
    • Performance Bonus
    • Relocation Assistance
    • Financial Counseling
    • Profit Sharing
  • Professional Development
    • Tuition Reimbursement
    • Promote From Within
    • Mentor Program
    • Access to Online Courses
    • Lunch and Learns
    • Internship Program
    • Leadership Training Program