
Data Engineer

Cummins

Pune, India

DESCRIPTION

Although the role category specified in the GPP is Remote, this position requires a Hybrid work arrangement.

Key Responsibilities:

  • Implement and automate deployment of distributed systems for ingesting and transforming data from various sources (relational, event-based, unstructured).
  • Continuously monitor and troubleshoot data quality and integrity issues.
  • Implement data governance processes and methods for managing metadata, access, and retention for internal and external users.
  • Develop reliable, efficient, scalable, and quality data pipelines with monitoring and alert mechanisms using ETL/ELT tools or scripting languages (a minimal PySpark sketch follows this list).
  • Develop physical data models and implement data storage architectures as per design guidelines.
  • Analyze complex data elements and systems, data flow, dependencies, and relationships to contribute to conceptual, physical, and logical data models.
  • Participate in testing and troubleshooting of data pipelines.
  • Develop and operate large-scale data storage and processing solutions using distributed and cloud-based platforms (e.g., Data Lakes, Hadoop, Hbase, Cassandra, MongoDB, Accumulo, DynamoDB).
  • Use agile development technologies, such as DevOps, Scrum, Kanban, and continuous improvement cycles, for data-driven applications.
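
As a rough illustration of the pipeline work described above, the sketch below shows a minimal batch ETL job in PySpark: it ingests raw JSON, applies a simple data-quality gate, and writes partitioned Parquet. This is not part of the posting; the paths, column names, and the 1% threshold are all hypothetical.

    # Minimal batch ETL sketch in PySpark (all paths and columns are hypothetical).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("orders-etl-sketch").getOrCreate()

    # Ingest: read raw JSON dropped by an upstream system.
    raw = spark.read.json("/data/raw/orders/")

    # Transform: deduplicate, normalize the timestamp, derive a partition column,
    # and keep only rows with a usable key.
    clean = (
        raw.dropDuplicates(["order_id"])
           .withColumn("order_ts", F.to_timestamp("order_ts"))
           .withColumn("order_date", F.to_date("order_ts"))
           .filter(F.col("order_id").isNotNull())
    )

    # Simple data-quality gate: fail loudly if too many amounts are missing.
    total = clean.count()
    missing = clean.filter(F.col("amount").isNull()).count()
    if total > 0 and missing / total > 0.01:
        raise ValueError(f"Data quality check failed: {missing}/{total} rows missing amount")

    # Load: write partitioned Parquet for downstream consumers.
    clean.write.mode("overwrite").partitionBy("order_date").parquet("/data/curated/orders/")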

RESPONSIBILITIES

Qualifications:

  • College, university, or equivalent degree in a relevant technical discipline, or relevant equivalent experience required.
  • This position may require licensing for compliance with export controls or sanctions regulations.

Competencies:

  • System Requirements Engineering: Uses appropriate methods and tools to translate stakeholder needs into verifiable requirements.
  • Collaborates: Builds partnerships and works collaboratively with others to meet shared objectives.
  • Communicates effectively: Develops and delivers multi-mode communications that convey a clear understanding of the unique needs of different audiences.
  • Customer focus: Builds strong customer relationships and delivers customer-centric solutions.
  • Decision quality: Makes good and timely decisions that keep the organization moving forward.
  • Data Extraction: Performs ETL activities against a variety of sources and transforms the data for consumption by downstream applications and users.
  • Programming: Creates, writes, and tests computer code, test scripts, and build scripts using industry standards and tools.
  • Quality Assurance Metrics: Applies measurement science to assess whether a solution meets its intended outcomes.
  • Solution Documentation: Documents information and solutions based on knowledge gained during product development activities.
  • Solution Validation Testing: Validates configuration item changes or solutions using defined best practices.
  • Data Quality: Identifies, understands, and corrects flaws in data to support effective information governance.
  • Problem Solving: Solves problems using systematic analysis processes and industry-standard methodologies.
  • Values differences: Recognizes the value that different perspectives and cultures bring to an organization.

QUALIFICATIONS

Knowledge/Skills:

Must-Have:

  • 3-5 years of experience in data engineering with a strong background in Azure Databricks and Scala/Python.
  • Hands-on experience with Spark (Scala/PySpark) and SQL.
  • Experience with Spark Streaming, Spark internals, and query optimization (a streaming-to-Delta sketch follows this list).
  • Proficiency in Azure Cloud Services.
  • Agile Development experience.
  • Unit testing of ETL pipelines.
  • Experience creating ETL pipelines with ML model integration.
  • Knowledge of Big Data storage strategies (optimization and performance).
  • Critical problem-solving skills.
  • Basic understanding of Data Models (SQL/NoSQL) including Delta Lake or Lakehouse.
  • Quick learner.
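
To make the Spark Streaming and Delta Lake items above concrete, here is a minimal Structured Streaming sketch (referenced in the Spark Streaming bullet). It uses Spark's built-in "rate" source in place of a real event stream and writes to a Delta table with a checkpoint; the paths are hypothetical, and outside Databricks the delta-spark package would need to be installed and configured.

    # Structured Streaming sketch: built-in rate source -> Delta table (illustrative only).
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    # On Databricks a Delta-enabled SparkSession already exists; elsewhere this assumes
    # delta-spark has been installed and wired into the session configuration.
    spark = SparkSession.builder.appName("streaming-delta-sketch").getOrCreate()

    # The "rate" source generates rows continuously, standing in for a real event feed.
    events = (
        spark.readStream.format("rate")
             .option("rowsPerSecond", 10)
             .load()
             .withColumn("event_date", F.to_date("timestamp"))
    )

    # Append the stream to a Delta table; the checkpoint enables restart and recovery.
    query = (
        events.writeStream.format("delta")
              .outputMode("append")
              .option("checkpointLocation", "/tmp/checkpoints/events")  # hypothetical path
              .start("/tmp/tables/events")                              # hypothetical path
    )

    query.awaitTermination(30)  # run briefly for the sketch
    query.stop()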

Nice-to-Have:

  • Understanding of the ML lifecycle.
  • Exposure to Big Data open-source technologies.
  • Experience with Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka.
  • SQL query language proficiency.
  • Experience with clustered compute cloud-based implementations.
  • Familiarity with developing applications requiring large file movement for a cloud-based environment.
  • Exposure to Agile software development.
  • Experience building analytical solutions (a short Spark SQL example follows this list).
  • Exposure to IoT technology.
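
As a small example of the SQL and analytical items above (referenced in the analytical-solutions bullet), the sketch below registers an in-memory DataFrame as a temporary view and runs a plain Spark SQL aggregation; the sample data and column names are invented.

    # Spark SQL sketch: a small analytical aggregation over an in-memory DataFrame.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("spark-sql-sketch").getOrCreate()

    # Invented sample data standing in for a real fact table.
    sales = spark.createDataFrame(
        [("IN", "engines", 1200.0), ("IN", "filters", 300.0), ("US", "engines", 2500.0)],
        ["country", "product_line", "revenue"],
    )
    sales.createOrReplaceTempView("sales")

    # Plain SQL over the temporary view: revenue per country, highest first.
    summary = spark.sql("""
        SELECT country, SUM(revenue) AS total_revenue
        FROM sales
        GROUP BY country
        ORDER BY total_revenue DESC
    """)
    summary.show()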

Experience:

  • Relevant experience preferred, such as working in temporary student employment, internships, co-ops, or other extracurricular team activities.
  • Knowledge of the latest technologies in data engineering is highly preferred, including:

    • Exposure to Big Data open-source technologies
    • Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka, or equivalent college coursework
    • SQL query language
    • Clustered compute cloud-based implementation experience
    • Familiarity with developing applications requiring large file movement for a cloud-based environment
    • Exposure to Agile software development
    • Exposure to building analytical solutions
    • Exposure to IoT technology

Work Schedule:

Most of the work will be with stakeholders in the US, requiring a 2-3 hour overlap with EST hours on an as-needed basis.

Job: Systems/Information Technology

Organization: Cummins Inc.

Role Category: Remote

Job Type: Exempt - Experienced

ReqID: 2410605

Relocation Package: No

Client-provided location(s): Pune, Maharashtra, India
Job ID: Cummins-R-B44A47B3940645C6A4DB00CE6DC6A0B9
Employment Type: Other

Perks and Benefits

  • Health and Wellness

    • FSA With Employer Contribution
    • Health Reimbursement Account
    • On-Site Gym
    • HSA With Employer Contribution
    • Health Insurance
    • Dental Insurance
    • Vision Insurance
    • Life Insurance
    • Short-Term Disability
    • Long-Term Disability
  • Parental Benefits

    • Non-Birth Parent or Paternity Leave
    • Birth Parent or Maternity Leave
  • Work Flexibility

    • Flexible Work Hours
    • Remote Work Opportunities
  • Office Life and Perks

    • Company Outings
    • Casual Dress
  • Vacation and Time Off

    • Leave of Absence
    • Personal/Sick Days
    • Paid Holidays
  • Financial and Retirement

    • Relocation Assistance
    • Performance Bonus
    • Stock Purchase Program
    • Pension
    • 401(k) With Company Matching
  • Professional Development

    • Mentor Program
    • Shadowing Opportunities
    • Access to Online Courses
    • Lunch and Learns
    • Tuition Reimbursement