Data Engineer/Integration Lead (AWS)

AT Infosys

Mexico City, Mexico

Required Qualifications:

  • 5+ years of experience in data engineering using Python, with a focus on AWS S3, EMR, Glue, Step Functions, Apache NiFi, and Spark.
  • Proven track record of building scalable data pipelines in cloud environments.
  • Proficiency in flow design, processors, and data provenance in Apache NiFi.
  • Strong expertise in Spark, Hadoop, and distributed computing on AWS EMR.
  • In-depth knowledge of AWS services (S3, Glue, Redshift, RDS, Lambda, Step Functions).
  • Experience with data formats (JSON, CSV, Parquet, Avro) and transformation techniques.
  • Strong problem-solving skills and ability to troubleshoot complex data processing issues.
  • Excellent communication skills with the ability to document and explain technical details clearly.
Preferred Qualifications:

  • AWS Certified Solutions Architect or Data Analytics Specialty.
  • Experience with data governance frameworks and compliance requirements.
  • Familiarity with CI/CD pipelines and version control (GitLab, Jenkins).

Key Responsibilities:
Design & Develop Data Pipelines:
  • Architect and implement end-to-end data pipelines using AWS S3, EMR, Glue, Step Functions, Apache NiFi, and Spark.
  • Manage data ingestion processes from AWS S3, ensuring secure and efficient data transfer.
  • Implement initial data routing, validation, and transformations using Apache NiFi processors and Spark data engines.
Data Processing & Transformation:
  • Integrate AWS EMR, Apache NiFi, and Spark to perform complex data transformations and analytics.
  • Optimize Spark jobs for processing large-scale datasets with a focus on performance and resource utilization.
  • Handle both historical and incremental data loads, ensuring data consistency and integrity.
Data Storage & Management:
  • Define and implement data storage strategies across S3, RDS, and Redshift, adhering to business requirements.
  • Manage data catalog creation and schema management using AWS Glue.
Automation & Orchestration:
  • Develop and manage workflows using Apache Airflow and AWS Step Functions to automate data processing tasks.
  • Implement monitoring, error handling, and retries within the orchestration framework.
Security & Compliance:
  • Ensure data security with encryption (AES-256, TLS) and IAM role-based access controls.
  • Implement data governance policies using AWS Glue Data Catalog to ensure compliance with regulatory requirements.
Performance Monitoring & Optimization:
  • Use AWS CloudWatch to monitor the performance of EMR clusters, NiFi flows, and data storage.
  • Continuously optimize Spark job configurations and NiFi data flows for maximum throughput and minimal latency.

Client-provided location(s): Mexico City, CDMX, Mexico
Job ID: Infosys-122859BR
Employment Type: Other

Perks and Benefits

  • Health and Wellness

    • Health Insurance
    • Life Insurance
    • HSA
    • Short-Term Disability
  • Parental Benefits

    • Birth Parent or Maternity Leave
    • Non-Birth Parent or Paternity Leave
    • On-site/Nearby Childcare
  • Office Life and Perks

    • Commuter Benefits Program
  • Vacation and Time Off

    • Paid Vacation
    • Paid Holidays
    • Personal/Sick Days
    • Sabbatical
  • Financial and Retirement

    • 401(K)
    • Relocation Assistance
  • Professional Development

    • Learning and Development Stipend
  • Diversity and Inclusion

    • Employee Resource Groups (ERG)