Role: Lead position with primary skill sets in AWS services, with experience in EC2, S3, Redshift, RDS, AWS Glue/EMR, Python, PySpark, SQL, Airflow, visualization tools & Databricks.
Responsibilities:
- Design and implement data modeling, data ingestion, and data processing for various datasets
- Design, develop, and maintain the ETL framework for various new data sources
- Migrate existing Talend ETL workflows into the new ETL framework using AWS Glue/EMR and PySpark, and/or build data pipelines using Python
- Build orchestration workflows using Airflow (see the sketch after this list)
- Develop and execute ad hoc data ingestion to support business analytics
- Proactively interact with vendors on any questions and report status accordingly
- Explore and evaluate tools/services to support business requirements
- Help create a data-driven culture and impactful data strategies
- Show aptitude for learning new technologies and solving complex problems
- Connect with customers to gather requirements and ensure timely delivery
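The Airflow and Glue/PySpark work described above could, at its simplest, look like the sketch below: an Airflow DAG that triggers a PySpark-based AWS Glue job once a day. This is an illustrative sketch only, assuming Airflow 2.4+ with the apache-airflow-providers-amazon package; the DAG id, Glue job name, S3 script path, and IAM role are hypothetical placeholders, not values from this role.

```python
# Minimal sketch: orchestrate a daily PySpark-based AWS Glue job from Airflow.
# Assumes Airflow 2.4+ and the apache-airflow-providers-amazon package.
from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

with DAG(
    dag_id="daily_sales_ingestion",  # hypothetical DAG id
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    tags=["etl", "glue"],
) as dag:
    # Submit the Glue job and wait for it to finish before marking the task done.
    run_sales_etl = GlueJobOperator(
        task_id="run_sales_etl",
        job_name="sales_etl_job",  # hypothetical Glue job name
        script_location="s3://example-bucket/scripts/sales_etl.py",  # placeholder path
        iam_role_name="example-glue-service-role",  # placeholder role
        wait_for_completion=True,
    )
```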
Qualifications:
- Minimum of a bachelor's degree, preferably in Computer Science, Information Systems, or Information Technology.
- Minimum 8 years of experience with cloud platforms such as AWS, Azure, or GCP.
- Minimum 8 years of experience with Amazon Web Services such as VPC, S3, EC2, Redshift, RDS, EMR, Athena, IAM, Glue, DMS, Data Pipeline & API, Lambda, etc.
- Minimum 8 years of experience in ETL and data engineering using Python, AWS Glue, AWS EMR/PySpark, and Talend, with Airflow for orchestration.
- Minimum 8 years of experience with SQL, Python, and source control such as Bitbucket, plus CI/CD for code deployment.
- Experience in PostgreSQL, SQL Server, MySQL & Oracle databases.
- Experience with MPP platforms such as AWS Redshift and EMR.
- Experience in distributed programming with Python, Unix scripting, MPP, and RDBMS databases for data integration.
- Experience building distributed, high-performance systems using Spark/PySpark and AWS Glue, and developing applications for loading/streaming data into databases such as Redshift.
- Experience in Agile methodology
- Proven ability to write technical specifications for data extraction and to produce good-quality code.
- Experience with big data processing techniques using Sqoop, Spark, and Hive is an additional plus.
- Experience with analytics and visualization tools.
- Experience designing data solutions on Databricks, including Delta Lake, data warehouses, data marts, and other solutions to support the organization's analytics needs (see the sketch after this list).
- Should be an individual contributor with experience in the above-mentioned technologies.
- Should be able to lead the offshore team, ensuring on-time delivery, code reviews, and work management among team members.
- Should have experience in customer communication.
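The Databricks/Delta Lake design work mentioned above might, in its simplest form, resemble the sketch below: a PySpark job that reads raw files from S3, performs light cleanup, and persists a curated Delta table for downstream marts and analytics. This is an assumption-laden illustration; the bucket paths, column names, and application name are placeholders, not part of the role description.

```python
# Minimal sketch: land raw S3 data into a curated Delta Lake table with PySpark.
# Assumes a Databricks (or delta-spark enabled) cluster; all paths/columns are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_to_delta").getOrCreate()

# Read raw CSV files landed in S3 (hypothetical location; schema inferred for brevity).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-bucket/raw/orders/")
)

# Light cleanup before persisting to the curated layer.
curated = (
    raw.dropDuplicates(["order_id"])
       .withColumn("ingested_at", F.current_timestamp())
)

# Write a Delta table that data marts and BI/visualization tools can query.
(
    curated.write
    .format("delta")
    .mode("overwrite")
    .save("s3://example-bucket/curated/orders/")
)
```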