Job title
Data Engineer
Reports to
Platform Product Owner
Position Location
Bangalore - India
About our Company
Schneider Electric is the global specialist in energy management and automation. With revenues of ~€25 billion in FY2018, our 144,000+ employees serve customers in over 100 countries, helping them to manage their energy and process in ways that are safe, reliable, efficient and sustainable. From the simplest of switches to complex operational systems, our technology, software and services improve the way our customers manage and automate their operations. Our connected technologies reshape industries, transform cities and enrich lives.
At Schneider Electric, we call this Life Is On.
Job purpose
At GTS-Group Data Platforms of Schneider Electric, we are building the Intel Data Store (IntelDS), a global Data Lake for enterprise data. It is a Big Data platform fully hosted on AWS and connected today to more than 40 data sources.
The job purpose is to support the big data engineering team building and improving IntelDS by:
• Connecting new sources to enrich the data scope of the platform.
• Designing and developing new features, based on consumer application requests, to ingest data into the different layers of IntelDS.
• Automating the integration and delivery of data objects and data pipelines.
Direct reports
• Not applicable
Duties and responsibilities
The duties and responsibilities of this job are to prepare data and make it available in an efficient and optimized format for our different data consumers, ranging from BI and analytics to data science applications. The role requires working with the technologies currently used by IntelDS, in particular Apache Spark, Lambda & Step Functions, Glue Data Catalog, and RedShift in an AWS environment. This includes:
- Design and develop new data ingestion patterns into the IntelDS Raw and/or Unified data layers, based on the requirements for connecting new data sources or building new data objects. Working with ingestion patterns allows the data pipelines to be automated (a minimal illustration follows this list).
- Participate in and apply DevSecOps practices by automating the integration and delivery of data pipelines in a cloud environment. This can include the design and implementation of end-to-end data integration tests and/or CI/CD pipelines.
- Analyse existing data models, and identify and implement performance optimizations for data ingestion and consumption. The objective is to accelerate data availability within the platform and to consumer applications.
- Support client applications in connecting and consuming data from the platform, and ensure they follow our guidelines and best practices.
- Participate in the monitoring of the platform and debugging of detected issues and bugs.
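For illustration only, below is a minimal sketch of the kind of batch ingestion pattern into a raw data layer described above, written with PySpark (Python and Spark being part of the stack named in this posting). The bucket names, paths and source identifiers are hypothetical placeholders, not actual IntelDS resources or interfaces.

# Illustrative sketch only: generic batch ingestion into a raw data layer.
# All paths and names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("raw-layer-ingestion").getOrCreate()

# Read a daily extract delivered by a source system (placeholder path).
source_df = spark.read.json("s3://example-landing-bucket/sales/2024-01-01/")

# Minimal standardization before landing in the raw layer:
# add ingestion metadata so downstream layers can trace lineage.
raw_df = (
    source_df
    .withColumn("ingestion_ts", F.current_timestamp())
    .withColumn("source_system", F.lit("sales_crm"))
)

# Write as partitioned Parquet so the dataset can be catalogued
# (for example in the Glue Data Catalog) and queried by consumers.
(
    raw_df.write
    .mode("append")
    .partitionBy("source_system")
    .parquet("s3://example-raw-bucket/sales/")
)

In practice such a pattern would be parameterized per source and orchestrated through services like Step Functions and Lambda, with the resulting tables registered in the Glue Data Catalog for RedShift and other data consumers.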
Qualifications
Minimum of 3 years' prior experience as a data engineer, with proven experience of Big Data and Data Lakes in a cloud environment.
Bachelor's or Master's degree in computer science or applied mathematics (or equivalent).
Qualifications include:
- Proven experience working with data pipelines / ETL / BI regardless of the technology.
- Proven experience working with AWS, including at least 3 of: RedShift, S3, EMR, CloudFormation, DynamoDB, RDS, Lambda.
- Big Data technologies and distributed systems: one of Spark, Presto or Hive.
- Python: scripting and object-oriented programming.
- Fluency in SQL for data warehousing (RedShift in particular is a plus).
- Familiarity with Git, Linux and CI/CD pipelines is a plus.
- Strong systems/process orientation with demonstrated analytical thinking, organization skills and problem-solving skills.
- Ability to self-manage, prioritize and execute tasks in a demanding environment.
- Strong consultancy orientation and experience, with the ability to form collaborative, productive working relationships across diverse teams and cultures, are a must.
- Willingness and ability to train and teach others.
- Ability to facilitate meetings and follow up with resulting action items.
Travel %
• Not applicable
About Schneider Electric
Schneider Electric is leading the Digital Transformation of Energy Management and Automation in Homes, Buildings, Data Centers, Infrastructure and Industries.
With a global presence in over 100 countries, Schneider is the undisputed leader in Power Management - Medium Voltage, Low Voltage and Secure Power, and in Automation Systems. We provide integrated efficiency solutions, combining energy, automation and software.
In our global Ecosystem, we collaborate with the largest Partner, Integrator and Developer Community on our Open Platform to deliver real-time control and operational efficiency.
We believe that great people and partners make Schneider a great company and that our commitment to Innovation, Diversity and Sustainability ensures that Life Is On everywhere, for everyone and at every moment. www.schneider-electric.com
Schedule: Full-time
Req: 0091NX