Introduction
At IBM, work is more than a job - it's a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you've never thought possible. Are you ready to lead in this new era of technology and solve some of the world's most challenging problems? If so, let's talk.
Your Role and Responsibilities
We are looking for a Data Engineer to join our dynamic, global team. This role requires strong expertise in Spark, databases, and cloud technologies to design, implement, and optimize scalable data solutions that support business requirements. The ideal candidate will have experience with data pipelines, database management, and testing methodologies in a distributed, fast-paced environment.
Responsibilities:
- Design, develop, and optimize scalable data pipelines and architectures using Apache Spark and databases.
- Work with cloud platforms (IBM Cloud) to build robust data solutions and ensure seamless integration.
- Collaborate closely with cross-functional teams to gather and understand business requirements and translate them into efficient data models and workflows.
- Implement and maintain data structures, ensuring the integrity, availability, and scalability of the data.
- Develop and implement testing strategies for data validation, ensuring high data quality and system performance.
- Support the team in data analysis and insights to enable data-driven decision-making.
- Contribute to documentation and best practices to facilitate knowledge sharing within the global team.
- Participate in regular syncs with the worldwide team, ensuring clear communication and alignment across time zones.
- Develop elegant, flexible, maintainable, and scalable solutions to complex problems that deliver data and analytics.
- Lead development efforts and coordinate work for junior developers.
- Define and consolidate development and modeling practices within a team.
- Leverage strong collaborative skills to define requirements, develop plans, and deliver iteratively to support our organization's mission.
- Leverage strong analytical skills to understand business processes and propose effective data solutions.
- Communicate and manage relationships with business and development teams - provide guidance, mentorship, and direction, as required.
- Translate business needs into technical requirements.
- Learn new tools, technologies, and processes for continuous improvement.
- As part of a self-directed team, take ownership of activities and deliver them on-time and with quality.
Required Technical and Professional Expertise
Above all, we value curiosity, teamwork, and a desire to learn. We are confident that if you possess the right attitude, work ethic, and skill set, you can succeed in the role, even if you do not meet every one of the requirements below.
- 5+ years of experience in Data Engineering with Big Data.
- Comfortable multi-tasking and working as part of a global team, as well as providing technical leadership and taking ownership.
- Adaptive to ambiguity and willing to change in a fast-paced environment.
- Advanced proficiency in Python, Scala, and PySpark.
- Advanced experience with SQL for complex queries and data manipulation.
- Expertise in developing and maintaining data pipelines using Apache Spark and Scala.
- Experience making continuous documentation improvements and maintaining clear and concise technical documentation.
- Strong skills in data modeling, including designing complex, dimensional data models.
- Strong understanding of cloud platforms, particularly IBM Cloud and Cloud Object Storage.
- Strong understanding and application of clean code principles and best development practices.
- Proficiency in data validation, testing, and ensuring data quality.
- Proficiency in creating and managing workflows with Apache Airflow and Argo Workflows.
- Experience in test-driven development (TDD) and continuous improvement of code quality.
- Familiarity with CI/CD tools like Tekton/Jenkins, plus experience in setting up and maintaining CI/CD pipelines.
- Experience in data engineering or business intelligence roles contributing to a shared codebase.
- Experience with data structures, algorithms, software design and writing software in Python, Scala, Java, or similar.
- Experience with relational databases including writing and optimizing SQL queries and designing schema.
- Experience using continuous integration and deployment systems (e.g., Cloud Build, GitLab, Jenkins).
- Experience with OpenShift / Kubernetes.
Preferred Technical and Professional Expertise
- Experience in one or more of the following: data protection, disaster recovery, storage, relational databases (Postgres, MySQL), NoSQL databases (Cloudant, MongoDB), messaging queues (Kafka), K-V stores (Redis), and storage technologies (block/file/object/CSI).
- Experience writing unit tests (e.g., PyTest, Selenium, JUnit).
- Basic understanding of machine learning.