Team Introduction
Our team plays a crucial role in the data ecosystem of the TikTok Recommendation System, building offline and real-time data storage solutions for large-scale recommendation, search, and advertising businesses that serve over 1 billion users. The team's core goals are high system reliability, uninterrupted service, and smooth data processing. We are committed to building storage and computing infrastructure that adapts to diverse data sources and storage requirements, ultimately providing efficient, cost-effective, and easy-to-use data storage and management tools for the business.
Responsibilities
1. Architecture Design and Implementation: Design and implement offline and real-time data architectures for large-scale recommendation, search, and advertising systems based on Paimon and Flink. Ensure efficient data processing and storage that meets the business's strict requirements for data timeliness and accuracy.
2. System Construction and Optimization: Design and implement flexible, scalable, stable, and high-performance storage systems and computing models, using Paimon as the storage foundation combined with Flink's powerful computing capabilities. Continuously optimize system performance to keep pace with business growth.
3. Troubleshooting and Stability Assurance: Be responsible for troubleshooting production systems. For problems that arise in the Paimon-Flink architecture during operation, design and implement the necessary mechanisms and tools, such as data consistency assurance and exception recovery, to ensure overall production-system stability.
4. Distributed System Construction: Build industry-leading distributed systems, including offline and online storage based on Paimon and batch and stream processing frameworks based on Flink, providing solid, reliable infrastructure for massive data volumes and large-scale business systems.
Qualifications
Minimum Qualifications:
1. A bachelor's degree or above in computer science, software engineering, or related fields, with more than 2 years of experience in building scalable systems.
2. Technical Skills:
- Paimon-Flink Technology Stack: Have a thorough understanding of Paimon and Flink, and be able to understand and use them at the source-code level. Experience customizing or extending either system is preferred.
- Data Lake Technology: Have an in-depth understanding of at least one data lake technology (such as Paimon), with practical implementation and customization experience; please highlight this in your resume.
- Storage Knowledge: Be familiar with the principles of HDFS, and knowledge of columnar storage formats such as Parquet and ORC is preferred.
- Programming Languages: Be proficient in programming languages such as Java, C++, and Scala, with strong coding and problem-solving abilities.
3. Project Experience: Have experience in data warehouse modeling and be able to design efficient data models that meet complex business scenarios.
- Experience in using other big-data systems/frameworks (such as Hive, HBase, Kudu, etc.) is preferred.
4. Comprehensive Qualities: Be willing to take on complex problems and to explore problems without clear solutions.
5. Be passionate about learning new technologies and able to quickly master and apply them in practice.
6. Experience in handling large-scale data (PB-level and above) is preferred.