Responsibilities
About The Team:
The mission of the TikTok Eng-AI Innovation Center is to explore cutting-edge AGI technologies, including but not limited to LLMs and multi-modal LLMs (video/image/audio/text/code), so that machines can better understand user creations on the TikTok platform. For video, image, audio, and text, enhanced content understanding can improve the search and recommendation experience and help identify and defend against internet abuse and fraud on our platform more accurately. For code, our LLM aims to automatically reorganize and optimize the TikTok codebase and make code and coding more accessible to TikTok engineers.
Job Responsibilities
1. Participate in the design and implementation of a high-availability, scalable, distributed large-model machine learning platform to support the development and efficient iteration of large models for TikTok;
2. Explore cutting-edge technologies related to large-model engineering (LLMOps), covering areas such as data processing, model training, inference services, evaluation systems, automated orchestration, prompt engineering, and resource scheduling;
3. Construct a high-performance, cost-effective large-model inference service architecture that ensures high service availability;
4. Explore the application of frontier large models, Code Agents, etc.
Qualifications
Minimum Qualification
- Bachelor's degree or higher in Computer Science or related fields, with good communication and teamwork skills;
- Solid programming foundation and good coding style, with familiarity with multi-threaded programming, distributed computing, network communication, memory management, and design patterns;
- Experience in engineering R&D or infrastructure, with proficiency in at least one of the following development languages: C/C++, Python, Golang, and experience researching and developing distributed systems, including optimizing system performance;
- Experience in developing and deploying large-scale models, including data processing, training, deployment, evaluation, and DevOps/MLOps;
- Experience in large-model service deployment and optimization, and with distributed scheduling, computing, and storage projects such as K8s, Ray, Hadoop, Spark, and HDFS;
- Familiarity with deep learning frameworks such as TensorFlow and PyTorch, and an understanding of large-model engineering frameworks such as vLLM and LangChain.
Preferred Qualifications
- Passion for technology, good communication skills, and team spirit.