Introduction
Want to be a part of preparing and governing data for IBM's Granite models? We are a group of scientists, engineers, and designers working on the state-of-the-art Data and Model Factory that produces all of IBM's Granite models. Our work enables and accelerates the entire data pipeline, from data clearance and acquisition to engineering. These data are used in pre-training, fine-tuning, instruction-tuning, and RAG solutions powered by IBM Granite. We thrive on open-source innovation, responsible use of data and AI, and collaboration across disciplines, including backend engineering, data science, distributed computing, and natural language processing, among others.
Your Role and Responsibilities
This is a 2025 summer internship with the following date ranges: May to August, or June to September for students at quarter-system schools.
Your responsibilities include:
- Conduct cutting-edge research on large language models (LLMs).
- Design and develop new architectures, algorithms, and approaches to improve language models.
- Collaborate with cross-functional teams (engineering, product) to integrate models into production systems.
- Publish research in top-tier conferences like NeurIPS, ICML, and ACL.
Required Technical and Professional Expertise
- Applicants should be enrolled in a Master's or PhD program in Computer Science, Machine Learning, or a related field.
- Expertise in NLP, deep learning, and transformer architectures.
- Strong experience with PyTorch or TensorFlow.
- Proven record of publications in top machine learning or AI conferences.
Preferred Technical and Professional Expertise
- Your areas of interest should include foundation models, machine learning, AI, natural language processing, data engineering, cloud computing, database management, and other computer science and engineering topics.