
Data for LLMs - Software Engineer Intern: 2025

AT IBM

Albany, NY

Introduction
Want to be a part of preparing and governing data for IBM's Granite models? We are a group of scientists, engineers, and designers working on the state-of-the-art Data and Model Factory that produces all of IBM's Granite models. Our work enables and accelerates the entire data pipeline, from data clearance and acquisition to quality-focused data engineering. These data are used in pre-training, fine-tuning, instruction-tuning, and RAG solutions powered by IBM Granite. We thrive on open-source innovation, responsible use of data and AI, and collaboration across disciplines, including backend engineering, data science, distributed computing, and natural language processing, among others.

Your Role and Responsibilities
This posting is for a summer 2025 internship running May - August, or June - September for students at quarter-system schools.

During your internship, you can expect to work on challenging engineering problems, often involving large-scale data and models, and to produce cutting-edge technology in a diverse and nurturing research environment. You'll learn and practice how to define problems, build prototypes, test hypotheses, and deploy results. In the past, interns have contributed to open-source projects, built functioning systems and prototypes, and published their results as papers or patents.

Required Technical and Professional Expertise

  • Applicants should be enrolled in a Bachelor's-level program and have a background in a science, technology, engineering, or mathematics discipline.
  • Experience with programming languages and frameworks such as Python, C++, Spark, and Ray.
  • Your areas of experience should include foundation models, machine learning, AI, natural language processing, data engineering, cloud computing, database management, and other computer science and engineering topics.

Preferred Technical and Professional Expertise

  • Your areas of interest should include foundation models, machine learning, AI, natural language processing, data engineering, cloud computing, database management, and other computer science and engineering topics.

Client-provided location(s): Albany, NY, USA; San Jose, CA, USA; Cambridge, MA, USA; Yorktown Heights, NY 10598, USA
Job ID: IBM-21085769
Employment Type: Intern
