Responsibilities
TikTok is the leading destination for short-form mobile video. At TikTok, our mission is to inspire creativity and bring joy. TikTok's global headquarters are in Los Angeles and Singapore, and its offices include New York, London, Dublin, Paris, Berlin, Dubai, Jakarta, Seoul, and Tokyo.
Why Join Us
Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.
Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day.
To us, every challenge, no matter how difficult, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always.
At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve.
Join us.
Unlocking the secrets of ByteDance's global tech empire, the Data Systems Infrastructure (DSI) team stands as the unseen architects behind the scenes. In a thrilling dance of technology and innovation, we propel the company's meteoric rise by constructing and orchestrating colossal data fortresses, taming the life cycle of server fleets, conjuring Cloud solutions, and crafting a symphony of infrastructure services. Our mission is to ensure scalability and unwavering reliability, making sure ByteDance's digital footprint leaves an indelible mark on the world.
Embark on an exciting expedition to explore the rapidly expanding ByteDance domain in the United States, Europe, and Asia. Here, the Data Systems Infrastructure (DSI) team is crafting monumental data citadels that encircle the planet, sheltering legions of hundreds of thousands of servers. As the maestro of our production systems, you will embark on a captivating odyssey, taming the life cycles of these servers. Your adventure will begin with the orchestration of their initial deployment, navigating the intricate terrain of OS installation, summoning services like a digital magician, and maintaining vigilant watch over our inventory. But, like any epic tale, there will be times of challenge when you become a troubleshooter extraordinaire, mending and restoring with unwavering dedication. Eventually, you'll guide them into the sunset, orchestrating their decommissioning and ensuring their rebirth through recycling, all while contributing to the pulsating rhythm of ByteDance's technological evolution.
We are looking for talented individuals to join our team in 2025. As a graduate, you will get unparalleled opportunities to kickstart your career, pursue bold ideas, and explore limitless growth opportunities. Co-create a future driven by your inspiration with TikTok.
Successful candidates must be able to commit to an onboarding date by end of year 2025.
Applications will be reviewed on a rolling basis. We encourage you to apply as early as possible.
Candidates can apply to a maximum of two positions and will be considered for jobs in the order they apply. The application limit applies to jobs at TikTok and its affiliates globally.
Responsibilities:
- Operation: As a Production Systems Engineer, your task is to contribute to improving the quality, reliability, efficiency, effectiveness, and scalability of our data center operations, platform, and service globally.
- Lifecycle Improvement: Engage in and enhance the entire lifecycle of Infrastructure systems - from system design to launch, deployment, operation, and refinement.
- Monitoring: Provide tools and solutions to monitor the availability, latency, and overall health of services, servers, infrastructure, and networks. Collect, store, analyze, or present the relevant data.
- Automation: Provide tools and solutions to enhance the automation, reliability, scalability, and operability of services.
- Incident Response and Disaster Recovery: Participate in our cross-region on-call rotation and incident response teams to solve critical problems in production. Troubleshoot and resolve critical technical issues in a high-pressure, time-sensitive setting. Carry out high-level root-cause analysis for service disruptions and establish preventive measures. Implement sustainable incident response and postmortem procedures.
- Cross-team Collaboration: Partner with stakeholders such as infrastructure architects, project managers, data center operations engineers, platform developers, supply chain teams, and our internal customers to understand overarching business objectives. You will also have the opportunity to design and implement innovative solutions for our Core IDCs, CDN/Edge, and Cloud Services.
- Technical Documentation: Create and maintain standard operating procedures and knowledge bases.
Qualifications
Minimum Qualifications:
- Education: Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
- Experience: Have a good understanding of computer science and basic network technology. Be proficient in configuring and troubleshooting common open-source components, and possess strong problem-solving capabilities.
- Server/Computer Hardware: Intermediate understanding of server/computer architecture, hardware evaluation, validation, diagnostics, and break-fix procedures.
- Linux: Proficiency in Linux, including knowledge of basic commands, system management, and scripting.
- Coding: Fluency in at least one industry-standard language (e.g., Shell/Bash, Python, Golang, Java, JavaScript, C++) for scripting and automation.
- Soft Skills: A strong sense of responsibility, meticulous attention to detail, enthusiasm for work, and good team spirit; strong learning ability, sound analytical and summarizing skills, and excellent documentation skills.
Preferred Qualifications:
- Data Center: Familiarity with data center server operations, including OS installations, break-fix processes, and involvement in high-impact projects such as planning and operations for new design-build facilities or renovations of existing systems.
- Monitoring: Experience with orchestrating tools and designs for monitoring server health, network switches, power, and temperature conditions within the data center.
- Network: Understanding of network operations and infrastructure, including knowledge of TCP/IP protocols, routing, switching, and troubleshooting network issues.
- Automation: Proven experience in automation, demonstrated by at least one project focused on optimizing processes.
- Agile Methodologies: Experience with Agile methodologies (e.g., Kanban, Scrum) and tools such as Jira, including user stories, sprint planning, and backlog management.
TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.
TikTok is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at https://shorturl.at/cdpT2
Job Information
[For Pay Transparency] Compensation Description (annually)
The base salary range for this position in the selected city is $71,820 - $104,400 annually.
Compensation may vary outside of this range depending on a number of factors, including a candidate's qualifications, skills, competencies and experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units.
Benefits may vary depending on the nature of employment and the country work location. Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short-term and long-term disability coverage, life insurance, wellbeing benefits, among others. Employees also receive 10 paid holidays per year, 10 paid sick days per year and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).
The Company reserves the right to modify or change these benefits programs at any time, with or without notice.
For Los Angeles County (unincorporated) Candidates:
Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state, and local laws including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Our company believes that criminal history may have a direct, adverse and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment:
1. Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues;
2. Appropriately handling and managing confidential information including proprietary and trade secret information and access to information technology systems; and
3. Exercising sound judgment.