SRE Manager - OLAP Engine (Bytehouse)

TikTok

Singapore

Responsibilities

About TikTok
TikTok is the leading destination for short-form mobile video. Our mission is to inspire creativity and bring joy. TikTok has global offices including Los Angeles, New York, London, Paris, Berlin, Dubai, Singapore, Jakarta, Seoul, and Tokyo.

Why Join Us
Creation is the core of TikTok's purpose. Our platform is built to help imaginations thrive. This is doubly true of the teams that make TikTok possible.
Together, we inspire creativity and bring joy - a mission we all believe in and aim towards achieving every day.
To us, every challenge, no matter how difficult, is an opportunity; to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always.
At TikTok, we create together and grow together. That's how we drive impact - for ourselves, our company, and the communities we serve.

Join us.

About the team
TikTok and its affiliates are developing the next-generation high-performance analytical database, with a mission to enable efficient, real-time, data-driven decision-making on PB-level data sets. The initial product was forked from ClickHouse, after which it underwent a major re-architecture. The product now not only improves on the efficiency of ClickHouse but also fits into elastic cloud-native infrastructure with better scalability and resource utilization. After years of refinement in internal EB-level scenarios, we are now ready to serve our business partners via various cloud vendors.

Our software engineers in product infrastructure roles combine software and systems engineering disciplines to run high-performance, large-scale distributed infrastructure. This means you will be deeply involved in the development lifecycle of critical software services, collaborating closely with product engineers and combining software code with systems knowledge to ensure that cloud-native OLAP engines are reliable, fault-tolerant, efficiently scalable and cost-effective. You will also leverage your software engineering expertise to develop platforms and tools that optimise the operational and engineering efficiency of complex systems at scale, with a particular focus on improving the systems' observability, performance and maintainability.

In this role, you will:

- Build and manage the Global SRE team, including recruitment, training of new talent, system operation, maintenance and coordination, and team culture building.
- Improve cooperation mechanisms across teams, time zones and regions, and provide SRE solutions that fit actual business scenarios and business direction.
- Own SRE team planning and project management, guide day-to-day SRE work to be more effective, and improve overall SRE efficiency.
- Develop process specifications and plans for compliant access, configuration, disaster recovery and fault handling of critical paths of overseas SRE services.
- Continuously improve the core SRE capabilities of the OLAP engine in efficiency, cost, quality, security, etc.
- Develop automation, data visualization and automated monitoring processes to facilitate the optimization of the cloud-native OLAP engine infrastructure.
- Drive the design and engineering of tools, as well as platform solutions, to optimize product engineering and operation efficiencies.
- Manage on-call processes to respond to performance and reliability issues, and establish best practices for coordinating escalation to resolve issues and minimize downtime.

Qualifications

- Bachelor's degree or above in Computer Science or a related technical discipline, and good English communication skills.
- Familiar with SRE-related processes and with industry trends in SRE technology, with a strong ability to build an SRE system; 6+ years of SRE experience, with big-data or OLAP engine SRE experience preferred.
- Familiar with SRE technologies, including Kubernetes, Terraform, Ansible, Bash scripting, etc.
- Familiar with the cloud computing technologies of Amazon Web Services, Google Cloud Platform and other providers.
- Expertise in the operation, deployment, troubleshooting, high availability and quality assurance of large-scale distributed systems, with a strong focus on stability and performance.
- Possesses a strong sense of responsibility, a proactive team spirit, and a strong ability to comprehensively analyze and solve problems.

TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At TikTok, our mission is to inspire creativity and bring joy. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

Client-provided location(s): Singapore
Job ID: TikTok-7295618768326592818
Employment Type: Other

Perks and Benefits

  • Health and Wellness

    • Health Insurance
    • Dental Insurance
    • Vision Insurance
    • HSA
    • Life Insurance
    • Fitness Subsidies
    • Short-Term Disability
    • Long-Term Disability
    • On-Site Gym
    • Mental Health Benefits
    • Virtual Fitness Classes
  • Parental Benefits

    • Fertility Benefits
    • Adoption Assistance Program
    • Family Support Resources
  • Work Flexibility

    • Flexible Work Hours
    • Hybrid Work Opportunities
  • Office Life and Perks

    • Casual Dress
    • Snacks
    • Pet-friendly Office
    • Happy Hours
    • Some Meals Provided
    • Company Outings
    • On-Site Cafeteria
    • Holiday Events
  • Vacation and Time Off

    • Paid Vacation
    • Paid Holidays
    • Personal/Sick Days
    • Leave of Absence
  • Financial and Retirement

    • 401(K) With Company Matching
    • Performance Bonus
    • Company Equity
  • Professional Development

    • Promote From Within
    • Access to Online Courses
    • Leadership Training Program
    • Associate or Rotational Training Program
    • Mentor Program
  • Diversity and Inclusion

    • Diversity, Equity, and Inclusion Program
    • Employee Resource Groups (ERG)
