Minimum qualifications:
- Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science, a related field, or equivalent practical experience.
- 3 years of experience with computer architecture concepts, including microarchitecture, cache hierarchy, pipelining, and memory subsystems.
Preferred qualifications:
- Master's degree or PhD in Electrical Engineering, Computer Engineering, or Computer Science, with a focus on computer architecture.
- Experience with Machine Learning Accelerators (e.g., Machine Learning Software models or accelerator architectures).
- Experience in Machine Learning algorithms (e.g., recommendation systems, Natural Language Processing (NLP), image processing).
- Experience in architecting and optimizing compilers.
- Knowledge of Machine Learning compiler flows (e.g., XLA for TensorFlow).
About the job
Be part of a diverse team that pushes boundaries, developing custom silicon solutions that power the future of Google's direct-to-consumer products. You'll contribute to the innovation behind products loved by millions worldwide. Your expertise will shape the next generation of hardware experiences, delivering unparalleled performance, efficiency, and integration.

Google's mission is to organize the world's information and make it universally accessible and useful. Our team combines the best of Google AI, Software, and Hardware to create radically helpful experiences. We research, design, and develop new technologies and hardware to make computing faster, seamless, and more powerful. We aim to make people's lives better through technology.
Responsibilities
- Build up tools, flows, and dashboards for Tensor Processing Unit (TPU) power/performance analysis.
- Analyze important Machine Learning workloads, evaluate their power and performance, and propose architecture or compiler improvements.
- Analyze the microarchitecture of the TPU, engage with the implementation team, and propose power or performance optimization opportunities.
- Collaborate with cross-functional teams to improve end-to-end workload analysis flows, including debugging and tracing.