SoC RAS Design Tech Lead, Machine Learning Accelerators

Google

Sunnyvale, CA

Minimum qualifications:

  • Bachelor's degree in Electrical Engineering, Computer Engineering, Computer Science, a related field, or equivalent practical experience.
  • 10 years of experience with industry-standard tools, languages and methodologies relevant to the development of silicon-based ICs and chips.
  • 3 years of experience working with system and hardware teams in defining the RAS requirements and architecture.
  • Experience in computer architecture, logic design and leading block or subsystem level RTL development.

Preferred qualifications:

  • Master's degree or PhD in Electrical Engineering, Computer Engineering or Computer Science, with an emphasis on computer architecture, or a related field.
  • 12 years of experience in SoC architecture and design, including 6 years of experience architecting and designing RAS features.
  • Experience with SoC subsystem-level logic redundancy design and test architecture.
  • Understanding of circuit-level SER (Soft Error Rate) modeling, measurement and mitigation techniques.
  • Understanding of error coding techniques and experience designing ECC implementations.
  • Understanding of SDC, DUE and DCE, and the associated metrics, analysis and calculations.

    About the job

    Be part of a diverse team that pushes boundaries, developing custom silicon solutions that power the future of Google's direct-to-consumer products. You'll contribute to the innovation behind products loved by millions worldwide. Your expertise will shape the next generation of hardware experiences, delivering unparalleled performance, efficiency, and integration.

    In this role, you will join a team building SoC designs for our data center accelerators. As a RAS (Reliability, Availability, Serviceability) SoC Design Technical Lead, you will own and lead the requirement definition, architecture, microarchitecture and development of the SoC RAS features. This is a highly cross-functional role that requires a high level of coordination and co-design with our platform and system hardware counterparts. You will have experience in RAS, computer architecture and logic design, and a propensity for leading multi-faceted efforts involving many stakeholders.

    Behind everything our users see online is the architecture built by the Technical Infrastructure team to keep it running. From developing and maintaining our data centers to building the next generation of Google platforms, we make Google's product portfolio possible. We're proud to be our engineers' engineers and love voiding warranties by taking things apart so we can rebuild them. We keep our networks up and running, ensuring our users have the best and fastest experience possible.

    The US base salary range for this full-time position is $221,000-$314,000 + bonus + equity + benefits. Our salary ranges are determined by role, level, and location. The range displayed on each job posting reflects the minimum and maximum target salaries for the position across all US locations. Within the range, individual pay is determined by work location and additional factors, including job-related skills, experience, and relevant education or training. Your recruiter can share more about the specific salary range for your preferred location during the hiring process.

    Please note that the compensation details listed in US role postings reflect the base salary only, and do not include bonus, equity, or benefits. Learn more about benefits at Google.

    Responsibilities

    • Define the architecture and microarchitecture of the RAS features of TPU SoCs.
    • Lead the design and implementation of the RAS features.
    • Collaborate with the Platform team to co-design the SoC-level RAS requirements.
    • Set the DCE (Detectable and Correctable Errors), DUE (Detected but Unrecoverable Errors), SDC (Silent Data Corruption) and DPPM goals for TPUs.

    Client-provided location(s): Sunnyvale, CA, USA
    Job ID: Google-103048070085649094
    Employment Type: Full Time

    Perks and Benefits

    • Health and Wellness

      • Health Insurance
      • Dental Insurance
      • Vision Insurance
      • Life Insurance
      • Short-Term Disability
      • Long-Term Disability
      • FSA
      • HSA
      • Fitness Subsidies
      • On-Site Gym
      • Mental Health Benefits
      • Health Reimbursement Account
      • HSA With Employer Contribution
    • Parental Benefits

      • Birth Parent or Maternity Leave
      • Non-Birth Parent or Paternity Leave
      • Fertility Benefits
      • Adoption Assistance Program
      • Family Support Resources
      • Adoption Leave
    • Work Flexibility

      • Hybrid Work Opportunities
    • Office Life and Perks

      • Commuter Benefits Program
      • Casual Dress
      • Pet-friendly Office
      • Snacks
      • Some Meals Provided
      • On-Site Cafeteria
    • Vacation and Time Off

      • Paid Vacation
      • Paid Holidays
      • Personal/Sick Days
      • Leave of Absence
      • Volunteer Time Off
    • Financial and Retirement

      • 401(K) With Company Matching
      • Company Equity
      • Performance Bonus
      • Financial Counseling
    • Professional Development

      • Tuition Reimbursement
      • Internship Program
      • Learning and Development Stipend
    • Diversity and Inclusion

      • Employee Resource Groups (ERG)
