System Debug Lead Engineer

Company: NVIDIA Corporation

Location: Santa Clara, CA 95051

Description:

NVIDIA has been transforming computer graphics, PC gaming, and accelerated computing for more than 25 years. It's a unique legacy of innovation that's fueled by great technology and amazing people. Today, we're tapping into the unlimited potential of AI to define the next era of computing, an era in which our GPU acts as the brains of computers, robots, and self-driving cars that can understand the world. Doing what's never been done before takes vision, innovation, and the world's best talent. As an NVIDIAN, you'll be immersed in a diverse, supportive environment where everyone is inspired to do their best work. Come join the team and see how you can make a lasting impact on the world.

We are seeking a highly skilled and experienced System Debug Lead Engineer to join our team. This individual will play a critical role in root cause analysis and resolution of complex issues across our server and data center platforms, involving electrical, mechanical, thermal, and firmware components. The ideal candidate brings a multidisciplinary approach to system-level problem-solving and thrives in a dynamic, collaborative environment.

What you'll be doing:
  • Lead cross-functional debug of critical system-level issues across hardware (electrical, mechanical, thermal) and firmware domains
  • Perform hands-on root cause analysis of failures observed during system validation, production, or in-field operation
  • Collaborate with electrical engineers, mechanical engineers, thermal engineers, firmware developers, and manufacturing teams to develop and validate fixes
  • Develop and implement detailed tests and experiments to identify and recreate intricate issues
  • Develop tools, scripts, and automation to accelerate debug workflows
  • Provide technical guidance and mentoring to cross-functional teams
  • Document debug procedures, findings, and mitigation plans clearly and concisely
  • Drive corrective and preventive action processes with suppliers and internal teams


What we need to see:
  • Bachelor's or Master's degree in Electrical Engineering, Computer Engineering, or related fields (or equivalent experience)
  • 10+ years of experience in system-level debug in data center/server or similar hardware platforms
  • Deep understanding of system architecture including CPUs, memory, high-speed I/Os, power delivery, thermal management, and firmware
  • Proven experience in debugging electrical issues (e.g., signal integrity, power integrity), thermal issues (e.g., airflow, cooling design), and firmware anomalies (e.g., BIOS/UEFI, BMC)
  • Proficiency in tools such as oscilloscopes, logic analyzers, and firmware debug equipment/utilities
  • Strong communication skills and the ability to work across fields and teams
  • Experience working with contract manufacturers and suppliers is a plus


NVIDIA offers competitive salaries and benefits. Our teams are made up of experienced and talented individuals, and our engineering teams are growing quickly. If you're a creative engineer passionate about technology, we welcome your application!

The base salary range is 160,000 USD - 304,750 USD. Your base salary will be determined based on your location, experience, and the pay of employees in similar positions.

You will also be eligible for equity and benefits. NVIDIA accepts applications on an ongoing basis.

NVIDIA is committed to fostering a diverse work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.
