Urgently hiring
Based on similar jobs in your market
Estimated Pay $64 per hour
Hours Full-time
Location Mountain View, CA

About this job

Job Description
About the role:
Own the reliability of the advanced packages and systems that turn our AI accelerator silicon into products that survive years in the field. You'll define how we qualify 2.5D/3D and heterogeneously integrated packages, model their physics of failure, drive root-cause analysis when things fail, and build the reliability engineering that lets us predict lifetime under real workloads. You'll sit at the seam between silicon/packaging and the systems our accelerators run in, partner closely with our OSATs, and own the answer to "will this hold up in the field, and for how long?"
What you'll do:
  • Reliability analysis & risk assessment: Conduct physics-of-failure modeling for advanced accelerator packaging; assess thermal, mechanical, and electrical stressors; define and execute stress-test protocols including thermal cycling, electromigration, HTOL, HAST/uHAST, and power cycling
  • Failure analysis & root cause: Lead failure-mode analysis using C-SAM, X-ray CT, SEM, TEM, FIB, and EBSD; identify cracking, voiding, electromigration, and stress-induced damage; drive corrective/preventive action (8D, FMEA)
  • Reliability physics & lifetime prediction: Build and apply models (Coffin-Manson, Arrhenius, Black's equation) and FEA-based stress simulation to predict field lifetime and FIT under real accelerator thermal and power profiles
  • OSAT management & collaboration: Partner with assembly and test providers on reliability improvements; define requirements, ensure JEDEC/IPC/IEEE/MIL-STD compliance, monitor OSAT performance, and support supplier audits and qualifications
  • System-level reliability: Assess thermal, mechanical, and electrical stress interactions across the package, the board, and the system the accelerator ships in; drive design-for-reliability into the package and the board-to-package interface together with the packaging, materials, SI/PI, and thermal teams
  • Cross-functional close-out: Develop design guidelines and reliability best practices, and own the reliability data presented to internal teams and customers
  • Fleet & data-center reliability: Translate package- and system-level reliability into fleet availability targets — AFR, FIT, MTBF/MTTR, and availability "nines"; drive detection and mitigation of silent data corruption (SDC) / silent data errors in production; close the loop from field telemetry, returns, and RMA back into design and qual (reliability growth); partner with data-center operations, SRE/hardware-ops, and customers on serviceability and uptime for large-scale training and inference
  • Use and develop AI-assisted / ML tool flows to accelerate failure analysis, lifetime modeling, and failure prediction
What we're looking for:
  • MS or Ph.D. in Materials Science, Mechanical Engineering, Electrical Engineering, Applied Physics, or related field
  • 5+ years in 2.5D/3D advanced packaging reliability
  • Deep command of physics-of-failure methodology and strong materials-science knowledge, particularly interconnects and interfaces
  • Proficiency in statistical reliability analysis (Weibull, lognormal, acceleration modeling; JMP, Minitab, or Python)
  • Hands-on failure analysis with C-SAM, X-ray CT, SEM, TEM, FIB, and EBSD
  • Proven track record driving OSAT/partner improvements and managing qualifications
  • Familiarity with JEDEC, IPC, IEEE, and MIL-STD standards
  • Heterogeneous integration, fan-out packaging, chiplet architectures, HBM, or silicon-photonics packaging
  • Electrical reliability mechanisms (electromigration, time-dependent dielectric breakdown (TDDB))
  • Design-for-reliability (DFR), prognostics, and health management for electronic systems
  • AI-driven reliability modeling or machine learning for failure prediction
  • High-power / high-current package reliability for accelerators or GPUs; customer-facing qualification experience
Compensation

Final offers depend on level, location, and skills relevant to the role. Additional compensation: equity grant per company guidelines; medical / dental / vision; 401(k); standard PTO.

Visa Sponsorship

DensityAI sponsors qualified candidates for H-1B, O-1, TN, E-3, and other employment-based visas, and we welcome applicants on F-1 OPT and STEM-OPT. Work authorization is required at start; we provide immigration support to secure or transfer status.

Export Controls

Aspects of this role may involve access to information subject to U.S. export controls (EAR/ITAR). We may discuss licensing or scope adjustments during the interview.

Equal Opportunity

DensityAI is an Equal Opportunity Employer. We do not discriminate on the basis of race, color, religious creed, national origin, ancestry, physical or mental disability, medical condition, genetic information, marital status, sex, gender, gender identity, gender expression, age (40+), sexual orientation, military or veteran status, pregnancy, or any other status protected by law. We comply with the California CROWN Act and provide reasonable accommodations on request.

Full compensation packages are based on candidate experience and relevant certifications.

California pay range
$220,000–$350,000 USD

Posting ID: 1271267404
Posted: 2026-07-01
Job Title: Hardware Reliability Engineer