Location: Remote, United States of America. Salary: USD 125,000 - 180,000 yearly.

Nebius is hiring a Senior Hardware Support Engineer

About the Role

Role Overview

We are seeking a Senior Hardware Support Engineer to lead hardware reliability efforts in high-scale, production-critical environments. This individual will serve as a technical authority for diagnosing and resolving complex hardware and firmware issues, ensuring sustained system availability and performance. The role bridges engineering, operations, and vendor teams, focusing on root cause identification, corrective action, and long-term platform improvements.

Key Responsibilities

  • Lead in-depth investigations into hardware and firmware failures affecting production systems
  • Identify recurring failure patterns and drive systemic fixes to improve fleet reliability
  • Serve as the primary escalation point for critical hardware incidents impacting service
  • Collaborate with hardware vendors to expedite diagnostics, replacements, and firmware updates
  • Work closely with internal engineering to validate solutions and prevent future issues
  • Conduct pre-deployment testing of hardware and firmware for large-scale rollouts
  • Apply structured problem-solving frameworks to diagnose and document technical issues
  • Support on-site teams during urgent hardware events with technical guidance
  • Enhance tools and processes for monitoring, tracking, and reporting hardware failures
  • Contribute to strategic initiatives that improve long-term hardware resilience

Required Qualifications

  • Proven experience supporting server hardware in large-scale data center environments
  • Track record of diagnosing hardware and firmware root causes in production systems
  • Strong knowledge of server subsystems including CPU, memory, storage, power, and BMC
  • Experience managing technical escalations with hardware vendors and engineering teams
  • Proficiency in structured troubleshooting using incident or problem management practices
  • Ability to analyze logs, telemetry, and error data to identify failure trends
  • Experience coordinating with field operations during critical hardware events
  • Capacity to manage multiple high-impact investigations simultaneously
  • Clear and effective communication skills in cross-functional settings

Preferred Qualifications

  • Background in GPU-intensive, AI, or high-performance computing infrastructure
  • Experience managing firmware updates and validation at scale
  • Familiarity with Linux systems and infrastructure automation tools
  • History of improving hardware reliability metrics across large fleets

Benefits

  • Comprehensive medical, dental, and vision insurance
  • 401(k) plan with employer contribution
  • Flexible paid time off policy
  • Paid parental leave
  • Support for professional growth and development

Work Environment

This role supports remote work within the United States. Occasional travel may be necessary for on-site coordination or urgent hardware events. The position includes participation in incident response rotations for production-impacting issues.

Compensation

Base salary range: $125,000 – $180,000 per year, with an annual performance-based bonus. Equity may be offered based on experience and role scope.

Company Culture

The team values initiative, technical depth, and innovation. We operate in a collaborative, fast-moving environment focused on advancing AI cloud infrastructure. The organization supports global expansion and encourages contributions to long-term technological advancement.

Required Skills

  • server hardware
  • data center operations
  • root cause analysis
  • hardware failure diagnosis
  • firmware troubleshooting
  • BMC configuration
  • CPU/memory/storage networking
  • incident management
  • vendor coordination
  • production issue resolution

About company
Nebius
Nebius is leading a new era in cloud computing to serve the global AI economy. It creates tools and resources for customers to solve real-world challenges without massive infrastructure costs.
Job Details
Department: Engineering
Category: Infrastructure
Posted 2 months ago