Location: San Francisco, United States of America (On-site)

Lambda is hiring a Staff Storage Engineer

About the Role

Lead the evolution of storage systems powering next-generation AI infrastructure. In this role, you'll define the strategic direction for storage architecture, ensuring scalability, performance, and reliability across multi-petabyte environments. Your expertise will directly influence how storage integrates with distributed GPU clusters running intensive machine learning workloads.

Key Responsibilities

  • Lead vendor selection and request-for-proposal processes, using performance data and technical analysis to guide decisions
  • Analyze AI and ML workload behaviors to shape storage design, tuning, and capacity planning
  • Drive operational improvements and coordinate deployment strategies across engineering teams
  • Collaborate with leadership to translate business requirements into technical specifications
  • Oversee engineering execution by delegating complex tasks and maintaining alignment with senior stakeholders

Requirements

  • Minimum of 8 years designing, deploying, and managing large-scale production storage systems
  • Proven experience with enterprise storage platforms including Vast, Weka, DDN, NetApp, or equivalent
  • Familiarity with file, block, and object storage architectures and use cases
  • Working knowledge of the NFS and SMB protocols and of POSIX-compliant file access
  • Hands-on experience with NVMe over Fabrics (NVMe-oF) on TCP, InfiniBand, or RoCE transports
  • Understanding of RDMA, GPUDirect Storage, and parallel file systems for high-throughput environments
  • Background in storage security, encryption, data reduction, and multi-tenancy models
  • Experience with backup, disaster recovery, and data protection frameworks
  • At least 5 years using Infrastructure as Code tools such as Terraform or Ansible

Preferred Qualifications

  • Experience with Kubernetes storage interfaces, including CSI and COSI drivers
  • Deep knowledge of storage performance analysis and optimization
  • Familiarity with public cloud storage services and networking (SDN, identity, distributed systems)
  • Track record deploying and managing Software Defined Storage solutions
  • Implementation experience with monitoring platforms for storage and related infrastructure

Required Skills

Vast, Weka, DDN, NetApp, Pure Storage, Dell, IBM, HPE, NFS, SMB, POSIX, File Storage, Block Storage, Object Storage, NVMe/TCP, NVMe/IB, NVMe/RoCE, Storage Engineering, Large-Scale Storage Systems

About the Company

Lambda, The Superintelligence Cloud, is a leader in AI cloud infrastructure serving tens of thousands of customers. The company builds and scales AI cloud infrastructure, including high-performance storage, networking, and compute systems for AI training and inference. Lambda's mission is to make compute as ubiquitous as electricity and give everyone the power of superintelligence.

Job Details

  • Category: Infrastructure
  • Posted: a month ago