Responsibilities
- Build and maintain distributed systems that handle petabyte- to exabyte-scale data, supporting web crawling, preprocessing, classification, and multimodal pipelines across CPU and GPU environments.
- Design high-throughput search and retrieval systems capable of processing trillions of documents using vector, hybrid, and semantic methods, integrated with large language models for accurate, real-time knowledge extraction.
- Create robust inference serving platforms with load balancing, auto-scaling, key-value caching, batching, and fault tolerance; monitoring via Prometheus and Grafana; CI/CD pipelines using Buildkite and ArgoCD; and performance benchmarking to ensure consistent uptime and low latency.
- Improve low-level system performance by optimizing CUDA kernels (including GEMM and attention), developing Triton and CUTLASS extensions, and applying techniques such as quantization, distillation, and speculative decoding, alongside co-design of models and hardware for future architectures.
- Advance compiler and runtime technologies for machine learning frameworks such as JAX/XLA/MLIR, add custom support for next-generation GPUs, and build distributed profiling and debugging tools, while exploring high-speed interconnects, including copper and optical solutions, SerDes, photonics, topology simulation, and vendor roadmaps.
- Orchestrate complex computing workloads across multiple clusters and cloud environments using Kubernetes, ensure data traceability and integrity, validate high-speed network fabrics, and implement telemetry, automation, and failure analysis systems to maintain production reliability.
Benefits
- Total compensation includes base pay; equity; full medical, vision, and dental coverage; 401(k) plan access; short- and long-term disability insurance; life insurance; and additional discounts and employee perks.
Compensation
Base salary is part of a broader rewards package including equity and benefits.
Work Arrangement
Not specified
Team
Not specified
Other
- All team members must communicate well and convey technical concepts clearly and concisely to colleagues.
- A strong work ethic and the ability to prioritize tasks efficiently are essential.


