Lead the technical direction of machine learning systems by designing, training, and deploying large-scale models that process complex biomedical data. This role demands deep expertise in model architecture, training infrastructure, and scientific problem-solving to build intelligent systems with real-world impact.
Key Responsibilities
- Develop and evaluate large-scale models, including Large Language Models, diffusion architectures, and Graph Neural Networks
- Own the full lifecycle of training pipelines, from data loading and batching to distributed training and checkpoint management
- Make informed decisions on model design, loss functions, optimization techniques, and scaling behavior
- Build and refine distributed training systems using data and model parallelism, sharding, and mixed-precision methods
- Work closely with data engineering teams to shape ML-ready datasets and streaming interfaces
- Transform ambiguous scientific or product goals into reliable, scalable machine learning solutions
- Lead evaluation efforts, including ablation studies and iterative improvements focused on generalization and reproducibility
- Guide architectural choices around model serving, inference performance, and lifecycle management
- Provide technical leadership through design reviews, mentoring, and cross-functional collaboration
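To give a concrete flavor of the distributed-training work above, here is a toy sketch of how data-parallel sharding assigns interleaved slices of a dataset to workers. All names are illustrative (this mirrors the idea behind samplers like PyTorch's `DistributedSampler`, not any specific API):

```python
# Toy sketch of data-parallel sharding: each worker (identified by its rank)
# receives an interleaved slice of the dataset indices, so that together the
# workers cover the whole dataset with no overlap. Illustrative only.

def shard_indices(num_samples: int, world_size: int, rank: int) -> list[int]:
    """Return the dataset indices assigned to one worker."""
    return list(range(rank, num_samples, world_size))

def all_shards(num_samples: int, world_size: int) -> list[list[int]]:
    """Compute every worker's shard for inspection."""
    return [shard_indices(num_samples, world_size, r) for r in range(world_size)]

if __name__ == "__main__":
    for rank, idx in enumerate(all_shards(num_samples=10, world_size=4)):
        print(f"rank {rank}: {idx}")
```

In practice each rank wraps its shard in a data loader and gradients are averaged across workers after each backward pass; the sketch only shows the partitioning step.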
Required Qualifications
- Minimum of 5 years of professional experience in machine learning or applied AI
- Proven track record of training large models in production environments, beyond experimental prototypes
- Hands-on experience with LLMs, diffusion models, or GNNs
- Strong proficiency in PyTorch or comparable deep learning frameworks
- Deep knowledge of distributed training techniques, including parallelism and performance tuning
- Experience managing large datasets and high-throughput data pipelines
- Solid software engineering practices: writing clean, testable code and debugging at scale
- Ability to articulate technical trade-offs clearly to both technical and non-technical audiences
Preferred Qualifications
- Experience with reinforcement learning, fine-tuning, or preference-based optimization such as RLHF
- Familiarity with model compression, distillation, or inference optimization techniques
- Production deployment experience with low-latency inference systems
- Background in multimodal learning or foundation models
- Work history in fast-paced R&D or startup environments
- Contributions to open-source ML tools or research codebases
Technical Environment
PyTorch, Large Language Models, diffusion models, Graph Neural Networks, distributed training, data and model parallelism, sharding, mixed precision, high-throughput data pipelines
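As one example of the mixed-precision work in this stack, the sketch below illustrates why loss scaling matters: small gradients underflow the limited range of fp16, so the loss is multiplied by a scale factor before backprop and the gradients are unscaled before the optimizer step. The underflow threshold and scale factor here are illustrative stand-ins, not real fp16 arithmetic:

```python
# Toy illustration of loss scaling in mixed-precision training.
# fp16 cannot represent very small magnitudes, so tiny gradients flush to
# zero; scaling the loss keeps them in representable range.

SCALE = 2.0 ** 10  # illustrative loss-scale factor

def fp16_representable(x: float) -> float:
    """Crude stand-in for fp16 underflow: tiny values flush to zero."""
    return 0.0 if 0 < abs(x) < 6e-8 else x

def scaled_grad(raw_grad: float) -> float:
    # Backprop through a scaled loss multiplies every gradient by SCALE,
    # which keeps small gradients above the underflow threshold...
    g = fp16_representable(raw_grad * SCALE)
    # ...and the optimizer unscales them before applying the update.
    return g / SCALE
```

Without scaling, a gradient of 1e-8 would flush to zero; with scaling it survives the round trip, which is the effect production tools (e.g. gradient scalers in PyTorch's AMP) automate.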
Compensation & Work Environment
- Competitive salary with meaningful equity participation
- Opportunity to shape foundational ML systems applied to real scientific challenges
- High autonomy and minimal bureaucracy, with emphasis on technical rigor
- Flexible remote or hybrid work model
- Collaborative environment working alongside data, infrastructure, and scientific teams
Our Culture
We value deep technical expertise and empower engineers with ownership over model design and training strategy. Our teams thrive on autonomy and intellectual challenge, and we believe inclusive, diverse perspectives lead to stronger science and better systems. We are committed to equal opportunity and welcome applicants from all backgrounds.
