Lead the technical direction of full-stack systems in a high-stakes environment where reliability and correctness are non-negotiable. As a Staff Full Stack Engineer, you will own architectural decisions, drive AI integration across engineering workflows, and ensure systems operate securely in nuclear and utility settings—both in the cloud and on isolated networks.
What You'll Do
- Lead design and code reviews for distributed systems, focusing on safety, maintainability, and typed interface contracts.
- Architect resilient backend services with retry logic, circuit breakers, dead-letter queues, and backpressure controls.
- Build and scale AI-powered features—from retrieval pipelines to secure on-prem deployments—and integrate them into existing engineering tooling.
- Diagnose and resolve production incidents, including deep dives into error tracking, performance issues, and infrastructure failures.
- Implement AI-assisted automation for testing, code review, log analysis, and postmortem generation.
- Define safe, auditable uses of AI copilots across development workflows, balancing speed with compliance and traceability.
- Establish SLOs, error budgets, and telemetry standards that support compliance with SOC 2 and ISO 27001 frameworks.
- Collaborate with cybersecurity teams to enforce least-privilege access, identity management, and audit logging.
- Coach engineers through complex challenges, promoting a culture of precision, safety, and disciplined innovation.
- Run incident simulations, root cause analyses, and resilience testing to strengthen system readiness.
What We Require
- Proven experience delivering software in regulated or safety-critical domains such as energy, healthcare, or industrial systems.
- Strong hands-on skills in debugging distributed systems, designing APIs, and managing infrastructure topologies.
- Deep commitment to correctness—evident in practices like idempotent jobs, safe schema migrations, and secure deployment pipelines.
- Ability to communicate clearly during high-pressure incidents and guide teams toward resolution.
- Experience working across engineering, security, product, and operations to deliver mission-critical solutions.
- U.S. citizenship or permanent residency (required under Department of Energy export controls).
Preferred Background
- Experience with retrieval-augmented generation (RAG), vector search, feature stores, or LLM operations, including prompt management and evaluation.
- Familiarity with nuclear or utility industry systems such as Maximo or DevonWay.
- Background in compliance frameworks like SOC 2 or ISO 27001.
- Integration experience with Microsoft 365, enterprise identity providers, or operational data platforms.
Technology Environment
Our stack includes React, Python, FastAPI, PostgreSQL, Redis, RabbitMQ, Celery, Docker, and Kubernetes, with CI/CD via GitHub Actions. We monitor with Sentry, Netdata, and OpenTelemetry, test with pytest and Cypress, and enforce security through SBOM scanning, secrets management, hardened images, and secure update channels. Fleet telemetry, backup systems, and disaster recovery runbooks ensure operational continuity.
Work Model
This is a hybrid role based at our Phoenix headquarters, with at least 80% in-office presence and one designated remote day per week (typically Wednesday). The position supports critical infrastructure, requiring alignment with on-site operations and compliance protocols.
Compensation & Benefits
Salary ranges from $200,000 to $230,000, with an equity grant between 0.05% and 0.25%. We offer unlimited paid time off and comprehensive health, dental, and vision coverage. Engineering culture emphasizes technical rigor, cross-domain collaboration, and responsible adoption of AI in high-consequence environments.
