Santa Monica, California, United States | On-site | USD 150,000 - 200,000 yearly

favorited is hiring a Senior Site Reliability Engineer

About the Role

Favorited is hiring a Senior Site Reliability Engineer to ensure the reliability, scalability, and performance of the infrastructure powering our real-time platform. You will play a key role in building and maintaining systems that support high-traffic applications used by a rapidly growing global audience.

What You'll Do

  • Design, implement, and maintain highly reliable and scalable infrastructure supporting real-time applications.
  • Build automation and tooling to improve system reliability, deployment processes, and operational efficiency.
  • Develop and maintain monitoring, logging, and alerting systems to ensure high availability and rapid incident response.
  • Partner closely with engineering teams to improve service reliability, performance, and observability.
  • Support incident response, root cause analysis, and postmortems, ensuring learnings are incorporated into system improvements.
  • Optimize infrastructure for performance, cost efficiency, and scalability.
  • Manage and scale containerized environments using Docker, Kubernetes, and related orchestration technologies.
  • Help define and enforce reliability standards, SLOs, and operational best practices across engineering teams.
  • Continuously evaluate new infrastructure tools and practices to improve system resilience and developer productivity.

What We're Looking For

  • 6+ years of experience in Site Reliability Engineering, DevOps, or Infrastructure Engineering roles.
  • Experience managing infrastructure for large-scale systems supporting millions of users.
  • Strong expertise with cloud infrastructure, ideally Google Cloud Platform (GCP).
  • Hands-on experience with Kubernetes, container orchestration, and distributed systems.
  • Experience implementing monitoring and observability systems (Prometheus, Grafana, Datadog, or similar).
  • Strong scripting or programming experience in languages such as Python, Go, or TypeScript.
  • Deep understanding of reliability engineering practices including SLOs, SLIs, and incident management.
  • Strong collaboration skills and ability to work cross-functionally with engineering teams.

Nice to Have

  • Experience supporting real-time streaming, gaming, or large-scale consumer applications.
  • Familiarity with event-driven architectures and large-scale data processing systems.
  • Experience optimizing infrastructure costs in high-growth environments.

Technical Stack

  • Google Cloud Platform (GCP)
  • Kubernetes, Docker
  • Prometheus, Grafana, Datadog
  • Python, Go, TypeScript

Benefits & Compensation

  • Base salary: $150k - $200k + equity (options)
  • Unlimited PTO
  • 401(k) plan
  • Comprehensive health insurance
  • Paid company holidays

Work Mode

This role is on-site in Santa Monica, California.

Favorited is an equal opportunity employer.

Required Skills
Google Cloud Platform (GCP), Kubernetes, Docker, Prometheus, Grafana, Datadog, Python, Go, TypeScript, Site Reliability Engineering, DevOps, Infrastructure Engineering, Distributed Systems, Monitoring, Observability
About company
favorited
At favorited, we believe that digital communities should be more than just spaces to watch content. Our platform is a place to connect, engage, and play, and empowers creators by enhancing audience participation and fostering deeper connections.
Job Details
Department: Information Technology
Category: Infrastructure
Posted: 2 months ago