Taguig City, Philippines | Remote (Global)

Acquireai is hiring a Site Reliability Engineer

Acquireai is looking for a Site Reliability Engineer to serve as the guardian of our production systems. You will ensure the reliability, scalability, and performance of our IoT telemetry platform by defining SLOs, automating operational processes, and building the infrastructure and tooling that enable our engineering teams to deploy with confidence.

What You'll Do

  • Define, monitor, and enforce Service Level Objectives (SLOs) and error budgets across all production systems.
  • Track error budget burn rates and make data-driven decisions to halt risky deployments when thresholds are exceeded.
  • Implement comprehensive monitoring and alerting strategies using Prometheus, Grafana, and PagerDuty.
  • Design and implement Infrastructure as Code (IaC) solutions using Pulumi with TypeScript.
  • Manage and optimize AWS services, including EKS (Elastic Kubernetes Service), MSK (Managed Streaming for Apache Kafka), and S3, along with our SingleStore and MongoDB data stores.
  • Automate operational processes to eliminate toil, targeting any task that consumes more than 2 engineer-days per quarter.
  • Serve as incident commander during production outages and service degradations.
  • Lead comprehensive post-mortem processes within 48 hours of incidents and drive 'never-again' corrective actions to completion.
  • Maintain and improve incident response procedures and runbooks.
  • Implement and enforce least-privilege IAM policies across all AWS resources.
  • Manage security patch pipelines and vulnerability remediation processes.
  • Support compliance initiatives, including SOC 2 and ISO 27001 certification requirements.
  • Participate in a follow-the-sun on-call rotation, with a one-week primary/secondary commitment every five weeks.
  • Provide 24×7 support coverage across AU/NZ, EU/ZA, and MX time zones.
  • Maintain operational runbooks and knowledge transfer documentation.

What We're Looking For

  • Proven experience defining and enforcing Service Level Objectives (SLOs) and error budgets in a production environment.
  • Deep hands-on experience with monitoring and alerting tools like Prometheus and Grafana.
  • Expertise in Infrastructure as Code using Pulumi, Terraform, or similar tools.
  • Strong experience managing and optimizing AWS services, particularly EKS and MSK.
  • Proficiency in a scripting or programming language such as TypeScript, Python, or Go.
  • Experience automating operational workflows and eliminating manual toil.
  • Demonstrated ability to lead incident response and post-mortem processes.
  • Strong knowledge of cloud security best practices, including IAM policy management and vulnerability remediation.
  • Experience supporting SOC 2, ISO 27001, or similar compliance frameworks.
  • Willingness to participate in a global on-call rotation.

Technical Stack

  • Monitoring & Alerting: Prometheus, Grafana, PagerDuty
  • Infrastructure as Code: Pulumi, TypeScript
  • Cloud Platform: AWS
  • Core Services: EKS (Elastic Kubernetes Service), MSK (Managed Streaming for Apache Kafka), SingleStore, MongoDB, S3

Work Mode

This is a global, remote position. Candidates should be located in, and authorized to work from, a region within the AU/NZ, EU/ZA, or MX time zones to support our follow-the-sun on-call model.

Acquireai is an equal opportunity employer.

Required Skills
Prometheus, Grafana, PagerDuty, Pulumi, TypeScript, AWS, EKS, MSK, SingleStore, MongoDB, Kubernetes, Infrastructure as Code, Monitoring, Alerting, Cloud Architecture
About company
Acquireai
We’re an award-winning global outsourcer providing contact center and back office services on behalf of our global clients.
Job Details
Department: Information Technology
Category: Infrastructure
Posted: 2 months ago