Role Overview
In this role, you will build and refine the engineering systems that support application development, deployment automation, and system observability. Working fully remotely, you will help shape reliable, scalable solutions across cloud infrastructure, containerized environments, and monitoring platforms.
Key Responsibilities
- Build and manage automated CI/CD pipelines using GitHub Actions to streamline development workflows
- Apply Git best practices for version control, including branching models, pull request reviews, and repository automation
- Develop backend services and internal tools using TypeScript and Python
- Package and optimize applications using Docker, following container security and efficiency standards
- Deploy and manage workloads on Kubernetes, including configuration, scaling, and issue resolution
- Provision and maintain cloud resources and services in Microsoft Azure
- Implement observability through centralized logging, metrics collection, and distributed tracing
- Configure and manage monitoring agents and integrations with Datadog
- Organize and analyze log data using Splunk, ensuring efficient storage and fast retrieval
- Design dashboards and alerting rules to proactively detect and respond to system issues
- Lead incident triage efforts and conduct root cause analysis for production events
- Enhance system reliability, performance, and engineering processes through continuous improvement
- Work closely with both technical and non-technical team members to support stable system delivery
- Keep documentation updated as systems evolve
Required Qualifications
- Minimum of 3 years of hands-on engineering experience
- Proven experience with CI/CD automation, particularly with GitHub Actions
- Strong command of Git workflows, repository management, and collaboration practices
- Development experience using TypeScript or Python
- Practical knowledge of Docker and container optimization techniques
- Working expertise with Kubernetes, including deployment patterns and troubleshooting
- Familiarity with Microsoft Azure services and infrastructure management
- Experience setting up monitoring and alerting systems using Datadog and Splunk
- Understanding of logging strategies, metric collection, and dashboard and alert creation
- Comfort navigating and debugging systems from the command line
- Ability to document technical systems clearly and keep documentation current
- Demonstrated ability to work independently, take ownership, and solve problems proactively
- Clear and effective communication skills for both technical and general audiences
- Experience providing timely updates during incidents and maintaining transparency
- Strong habits in asynchronous collaboration, including code reviews, issue tracking, and team messaging
Technology Environment
GitHub Actions, Git, TypeScript, Python, Docker, Kubernetes, Microsoft Azure, Datadog, Splunk, CI/CD pipelines, containerization, observability, logging, monitoring, alerting, command-line interface
Work Environment
This role is fully remote, with team members distributed globally. We support flexible schedules within a collaborative framework, emphasizing clear communication and accountability. Our culture values inclusion, innovation, and personal growth, fostering an environment where diverse perspectives are welcomed and every voice is heard.


