Remote (Global)

Bjak is hiring an MLOps Engineer (Remote)

About the Role

The engineer will bridge machine learning and operations by building reliable pipelines, automating workflows, and ensuring model performance in production environments.

Responsibilities

  • Design and manage CI/CD pipelines for machine learning models
  • Develop infrastructure for automated model training and deployment
  • Monitor system performance and model behavior in production
  • Collaborate with data scientists to operationalize ML workflows
  • Implement version control for models, data, and code
  • Optimize resource allocation for training and inference workloads
  • Ensure reproducibility across development and production environments
  • Integrate logging, monitoring, and alerting for ML systems
  • Support secure and compliant deployment practices
  • Troubleshoot issues in distributed ML environments
  • Maintain documentation for ML pipelines and systems
  • Scale infrastructure to handle growing model demands
  • Work with containerization and orchestration tools
  • Apply infrastructure-as-code principles
  • Improve model latency and throughput
  • Enforce access controls and authentication for ML services
  • Evaluate new tools and frameworks for MLOps efficiency
  • Contribute to incident response for production outages
  • Ensure data quality and consistency in training pipelines
  • Automate testing for model performance and data drift

Nice to Have

  • Experience with MLflow or similar MLOps platforms
  • Knowledge of TensorFlow or PyTorch deployment
  • Familiarity with feature stores
  • Experience in regulated industries
  • Contributions to open-source MLOps projects
  • Advanced degree in computer science or related field

Compensation

Competitive salary with performance-based incentives

Work Arrangement

Fully remote with flexible hours

Team

Collaborative engineering team focused on machine learning systems

Tech Stack

  • Primary languages: Python, Bash
  • Cloud: AWS and GCP
  • Containerization: Docker, Kubernetes
  • CI/CD: GitLab CI
  • Infrastructure as Code: Terraform
  • Monitoring: Prometheus, Grafana
  • Model Serving: TensorFlow Serving, TorchServe
  • Data Pipeline Tools: Apache Airflow

Culture & Values

  • Emphasis on transparency and open communication
  • Commitment to continuous learning and improvement
  • Support for remote collaboration and asynchronous workflows
  • Focus on work-life balance and sustainable pace
  • Inclusive environment that values diverse perspectives

Available for qualified candidates

Required Skills

  • Kubernetes
  • MLOps
  • Machine Learning
  • LLM
  • Cloud Infrastructure
  • Distributed Systems
  • Model Deployment

About the Company
Bjak
Bjak is focused on providing access to affordable and sustainable financial services for people in ASEAN. Headquartered in Malaysia, Bjak is the largest insurance portal in Southeast Asia. Its main portal, Bjak.com, helps millions find the insurance policy with the best value and highest coverage. The company invests in technologies such as custom APIs, trading systems, and data science to enable easy access to financial services.
Job Details
Category: Other
Posted: 7 months ago