Role Overview
We need an experienced AI Engineer to drive the development of advanced video intelligence systems. You'll lead the creation of multimodal pipelines that process visual, audio, and textual data, turning raw inputs into actionable insights. This role demands full ownership—from modeling and infrastructure to deployment and monitoring—with a strong emphasis on real-world reliability, not just experimental accuracy.
What You’ll Do
- Design and implement end-to-end video analysis systems that integrate vision, speech, and language understanding
- Extract nuanced signals such as sentiment, intent, and semantic meaning from complex, multimodal inputs
- Deploy and maintain self-hosted LLMs in secure, production-grade environments
- Optimize inference pipelines for performance, cost, latency, and scalability
- Ensure AI systems remain stable, secure, and efficient under real-world conditions
- Mentor engineers, lead technical discussions, and elevate team-wide engineering practices
- Collaborate with Product to define roadmaps and translate concepts like 'video listening' into working features
- Conduct a deep technical review of existing video capabilities and deliver improvements within the first 90 days
What We’re Looking For
- 5+ years (Senior) or 7+ years (Staff) in machine learning engineering with proven production impact
- Deep expertise in computer vision, video analysis, large language models, and multimodal systems
- Hands-on experience deploying models using Docker and Kubernetes in cloud or on-prem environments
- Track record of owning full AI system lifecycles—from design to monitoring in production
- Experience running LLMs in controlled, self-hosted settings with attention to security and reliability
- A proactive mindset: you clarify ambiguity, validate assumptions, and move projects forward independently
- Exceptional communication skills—able to simplify complex topics and align stakeholders
- Comfort operating in fast-moving, unstructured environments while delivering structured outcomes
Nice-to-Have
- Background in B2B SaaS or customer experience platforms
- Work history at a well-known technology company
- Experience with Arabic language processing, including multimodal NLP or OCR in video
- Knowledge of generative video models or diffusion techniques
- Optimization experience for edge AI deployments
Technology Environment
You’ll work with Docker, Kubernetes, cloud and on-prem infrastructure, computer vision frameworks, video analysis tools, self-hosted LLMs, and multimodal AI systems. The stack emphasizes inference efficiency, scalability, security, and full lifecycle ownership, with day-to-day work spanning sentiment detection, semantic modeling, and performance tuning.
Our Culture
We value ownership, autonomy, and building systems that launch and endure. We communicate with clarity, welcome respectful debate, and maintain high technical standards. Success is measured by deployment and impact—not just experiments. If you thrive in dynamic settings and excel at structuring complexity, this is the environment for you.


