Remote (Global)

LILT is hiring an AI Benchmark Engineer - Native Language Specialist | Czech

LILT is seeking an AI Benchmark Engineer - Native Language Specialist with native fluency in Czech. In this role, you will design, build, and validate multilingual software benchmarks for large language models, creating high-signal tasks that test a model's ability to handle multilingual environments without relying on English translation.

What You'll Do

  • Task Engineering: Evaluate the performance of coding agents.
  • Asset Creation: Build realistic task environments using datasets and files in your native language.
  • Prompting & Translation: Identify failure points where AI does not work correctly in your native language.
  • Implementation & Verification: Support the development of robust solutions and write highly reliable, deterministic verifier scripts.
  • Calibration & Execution: Analyze execution logs and calibrate task difficulty using standard Terminal-Bench run configurations against various model tiers.
  • Quality Assurance: Participate in a rigorous, 4-layer human quality control process alongside automated LLM-based checks to ensure fairness, grammatical accuracy, and benchmark integrity.

What We're Looking For

  • 5+ years of industry experience in software engineering.
  • Proven track record at leading technology companies and/or graduation from top-tier engineering universities.
  • Native or near-native fluency in Czech, with a deep understanding of its grammar, register, and phrasing rules, plus high proficiency in English.
  • Strong proficiency in Python, standard shell scripting, and data processing.
  • Extensive experience with Terminal/CLI-based development workflows and a working familiarity with coding agents.
  • Deep technical understanding of multilingual text processing pitfalls, including encoding/decoding robustness, Unicode normalization, locale-dependent conventions, text I/O, toolchain interoperability, and safe string operations.

Technical Stack

  • Python
  • Shell scripting

Work Mode

This is a global position.

LILT is an equal opportunity employer. We extend equal opportunity to all individuals without regard to an individual’s race, religion, color, national origin, ancestry, sex, sexual orientation, gender identity, age, physical or mental disability, medical condition, genetic characteristics, veteran or marital status, pregnancy, or any other classification protected by applicable local, state or federal laws.

Required Skills
Python, Shell Scripting, Machine Learning, Natural Language Processing, AI Benchmarking, Data Analysis, Czech Language, English Language, Quality Evaluation, Statistical Analysis, Large Language Models, AI/ML Systems
About company
LILT
LILT builds multilingual AI and human-verified services that make the world's information available to everyone, regardless of language. The company serves Enterprises, Governments, and AI Developers worldwide.
Job Details
Category: Data
Posted 2 months ago