Join our team and build your career with us
Role: SwarmBench Task Engineer — Knowledge / Research
Geo: Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Indonesia, Kenya, Nigeria, Turkey, Vietnam
Duration: Short-Term Contract (4 weeks)
We are hiring a Research Engineer or Research Scientist with 5+ years of academic or industry research experience to build and evaluate multi-agent AI benchmark tasks.
Required Qualifications:
Ability to analyze large document collections and extract structured insights
Ability to create JSON-based ground-truth oracles and validation schemas
Ability to design LLM judge prompts for response evaluation
Strong Python scripting and data-processing skills
Experience with AI benchmarks such as SWE-bench and Terminal-bench
Hands-on experience with Docker, Dockerfiles, and container debugging
Strong attention to detail and analytical thinking
NLP, information extraction, dataset curation, or LLM evaluation experience is a plus
Experience with literature reviews or with scientific, legal, or medical document analysis is preferred