Freelance agent evaluation engineer

Manchester
Freelance
UK
Engineer
€30,855 - €33,000 a year
Posted: 28 April
Offer description

Job Overview

Mindrift connects specialists with project‑based AI opportunities for leading tech companies, focused on testing, evaluating, and improving AI systems. Participation is project‑based, not permanent employment.


Responsibilities

You’ll develop a dataset to evaluate AI coding agents by creating challenging tasks and evaluation criteria within realistic simulated environments:

* Build virtual companies following a high‑level plan – codebase, infrastructure, and context (conversations, documentation, tickets) that form a realistic environment with development history.
* Assemble and calibrate tasks from intermediate states of the virtual company: craft the prompt, define evaluation criteria, and ensure the task is solvable and the evaluation is fair.
* Design tasks set in isolated environments – emulations of a developer’s workstation: a Linux machine with development tools (terminal, CLI), MCP servers (repository, task tracker, messenger, documentation, etc.), and a real web application codebase.
* Write tests that accept all correct solutions and reject incorrect ones – neither too strict (breaking on valid approaches) nor too lenient (passing bad ones).
* Iterate with an AI agent on tests – verifying they catch real problems, don’t miss bad solutions, and don’t break on good ones.
* Review code written by agents, analyze why an agent failed or succeeded, and design edge cases and adversarial scenarios.
* Iterate based on feedback from expert QA reviewers who score your work on quality criteria.


What this is NOT

* Data labeling
* Prompt engineering
* Writing code from scratch – the agent writes most of the code; you guide and evaluate.


Qualifications

* Degree in Computer Science, Software Engineering, or a related field.
* 5+ years in software development, primarily Python (FastAPI, pytest, async/await, subprocess, file operations).
* Background in full‑stack development, with experience building React‑based interfaces (JavaScript/TypeScript) and robust back‑end systems.
* Experience writing tests (functional, integration), not just running them.
* Docker containerization and familiarity with infrastructure tools (Postgres, Kafka, Redis).
* CI/CD understanding (GitHub Actions as a user: triggers, labels, reading results).
* English proficiency – B2.
* Comfortable reading and reasoning about code across the stack; expertise in every area is not required.


Compensation

On this project, contributors can earn up to $50 per hour equivalent, depending on their level and pace of contribution. Compensation varies across projects based on scope, complexity, and required expertise.

© 2026 Jobijoba - All Rights Reserved