Software Engineer, RL Data

Harrow
Anthropic
Software engineer
€300,820.45 a year
Posted: 11 June
Offer description

About Anthropic

Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.


About The Role

Anthropic's RL Data team builds the systems that produce high-quality reinforcement learning data for Claude: data collection pipelines, human feedback tooling, the execution environments RL tasks run in, and the quality assurance that keeps training data trustworthy at scale. Our goal is to make Claude genuinely great at complex, real‑world work and to point those capabilities at the things that matter most, including AI safety research and beneficial deployments of AI. This is a foundational role on a new team: you'll help shape our technical direction and what we build first. The work is hands‑on and varied.


Key Responsibilities

* Own significant parts of our stack end‑to‑end, from technical architecture through the operational work that makes it succeed
* Build data collection pipelines, read the transcripts they produce, and iterate on prompts, evals, and graders until the output is good
* Develop and improve QA frameworks to catch reward hacking and ensure environment quality
* Build interfaces that make collecting human data fast and painless for the people providing it
* Harden execution environments—sandboxing, snapshotting, tool coverage—so tasks hold up at training scale
* Embed with the teams and domain experts who use our systems day‑to‑day: design pipelines and evals with them, support them directly, and ship the improvements they need
* Work with operations, security, and compliance partners to roll our systems out to new users, and manage technical relationships with external data vendors


Minimum Qualifications

* Strong software engineering skills and proficiency in at least one modern programming language—Python and TypeScript are used most often, but we value the ability to learn tools quickly
* Experience designing, building, and running backend systems or infrastructure
* Effective use of AI tools in your own day‑to‑day work
* Willingness to own problems end‑to‑end, including the parts that aren’t engineering
* Proactive, open communication: you can be trusted to run a workstream and to escalate early when something’s off
* Comfort iterating quickly in ambiguous, fast‑changing situations
* Care about the societal impacts of your work


Preferred Qualifications

* Experience building LLM‑powered systems: prompt pipelines, evals, or products with models in the loop
* Experience with reinforcement learning on LLMs: creating environments, rewards, graders, or training data
* Time as a forward‑deployed engineer, founder, or early‑startup engineer—roles where you owned the outcome, not just the code
* Experience shipping user‑facing products, or internal platforms people love: interviewing users, hunting down friction, measurably improving the experience
* Experience building data pipelines or integrations that move, transform, and index data from many sources
* Experience building connectors or integrations with third‑party tools and APIs, such as MCP servers
* Experience with containers, Kubernetes, or simulation infrastructure
* Experience handling sensitive data or working under tight security controls
* Experience working with external data vendors
* Basic familiarity with AI safety or security research


Representative Projects

* Take QA checks that a model has learned to game, and make them hold up under heavy optimization pressure
* Build a review flow that lets a busy expert check an RL task in under five minutes
* Cut the time from “rough task idea” to “QA‑passed RL task” from days to hours
* Spend a week with a team that uses our platform, then ship the fixes that help them most
* Harden a sandboxed environment so tasks behave correctly across millions of rollouts
* Onboard a new data vendor, and fix the rough edges they hit


Compensation

$320,000–$485,000 USD


Logistics

* Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience
* Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience
* Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position
* Location‑based hybrid policy: We expect all staff to be in one of our offices at least 25% of the time. Some roles may require more time in our offices.
* Visa sponsorship: We do sponsor visas, but we cannot sponsor every role. If you receive an offer, we will make every reasonable effort to obtain a visa.