 
        
Crossing Hurdles is a recruitment firm. We refer top candidates to our partners working with the world’s leading AI research labs and fast-growing startups to help build cutting-edge technology.
Role: Sr. Software Engineer – LLM Evaluation & Repository Validation
Type of Role: Short-term Contract
Engagement Duration: 1 month (with possible extensions)
Commitment: 10–40 hrs/week (flexible, partial PST overlap required)
Start Date: Immediate
Rate Range: $80–$125/hour
Project Overview
We’re building high-quality evaluation and training datasets to improve how Large Language Models (LLMs) handle realistic software engineering tasks. Engineers will work on diverse projects, such as enabling models to traverse complex codebases or building agents that boost model performance.
Role Overview — What Does a Typical Day Look Like?
 * Work across multiple projects aimed at improving LLM performance on code.
 * Lead and deliver end-to-end agent use cases (e.g., home automation agents, coding copilots, creative design assistants).
 * Collaborate with the team to identify edge cases and ambiguities in model behavior.
 * Review and compare 3–4 model-generated code responses per task using a structured ranking system.
 * Evaluate code diffs for correctness, quality, style, and efficiency.
 * Provide clear, structured rationales for evaluation decisions.
Required Skills & Experience
 * Several years of software engineering experience, including at least one continuous year at a top-tier product company (e.g., Google, Stripe, Amazon, Apple, Meta, Microsoft, Datadog, Dropbox, Shopify, PayPal, IBM Research).
 * Strong expertise in full-stack application development and deploying scalable, production-grade software.
 * Deep understanding of software architecture, debugging, and code quality assessment.
 * Proven ability to review code diffs and evaluate correctness, maintainability, and efficiency.
 * Proficiency in backend languages and runtimes (Java, Rust, Go, Node.js, Python, C++) and frontend languages and frameworks (TypeScript, JavaScript, React, Vue, Angular, jQuery).
 * Excellent oral and written communication skills.
P.S. You will hear from us via email and LinkedIn InMail within 1–2 days of applying, so please keep an eye on both.