Large Language Model Evaluation and Training Datasets
We are building a comprehensive dataset to evaluate and train large language models (LLMs). Our goal is to create verifiable software development tasks derived from public repository histories, generated synthetically with human-in-the-loop review. This role involves working closely with researchers to expand dataset coverage to new types of tasks.
About the Role
We are seeking an experienced software engineer to contribute to this project. The ideal candidate has strong experience with software development, GitHub, and related technologies, and will work closely with researchers to identify repositories and issues that are challenging for LLMs.
Responsibilities
* Work with high-quality public GitHub repositories.
* Contribute to the development of the project, including software engineering, automation, and quality assurance.
* Collaborate with researchers to identify repositories and issues that are challenging for LLMs.
Requirements
* Strong experience with software development, GitHub, and related technologies.
* Experience working with well-maintained, widely used repositories (500+ stars).
* Proficiency with Git, Docker, and basic software pipeline setup.
* Ability to understand and navigate complex codebases.
* Comfort running, modifying, and testing real-world projects locally.