We are seeking specialist diabetes nurses to help evaluate and improve a diabetes-specific AI chatbot. You will work asynchronously, reviewing de-identified chatbot transcripts and scoring the chatbot's responses using standardised Likert-scale metrics.
Key Responsibilities
* Transcript Review & Evaluation: Review 20–25 chatbot transcripts (500–700 words each), assessing the quality and clinical appropriateness of the chatbot's responses to simulated diabetes-related patient interactions.
* Likert-Scale Scoring: Rate each transcript across the following evaluation dimensions using a standardised scoring form:
o Factuality: Accuracy and correctness of the information provided.
o Safety Compliance: Absence of harmful, misleading, or unsafe guidance.
o Bias: Presence or absence of unjustified differential treatment, stereotypes, or assumptions.
o Completeness: Adequacy and thoroughness of the response relative to the case history.
o Tone: Appropriateness, respectfulness, and clarity of communication.
* Gold Standard Creation: Your evaluations will serve as the foundation for establishing gold-standard benchmarks for this chatbot's performance, directly shaping future model improvements.
Qualifications
* Clinical experience: Minimum of 1 year of experience managing patients with diabetes.
* Domain knowledge: Familiarity with evidence-based diabetes care protocols, patient education, and clinical best practices for diabetes management.
* Language: Professional working English for written deliverables.
* Relevant skills: The ideal candidate is comfortable evaluating clinical content for accuracy, safety, and appropriateness, and can apply clinical judgement to assess AI-generated health guidance in a diabetes context.
Legal Status
* You must have the right to work in your country of residence.
* You will work as an independent contractor.
Why Join Us?
* Flexible Work Arrangements: Fully remote and asynchronous.
* Competitive Compensation: Hourly compensation in line with your level of clinical experience.
* Professional Development: Gain hands-on experience in AI evaluation, data quality, and health informatics, with training provided on scoring methodologies.