Role
The role holder will form part of the small team of researchers working in Lancaster and York on the “Developing Y-ACCDIST’s robustness for use on Levantine Arabic dialects” CELIA subproject. We are looking for someone who can support the development of a robust workflow for automatic speech recognition (ASR) on audio recordings of dialectal Arabic speech. The resulting transcripts will be used to train and test a system that provides semi-automatic language indications for Arabic language variants. This task entails experimenting with alternative romanised transcription systems in addition to Arabic script, developing processes for converting transcripts between transcription systems, and developing protocols for comparative evaluation of the accuracy of different types of transcript. The subproject will work with specific, identified sources of gold-standard training data in Arabic dialects.
The ideal role holder will have experience of development and/or use of ASR workflows for dialectal Arabic speech, as well as the technical expertise to develop, document and evaluate aspects of this type of workflow. The role holder will also have a high level of proficiency in Arabic, ideally including one or more Levantine Arabic dialects, as well as an excellent command of English.
Key duties and responsibilities will include:
1. To support the development, documentation and evaluation of a novel workflow to provide semi-automatic language indications for Arabic language variants, in an end-use context which requires use of gold-standard training data.
2. To conduct individual and collaborative research.
3. To develop and initiate collaborative working internally and externally.
4. To undertake appropriate organisational and administrative activities connected to the project.
Skills, Experience & Qualifications needed
5. PhD in Linguistics or Computer Science or another related field, or equivalent experience
6. Knowledge of the linguistic and computational underpinnings of speech technology
7. Ability to develop solutions for an ASR workflow in appropriate coding languages, e.g. Python
8. High level of proficiency in English and Arabic, ideally including one or more Levantine Arabic dialects
9. Strong background knowledge of speech technology and/or natural language processing applications for Arabic, and proven familiarity with the challenges of research in this area, are desirable
10. Experience of working with different systems for orthographic representation or transcription of dialectal Arabic is desirable
Interview date: To be confirmed