Dawn Knight, Cardiff University (project PI, Principal Investigator) 

Dr. Dawn Knight is a Reader in Applied Linguistics at Cardiff University, UK. She was the Principal Investigator (PI) of the CorCenCC (National Corpus of Contemporary Welsh) project and is the Co-Principal Investigator of the Interactional Variation Online project (https://ivohub.com). Dawn has expertise in corpus linguistics, discourse analysis, digital interaction and non-verbal communication, and is a former Chair of the British Association for Applied Linguistics (BAAL). Dawn is the PI of the FreeTxt/TestunRhydd project.

Paul Rayson, Lancaster University (project CI, Co-Investigator)

Professor Paul Rayson works in the School of Computing and Communications at Lancaster University, and is Director of the UCREL interdisciplinary research centre, which carries out research in corpus linguistics and natural language processing (NLP). A long-term focus of his work is semantic multilingual NLP in extreme circumstances where language is noisy, e.g. in historical, learner, speech, email, txt and other CMC varieties.

Mahmoud El-Haj, Lancaster University (project CI, Co-Investigator)

Dr. Mahmoud El-Haj, also known as Mo, is an NLP Lecturer in Computer Science at the School of Computing and Communications at Lancaster University. Mo received his PhD in Computer Science from the University of Essex, where he worked on multi-document summarization. His work focuses mainly on summarization, information extraction, financial NLP and multilingual NLP, and has been applied to many languages including English, Arabic, Spanish, Portuguese and Welsh. He has an interest in under-resourced languages and building NLP datasets.

Ignatius Ezeani, Lancaster University (project RA, Research Associate)

Dr Ignatius Ezeani is a Senior Teaching/Research Associate at Lancaster University. He is interested in applying NLP techniques to build resources for low-resource languages, including Igbo and Welsh. He works on the efficient adaptation of existing NLP tools and techniques to create task-oriented systems for low-resource languages.

Steve Morris, Cardiff University (Senior Research Associate)

Steve Morris is an Honorary Research Fellow in Applied Linguistics at Swansea University, where he previously worked as an Associate Professor in Applied Linguistics and Welsh. Together with Dr Dawn Knight and Professor Tess Fitzpatrick, he was a co-creator of the CorCenCC (National Corpus of Contemporary Welsh) project, on which he was also a Co-Investigator. The interdisciplinary interface between Applied Linguistics and the Welsh language continues to be the prime focus of his work.

Project Advisory Group

  • National Trust Wales
  • Cadw
  • National Museum Wales
  • Emyr Davies, CBAC | WJEC
  • Efa Gruffudd Jones, Chief Executive, National Centre for Learning Welsh

Funding Acknowledgement

This project, which runs from 2022 to 2023, is funded by the AHRC.