Speech-to-text for Breton

Preben Vangberg; Leena Farhat

Research output: Contribution to conference › Paper › peer-review

Technology is a vital part of language revitalisation and conservation. While some languages have usable speech-to-text (STT) models, this is not the case for most Celtic languages, including Breton. Audio transcription plays a crucial role in improving linguistic accessibility and online presence, and it is used in a variety of fields. This paper seeks to use three open-source STT toolkits to train acoustic models and to investigate how these can be used to create efficient STT models for Breton, using publicly available speech and text corpora. Acoustic models are, in this case, the base models that convert spoken words into text. Given the limited resources available for Breton, optimising several toolkits could give greater insight into how each toolkit performs with limited resources. In addition, language models will be used to improve the accuracy of the acoustic models. The aim is to use this insight to produce and distribute usable and efficient STT models for Breton and to make this technology accessible to the community.

Original language	English
Publication status	Published - Mar 2023
Event	Celtic Student Conference - University of Glasgow, Glasgow, United Kingdom
Duration	30 Mar 2023 → 1 Apr 2023
Conference number	10

Conference

Conference	Celtic Student Conference
Country/Territory	United Kingdom
City	Glasgow
Period	30/03/23 → 1/04/23
