Speech-to-text for Breton

Research output: Contribution to conference, Paper, peer-reviewed

Technology is a vital part of language revitalisation and conservation. While some languages have usable speech-to-text (STT) models, most Celtic languages, including Breton, do not. Audio transcription plays a crucial role in improving linguistic accessibility and online presence, and it is used in a variety of fields. This paper uses three open-source STT toolkits to train acoustic models and investigates how they can be used to create efficient STT models for Breton, drawing on publicly available speech and text corpora. Given the limited resources available for Breton, optimising several toolkits could give greater insight into how each one performs in a low-resource setting. Acoustic models are, in this case, the base models that convert spoken words into text. In addition, language models will be used to improve the accuracy of the acoustic models. The aim is to use this insight to produce and distribute usable and efficient STT models for Breton and to make this technology accessible to the community.
Original language: English
Status: Published - Mar 2023
Event: Celtic Student Conference - University of Glasgow, Glasgow, United Kingdom
Duration: 30 Mar 2023 to 1 Apr 2023
Conference number: 10

Conference

Conference: Celtic Student Conference
Country/Territory: United Kingdom
City: Glasgow
Period: 30/03/23 to 1/04/23