Speech-to-text for Breton

Research output: Contribution to conference › Paper › peer-review

Standard

Speech-to-text for Breton. / Vangberg, Preben; Farhat, Leena.
2023. Paper presented at Celtic Student Conference, Glasgow, United Kingdom.

Harvard

Vangberg, P & Farhat, L 2023, 'Speech-to-text for Breton', Paper presented at Celtic Student Conference, Glasgow, United Kingdom, 30/03/23 - 1/04/23.

APA

Vangberg, P., & Farhat, L. (2023). Speech-to-text for Breton. Paper presented at Celtic Student Conference, Glasgow, United Kingdom.

CBE

Vangberg P, Farhat L. 2023. Speech-to-text for Breton. Paper presented at Celtic Student Conference, Glasgow, United Kingdom.

MLA

Vangberg, Preben, and Leena Farhat. Speech-to-text for Breton. Celtic Student Conference, 30 Mar 2023, Glasgow, United Kingdom, Paper, 2023.

Vancouver

Vangberg P, Farhat L. Speech-to-text for Breton. 2023. Paper presented at Celtic Student Conference, Glasgow, United Kingdom.

Author

Vangberg, Preben; Farhat, Leena. / Speech-to-text for Breton. Paper presented at Celtic Student Conference, Glasgow, United Kingdom.

RIS

TY - CONF

T1 - Speech-to-text for Breton

AU - Vangberg, Preben

AU - Farhat, Leena

N1 - Conference code: 10

PY - 2023/3

Y1 - 2023/3

AB - Technology is a vital part of language revitalisation and conservation. While certain languages have usable speech-to-text (STT) models, this is not the case for most Celtic languages, including Breton. Audio transcription plays a crucial role in improving linguistic accessibility and online presence, and is used in a variety of fields. This paper seeks to train acoustic models with three open-source STT toolkits, using publicly available speech and text corpora, to investigate how these toolkits can be used to create efficient STT models for Breton. Because few resources are available for Breton, optimising several toolkits could give greater insight into how each performs with limited resources. Here, acoustic models are the base models that convert spoken words into text. In addition, language models will be used to improve the accuracy of the acoustic models. The aim is to use this insight to produce and distribute usable and efficient STT models for Breton and to make this technology accessible to the community.

M3 - Paper

T2 - Celtic Student Conference

Y2 - 30 March 2023 through 1 April 2023

ER -