Corpws Lleferydd Paldaruo Fersiwn 5 | Paldaruo Speech Corpus Version 5

Research output: Non-textual form > Data set/Database

Standard

Corpws Lleferydd Paldaruo Fersiwn 5 | Paldaruo Speech Corpus Version 5. Cooper, Sarah (Other); Chan, David (Other); Jones, Dewi (Other). 2018. Prifysgol Bangor University.

RIS

TY - ADVS

T1 - Corpws Lleferydd Paldaruo Fersiwn 5 | Paldaruo Speech Corpus Version 5

A2 - Cooper, Sarah

A2 - Chan, David

A2 - Jones, Dewi

PY - 2018/12/19

Y1 - 2018/12/19

N2 - The Paldaruo Speech Corpus is a read speech corpus designed to develop automatic speech recognition for Welsh. Developing speech recognition requires a large amount of data, so the corpus was crowdsourced via Ap Paldaruo, an app designed to collect audio data from speakers. Using an app to crowdsource data from speakers of Welsh means that speaker variation can be maximised, which is important for accurate speech recognition. Crowdsourcing refers to obtaining data from a large number of people, usually over the internet.

AB - The Paldaruo Speech Corpus is a read speech corpus designed to develop automatic speech recognition for Welsh. Developing speech recognition requires a large amount of data, so the corpus was crowdsourced via Ap Paldaruo, an app designed to collect audio data from speakers. Using an app to crowdsource data from speakers of Welsh means that speaker variation can be maximised, which is important for accurate speech recognition. Crowdsourcing refers to obtaining data from a large number of people, usually over the internet.

M3 - Data set/Database

PB - Prifysgol Bangor University

ER -