SUBTLEX-CY: A new word frequency database for Welsh
Research output: Contribution to journal › Article › peer-review
Electronic versions
Documents
- van-heuven-et-al-2023-subtlex-cy-a-new-word-frequency-database-for-welsh
Final published version, 2.57 MB, PDF document
Licence: CC BY-NC
We present SUBTLEX-CY, a new word frequency database created from a 32-million-word corpus of Welsh television subtitles. An experiment comprising a lexical decision task evaluated the SUBTLEX-CY frequency estimates using words whose frequencies are inconsistent with those in a much smaller Welsh corpus that is often used by researchers, the Cronfa Electroneg o’r Gymraeg (CEG), and in three other Welsh word frequency databases. Words were selected that were classified as low frequency (LF) in SUBTLEX-CY but high frequency (HF) in CEG, and these were compared with words classified as medium frequency (MF) in both SUBTLEX-CY and CEG. Reaction time analyses showed that the words classified as HF in CEG were responded to more slowly than the MF words, the pattern expected if these words are in fact low frequency, suggesting that the SUBTLEX-CY corpus provides a more reliable estimate of Welsh word frequencies. The new Welsh word frequency database, which also includes part-of-speech, contextual diversity, and other lexical information, is freely available for research purposes on the Open Science Framework repository at https://osf.io/9gkqm/.
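As a rough illustration of the kinds of measures such a database reports, the sketch below computes raw counts, frequency per million words, and contextual diversity (the proportion of documents in which a word occurs) for a toy corpus. The function name `frequency_table`, the whitespace tokenisation, and the three example "subtitle files" are assumptions made for illustration only; they are not taken from the SUBTLEX-CY corpus or its processing pipeline.

```python
from collections import Counter


def frequency_table(documents):
    """Return {word: (count, per_million, contextual_diversity)}.

    `documents` is a non-empty list of strings, one per subtitle file.
    Tokenisation is a naive lowercase whitespace split (an assumption).
    """
    word_counts = Counter()  # total occurrences of each word
    doc_counts = Counter()   # number of documents containing each word
    for text in documents:
        tokens = text.lower().split()
        word_counts.update(tokens)
        doc_counts.update(set(tokens))  # count each word once per document

    total_tokens = sum(word_counts.values())
    n_docs = len(documents)
    return {
        word: (
            count,
            count / total_tokens * 1_000_000,  # frequency per million words
            doc_counts[word] / n_docs,         # contextual diversity
        )
        for word, count in word_counts.items()
    }


# Toy "subtitle files", invented for illustration (not from the corpus).
docs = ["mae y gath yn y ty", "mae y ci yn yr ardd", "bore da"]
table = frequency_table(docs)
count, per_million, diversity = table["y"]
print(count, round(per_million, 2), round(diversity, 3))
```

Normalising to a per-million scale is what allows estimates from corpora of very different sizes, such as a 32-million-word corpus and a much smaller one, to be compared directly.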
Original language | English |
---|---|
Pages (from-to) | 1052–1067 |
Number of pages | 16 |
Journal | Quarterly Journal of Experimental Psychology |
Volume | 77 |
Issue number | 5 |
Early online date | 30 August 2023 |
Digital Object Identifiers (DOIs) | |
Status | Published - May 2024 |