Evaluation of Three Welsh Language POS Taggers
Research output: Chapter in Book/Report/Conference proceeding › Chapter › peer-review
Standard
Proceedings of the CLTW 4 @ LREC2022. European Language Resources Association (ELRA), 2022. p. 30-39.
RIS
TY - CHAP
T1 - Evaluation of Three Welsh Language POS Taggers
AU - Prys, Gruff
AU - Watkins, Gareth
PY - 2022/7/25
Y1 - 2022/7/25
N2 - In this paper we describe our quantitative and qualitative evaluation of three Welsh language Part of Speech (POS) taggers. Following an introductory section, we explore some of the issues which face POS taggers, discuss the state of the art in English language tagging, and describe the three Welsh language POS taggers that will be evaluated in this paper, namely WNLT2, CyTag and TagTeg. We then describe the challenges involved in evaluating POS taggers which make use of different tagsets, and introduce our mapping of the taggers’ individual tagsets to an Intermediate Tagset used to facilitate their comparative evaluation. We introduce our benchmarking corpus as an important component of our methodology, before describing how the inconsistencies in text tokenization between the different taggers present an issue when undertaking such evaluations, and discuss the method used to overcome this complication. We proceed to illustrate how we annotated the benchmark corpus, then describe the scoring method used. We provide an in-depth analysis of the results followed by a summary of the work.
AB - In this paper we describe our quantitative and qualitative evaluation of three Welsh language Part of Speech (POS) taggers. Following an introductory section, we explore some of the issues which face POS taggers, discuss the state of the art in English language tagging, and describe the three Welsh language POS taggers that will be evaluated in this paper, namely WNLT2, CyTag and TagTeg. We then describe the challenges involved in evaluating POS taggers which make use of different tagsets, and introduce our mapping of the taggers’ individual tagsets to an Intermediate Tagset used to facilitate their comparative evaluation. We introduce our benchmarking corpus as an important component of our methodology, before describing how the inconsistencies in text tokenization between the different taggers present an issue when undertaking such evaluations, and discuss the method used to overcome this complication. We proceed to illustrate how we annotated the benchmark corpus, then describe the scoring method used. We provide an in-depth analysis of the results followed by a summary of the work.
M3 - Chapter
SP - 30
EP - 39
BT - Proceedings of the CLTW 4 @ LREC2022
PB - European Language Resources Association (ELRA)
ER -
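The abstract describes mapping each tagger's native tagset onto a shared Intermediate Tagset so that taggers with different label inventories can be scored against one benchmark. A minimal sketch of that idea is below; the tag labels, mapping tables, and function names are illustrative assumptions, not the actual tagsets or code used for WNLT2, CyTag, or TagTeg.

```python
# Sketch of comparative POS-tagger evaluation via an intermediate tagset.
# All tag labels and mappings here are hypothetical examples, not the
# real WNLT2/CyTag/TagTeg tagsets described in the paper.

# Per-tagger mappings from native tags to a shared intermediate tagset.
TO_INTERMEDIATE = {
    "taggerA": {"NN": "NOUN", "VB": "VERB", "JJ": "ADJ"},
    "taggerB": {"E": "NOUN", "B": "VERB", "Ans": "ADJ"},
}

def accuracy(tagger, predicted, gold_intermediate):
    """Map a tagger's native output onto the intermediate tagset and
    score it against gold-standard intermediate tags. Tokens are
    assumed to be already aligned across taggers and benchmark."""
    mapping = TO_INTERMEDIATE[tagger]
    mapped = [mapping.get(tag, "UNK") for tag in predicted]
    correct = sum(p == g for p, g in zip(mapped, gold_intermediate))
    return correct / len(gold_intermediate)

# Example: two of three aligned tokens tagged correctly.
score = accuracy("taggerA", ["NN", "VB", "JJ"], ["NOUN", "VERB", "NOUN"])
print(round(score, 3))
```

Scoring in the shared label space sidesteps direct tag-for-tag comparison between incompatible tagsets, though, as the paper notes, tokenization differences between taggers must still be reconciled before alignment.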