PAROLE Irish Distributable Corpus
Corpus PAROLE irlandais
The PAROLE Irish Distributable Corpus consists of over 8 million words, a subset of the Irish Reference corpus of more than 15 million words.
The text is marked up in accordance with the PAROLE encoding standard, which incorporates the Corpus Encoding Standard (CES) and the Text Encoding Initiative (TEI) Guidelines. All files are in SGML format, with a detailed header and a body tagged to paragraph level. The header includes information such as title, author(s), number of words, ownership and publication details, as well as standard codes for the Medium, Topic and Genre categories.
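As a rough illustration of the header-plus-paragraphs layout described above, the following Python sketch reads header fields and paragraph text from an SGML-like file using only the standard library. The element names (doc, header, doctitle, docauthor, wordcount, p) and the sample content are hypothetical placeholders, not the actual PAROLE/CES tag set, which should be taken from the corpus documentation.

```python
from html.parser import HTMLParser

# Hypothetical sample file: element names and content are illustrative only
# and are NOT taken from the PAROLE corpus.
SAMPLE = """<doc>
<header>
<doctitle>Sample Title</doctitle>
<docauthor>A. N. Author</docauthor>
<wordcount>42</wordcount>
</header>
<body>
<p>First paragraph.</p>
<p>Second paragraph.</p>
</body>
</doc>"""


class CorpusFileParser(HTMLParser):
    """Collects header fields and paragraph text from SGML-like markup."""

    def __init__(self):
        super().__init__()
        self.stack = []       # currently open elements
        self.header = {}      # header element name -> text
        self.paragraphs = []  # text of each <p> element in the body

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        if tag == "p":
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        # Pop back to the matching open element, tolerating omitted end tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text or not self.stack:
            return
        current = self.stack[-1]
        if "header" in self.stack and current != "header":
            self.header[current] = self.header.get(current, "") + text
        elif current == "p":
            self.paragraphs[-1] += text


def parse_corpus_file(sgml_text):
    """Return (header dict, list of paragraph strings) for one file."""
    parser = CorpusFileParser()
    parser.feed(sgml_text)
    parser.close()
    return parser.header, parser.paragraphs


header, paragraphs = parse_corpus_file(SAMPLE)
```

The HTML parser is used here only because it tolerates SGML-style markup without a DTD; real corpus files that rely on SGML features such as entity declarations or tag minimisation would likely need a proper SGML toolchain instead.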
A subset of the Distributable Corpus is morpho-syntactically tagged.
This distribution includes approximately 3,000 manually checked words.