Lists of Words Corpus (UHLCS)

Sanaluettelokorpus (UHLCS)


The corpus is available in Kielipankki - the Language Bank of Finland (see its access rights instructions).

The word lists, hosted on the University of Helsinki Language Corpus Server (UHLCS), were generated from corpora of the following languages:

* Dutch: 178,430 words, 1,998,881 characters
* Finnish: proper names: 714 words, 4,488 characters; general list of words: 264,654 words, 3,171,148 characters
* French: 138,257 words, 1,524,757 characters
* German: 160,086 words, 2,060,734 characters
* Italian: 60,453 words, 561,982 characters
* Norwegian: 61,843 words, 589,234 characters
* Swedish: 13,328 words, 117,685 characters

Document type: words in alphabetical order.
Character encoding: ASCII.
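
As a rough illustration of working with a list in this format, the sketch below reads an alphabetically ordered word list, counts its words and characters, and looks a word up by binary search. It assumes one word per line, which the description does not state; the sample words and all function names are hypothetical. Whether the published character counts include line breaks is also not stated, so the count here simply excludes them.

```python
import bisect
import io


def load_word_list(stream):
    """Read one word per line (an assumed layout), skipping blank lines."""
    return [line.strip() for line in stream if line.strip()]


def summarize(words):
    """Return (word count, total characters excluding line breaks)."""
    return len(words), sum(len(w) for w in words)


def contains(words, word):
    """Binary search; valid only if `words` is sorted by code point."""
    i = bisect.bisect_left(words, word)
    return i < len(words) and words[i] == word


# Tiny in-memory stand-in for a real file; the sample words are illustrative.
sample = io.StringIO("apple\nbanana\ncherry\n")
words = load_word_list(sample)
word_count, char_count = summarize(words)

# Check the ordering before relying on binary search.
is_sorted = all(a <= b for a, b in zip(words, words[1:]))
```

For a real list, the in-memory sample could be replaced with `open(path, encoding="ascii")`, where `path` is whatever location the file has after download. Checking `is_sorted` first matters because the alphabetical order used in a given list may not match plain code-point order.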

The lists of words were compiled at the University of Helsinki, Department of General Linguistics. The Lists of Words Corpus is a part of the UHLCS corpus collection.

UHLCS has many different IPR holders. Should you have any questions regarding the collection, please contact Pirkko Suihkonen.

License details:

Detailed information:

The purpose for which the resource will be used must be outlined in a research plan.
