Filter by:
Estonian (18)
English (16)
German (14)
Finnish (11)
French (11)
Swedish (11)
Italian (9)
Lithuanian (9)
Latvian (8)
Romanian (8)
Danish (7)
Hungarian (7)
Polish (7)
Portuguese (7)
Russian (7)
Slovenian (7)
Croatian (6)
Czech (6)
Bulgarian (5)
Greek (5)
Norwegian (5)
Slovak (5)
Spanish (5)
Dutch (4)
Maltese (4)
Turkish (4)
Dutch; Flemish (3)
Icelandic (3)
Albanian (2)
Arabic (2)
Chinese (2)
Irish (2)
Japanese (2)
Latin (2)
Northern Sami (2)
Serbian (2)
Tajik (2)
Ukrainian (2)
Uzbek (2)
Armenian (1)
Avaric (1)
Azerbaijani (1)
Basque (1)
Belarussian (1)
Bosnian (1)
Chukchi (1)
Chuvash (1)
Erzya (1)
Even (1)
Evenki (1)
Gaelic (1)
Galician (1)
Georgian (1)
Hebrew (1)
Ingrian (1)
Ingrian Finnish (1)
Kalmyk; Oirat (1)
Kazakh (1)
Khanty (1)
Kildin Sami (1)
Kirghiz; Kyrgyz (1)
Komi Zyrian (1)
Koryak (1)
Kurdish (1)
Lak (1)
Ludian (1)
Macedonian (1)
Malay (1)
Moksha (1)
Mongolian (1)
Nanai (1)
Sign Languages (1)
Slovene (1)
Tabassaran (1)
Tatar (1)
Tundra Nenets (1)
Turkmen (1)
Udmurt (1)
Ume Sami (1)
Multilingual (18)
Bilingual (1)
True (4)
Nlp Applications (7)
Human Use (3)
Text Mining (1)
Resource Type:
Corpus
Lexical/Conceptual
Tool/Service
Language Description
Media Type:
Text
Audio
Image
Video
Text Numerical
Text N-Gram
18 Language Resources
ACCURAT balanced test corpus for under resourced languages
0
277
- Croatian
- English
- Estonian
- German
- Greek
- Latvian
- Lithuanian
- Romanian
- Slovenian
ACCURAT corpus of comparable sentences
0
259
- Croatian
- English
- Estonian
- German
- Greek
- Latvian
- Lithuanian
- Romanian
- Slovenian
ACCURAT corpus of Wikipedia texts
0
271
- Croatian
- English
- Estonian
- German
- Greek
- Latvian
- Lithuanian
- Romanian
- Slovenian
Bulgarian-X language Parallel Corpus
0
263
- Albanian
- Arabic
- Armenian
- Azerbaijani
- Basque
- Bosnian
- Bulgarian
- Catalan; Valencian
- Chinese
- Croatian
- Czech
- Danish
- Dutch
- English
- Estonian
- Finnish
- French
- Galician
- Georgian
- German
- Greek
- Hebrew
- Hungarian
- Icelandic
- Irish
- Italian
- Japanese
- Kazakh
- Kirghiz; Kyrgyz
- Latvian
- Lithuanian
- Macedonian
- Maltese
- Mongolian
- Norwegian
- Polish
- Portuguese
- Romanian
- Russian
- Serbian
- Slovak
- Slovene
- Spanish
- Swedish
- Tajik
- Turkish
- Turkmen
- Ukrainian
ECI/MCI (European Corpus Initiative/Multilingual Corpus I)
0
318
- Albanian
- Bulgarian
- Chinese
- Czech
- Danish
- Dutch
- English
- Estonian
- French
- Gaelic
- German
- Greek, Modern (1453-)
- Italian
- Japanese
- Latin
- Lithuanian
- Malay
- Norwegian
- Portuguese
- Russian
- Serbian
- Spanish
- Swedish
- Turkish
- Uzbek
English-Estonian cross-linked collection of comparable sentences from Wikipedia
0
99
- English
- Estonian
Europarl Parallel Corpus
0
188
- Bulgarian
- Czech
- Danish
- Dutch; Flemish
- English
- Estonian
- Finnish
- French
- German
- Greek, Modern (1453-)
- Hungarian
- Italian
- Latvian
- Lithuanian
- Polish
- Portuguese
- Romanian
- Slovak
- Slovenian
- Spanish
- Swedish
Information in Sign Language on the Tasks of the Parliamentary Ombudsman of Finland
0
154
- English
- Estonian
- Finnish
- French
- German
- Northern Sami
- Sign Languages
- Swedish
JRC-Acquis Multilingual Parallel Corpus
0
172
- Bulgarian
- Czech
- Danish
- Dutch; Flemish
- English
- Estonian
- Finnish
- French
- German
- Greek, Modern (1453-)
- Hungarian
- Italian
- Latvian
- Lithuanian
- Maltese
- Polish
- Portuguese
- Romanian
- Slovak
- Slovenian
- Spanish
- Swedish
Multilingual Resource Collection of the University of Helsinki Language Corpus Server
0
159
- Avaric
- Chukchi
- Chuvash
- Dutch
- English
- Erzya
- Estonian
- Even
- Evenki
- Finnish
- French
- German
- Ingrian
- Italian
- Kalmyk; Oirat
- Khanty
- Kildin Sami
- Komi Zyrian
- Koryak
- Kurdish
- Lak
- Latin
- Ludian
- Moksha
- Nanai
- Northern Sami
- Norwegian
- Ossetian; Ossetic
- Russian
- Swedish
- Tabassaran
- Tajik
- Tatar
- Tundra Nenets
- Udmurt
- Ume Sami
- Uzbek
Opus, Helsinki Korp Version
0
128
- Czech
- Danish
- English
- Estonian
- Finnish
- French
- German
- Greek, Modern (1453-)
- Hungarian
- Italian
- Polish
- Portuguese
- Russian
- Spanish; Castilian
- Swedish
- Turkish
PELCRA multilingual parallel corpora (CC-BY)
0
236
- Arabic
- Belarussian
- Bulgarian
- Croatian
- Czech
- Danish
- Dutch
- English
- Estonian
- Finnish
- French
- German
- Greek
- Hungarian
- Icelandic
- Irish
- Italian
- Latvian
- Lithuanian
- Maltese
- Norwegian
- Polish
- Portuguese
- Romanian
- Russian
- Slovak
- Slovenian
- Spanish
- Swedish
- Turkish
- Ukrainian
The Helsinki Korp Europarl Bilingual Corpora
0
89
- English
- Estonian
- Finnish
- French
- German
- Spanish; Castilian
- Swedish
The Helsinki Korp JRC-Acquis Bilingual Parallel Corpora
0
62
- English
- Estonian
- Finnish
- French
- German
- Hungarian
- Italian
- Polish
- Spanish; Castilian
- Swedish
The Long Second Corpus: LONGitudinal Classroom Data about Children’s Development in Finnish as a SECOND Language
0
99
- English
- Estonian
- Finnish
- Russian
Tilde MODEL - Multilingual Open Data for EU Languages
0
86
- Croatian
- Danish
- Dutch; Flemish
- English
- Estonian
- Finnish
- French
- German
- Greek, Modern (1453-)
- Hungarian
- Icelandic
- Italian
- Latvian
- Lithuanian
- Maltese
- Norwegian
- Polish
- Portuguese
- Romanian
- Slovak
- Slovenian
- Spanish; Castilian
- Swedish