eSCAPE: a Large-scale Synthetic Corpus for Automatic Post-Editing – META-SHARE

eSCAPE

http://hltshare.fbk.eu/QT21/eSCAPE.html

Training of automatic post-editing models on (source, target, reference) triplets
Set of 7,258,533 English-German and 3,357,371 English-Italian triplets (source, target and reference). For each language pair, the data set is available in two versions: one in which the target segments are produced by a neural MT system, and another in which they are produced by a phrase-based MT system. The MT systems used to generate the targets are instances of the open-source ModernMT tool developed by the European project MMT. The original (source, reference) pairs are drawn from a collection of corpora from different domains available in the OPUS repository.
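
The triplet structure described above can be illustrated with a minimal sketch. It assumes, hypothetically, that each version of the corpus is stored as three line-aligned UTF-8 text files (one segment per line); the actual file layout and file names are not stated in this record, so the paths, the Triplet type and the read_triplets helper are illustrative only.

```python
from itertools import zip_longest
from typing import Iterator, NamedTuple


class Triplet(NamedTuple):
    source: str     # original English segment
    target: str     # MT output (neural or phrase-based version)
    reference: str  # reference translation (German or Italian)


def read_triplets(src_path: str, mt_path: str, ref_path: str) -> Iterator[Triplet]:
    """Yield (source, target, reference) triplets from three line-aligned files.

    Assumption: line i of each file belongs to the same triplet.
    """
    with open(src_path, encoding="utf-8") as src, \
         open(mt_path, encoding="utf-8") as mt, \
         open(ref_path, encoding="utf-8") as ref:
        # zip_longest pads the shorter files with None, which lets us
        # detect misaligned files instead of silently truncating.
        for s, m, r in zip_longest(src, mt, ref):
            if s is None or m is None or r is None:
                raise ValueError("files have different numbers of lines")
            yield Triplet(s.rstrip("\n"), m.rstrip("\n"), r.rstrip("\n"))
```

Reading lazily with a generator keeps memory use flat, which matters for a corpus with millions of triplets per language pair.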

Distribution

Availability

Available - Unrestricted Use

Licence

CC - ZERO

Download location: hidden

Distribution Access/Medium: Downloadable

Contact Person

Christian Dugast

text

Bilingual text corpus

Languages

English, Italian

Linguality

Linguality type: Bilingual

Multi-linguality type: Parallel

Size

3,357,371 triplets (source, target and reference)

Bilingual text corpus

Languages

English, German

Linguality

Linguality type: Bilingual

Multi-linguality type: Parallel

Size

7,258,533 triplets (source, target and reference)

Metadata

Created: 03/02/2018

Last Updated: 03/02/2018

Metadata Creator

Usage

Foreseen Use - NLP Applications

Use NLP Specific: Machine Translation

Actual Use - NLP Applications

Use NLP Specific: Machine Translation
