The CINTIL-WordSenses corpus, built upon the CINTIL International Corpus of Portuguese (Barreto et al., 2006), is composed of 23,825 sentences of written Portuguese in which open-class terms have been manually disambiguated and annotated with synset identifiers from the Portuguese MultiWordNet (MWNPT) (Pianta et al., 2002). Of its 508,717 tokens, 193,443 are open-class (potentially ambiguous) terms, and 45,502 have been annotated with synset identifiers.
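The proportions implied by these counts can be worked out with a short calculation. The sketch below is only illustrative: the variable and function names are hypothetical, and it assumes that the 45,502 annotated tokens are a subset of the 193,443 open-class tokens, which the description suggests but does not state outright.

```python
# Token counts as reported in the corpus description.
TOTAL_TOKENS = 508_717
OPEN_CLASS_TOKENS = 193_443
ANNOTATED_TOKENS = 45_502


def share(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Share of all tokens that are open-class terms.
open_class_share = share(OPEN_CLASS_TOKENS, TOTAL_TOKENS)
# Share of open-class terms carrying a synset annotation
# (assumption: every annotated token is an open-class token).
annotated_of_open_class = share(ANNOTATED_TOKENS, OPEN_CLASS_TOKENS)
# Share of all tokens carrying a synset annotation.
annotated_of_all = share(ANNOTATED_TOKENS, TOTAL_TOKENS)

print(f"Open-class share of all tokens: {open_class_share:.1f}%")
print(f"Annotated share of open-class:  {annotated_of_open_class:.1f}%")
print(f"Annotated share of all tokens:  {annotated_of_all:.1f}%")
```

Under that assumption, roughly 38.0% of tokens are open-class, about 23.5% of open-class terms are annotated, and annotated tokens make up about 8.9% of the corpus.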

The development of the CINTIL-WordSenses corpus has been funded by the EU project QTLeap (EC/FP7/610516) and the Portuguese project DP4LT (PTDC/EEI-
