CINTIL-PropBank

The CINTIL-PropBank (Branco et al., 2012) is a corpus of sentences annotated with constituency structure and semantic role tags. It comprises 10,039 sentences and 110,166 tokens from three sources: news (8,861 sentences; 101,430 tokens), novels (399 sentences; 3,082 tokens), and 779 sentences (5,654 tokens) used for regression testing of the computational grammar that supported the annotation of the corpus.
To create this PropBank, we adopted semi-automatic analysis with double-blind annotation followed by adjudication. The resulting dataset contains three levels of information: phrase constituency, grammatical functions, and phrase semantic roles.
The main motivation for creating this resource was to build a high-quality dataset with semantic information that could support the development of automatic semantic role labelers for Portuguese.

Distribution

Availability

Under Negotiation

Licence

Other

Licensors:

António Branco

Distribution rights holders:

António Branco

IPR Holder

University of Lisbon, Faculty of Sciences

Contact Person

António Branco

text

Monolingual text corpus

Languages

Portuguese (10,140 Sentences)

Linguality

Linguality type: Monolingual

Text Format

text/xml (10,140 Sentences)

Size

110,166 Tokens

10,039 Sentences

Character encoding

UTF-8 (10,140 Sentences)

Domains

Novels (403 Sentences)

News (8,952 Sentences)

Test (785 Sentences)

Modalities

Written Language

Geographic coverage

Portugal (10,140 Sentences)

United States of America (106 Sentences)

Creation

Creation mode: Mixed

Resource Creation

Resource Creator

António Branco

Funding Project

SemanticShare - Resources and Tools for Semantic Processing (SemanticShare - FCT/PTDC/PLP/81157/2006)

URL: http://nlx.di.fc.ul....

Funding Type: National Funds

Funder: FCT - Fundação para a Ciência e Tecnologia

Funding Country: Portugal

Project duration: 06/01/2006 - 12/31/2010

Metadata

Created: 06/01/2012

Last Updated: 12/11/2015

Source: METANET4U

META-SHARE

Metadata Language: English

Metadata Creator

Catarina Carvalheiro

Version

Version: 1

Last Updated: 06/01/2012

Usage

Foreseen Use - NLP Applications

Use NLP Specific: Parsing, Semantic Role Labelling

Actual Use - NLP Applications

Use NLP Specific: Parsing, Semantic Role Labelling

Documentation

Tool Documentation: Online

Samples Location: http://194.117.45.19...

Document Type: Other

Catarina Carvalheiro, CINTIL PropBank Narrative Description, http://194.117.45.19... , 2012

Document Type: In Proceedings

António Branco; Catarina Carvalheiro; Sílvia Pereira; Mariana Avelãs; Clara Pinto; Sara Silveira; Francisco Costa; João Silva; Sérgio Castro, A PropBank for Portuguese: the CINTIL-PropBank, http://www.di.fc.ul.... , Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Document Language: English
