Constrained Conditional Random Fields Tagging Tool




Concraft is a statistical tool for morphosyntactic disambiguation developed as part of the CESAR project. It is based on conditional random fields (CRFs) extended with additional, position-wise restrictions on the output domain. These restrictions enforce consistency between the modeled label sequences and the results of morphosyntactic analysis, both during decoding and, more importantly, during parameter estimation. Morphosyntactic disambiguation is decomposed into two consecutive stages: context-sensitive morphosyntactic guessing, followed by disambiguation proper. The tool is currently adapted to the Polish language and its resources, but the method and the library should be applicable at least to other highly inflected languages.
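
The idea of position-wise restrictions can be illustrated with a minimal sketch. This is not Concraft's implementation or API (Concraft is written in Haskell); it is an illustrative first-order linear-chain model in Python, and every name in it (`constrained_viterbi`, `constrained_log_partition`, `allowed`, `emit`, `trans`, the toy labels and scores) is hypothetical. It assumes that each position comes with a non-empty set of labels permitted by morphosyntactic analysis, and shows that both decoding and the normalization term used in parameter estimation range over those sets only.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Set, Tuple

Label = str
Emit = Callable[[int, Label], float]       # score of a label at a position
Trans = Callable[[Label, Label], float]    # score of a (previous, current) pair


def log_sum_exp(xs: List[float]) -> float:
    """Numerically stable log(sum(exp(x))) for a non-empty list."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def constrained_viterbi(allowed: Sequence[Set[Label]], emit: Emit, trans: Trans) -> List[Label]:
    """Best label sequence in which position i only uses labels from allowed[i]."""
    if not allowed:
        return []
    # best[i][y] = (score of the best path ending in y at position i, previous label)
    best: List[Dict[Label, Tuple[float, Optional[Label]]]] = [
        {y: (emit(0, y), None) for y in allowed[0]}
    ]
    for i in range(1, len(allowed)):
        column: Dict[Label, Tuple[float, Optional[Label]]] = {}
        for y in allowed[i]:
            score, prev = max((best[i - 1][p][0] + trans(p, y), p) for p in best[i - 1])
            column[y] = (score + emit(i, y), prev)
        best.append(column)
    # Backtrack from the best final label.
    path = [max(best[-1], key=lambda y: best[-1][y][0])]
    for i in range(len(allowed) - 1, 0, -1):
        path.append(best[i][path[-1]][1])
    path.reverse()
    return path


def constrained_log_partition(allowed: Sequence[Set[Label]], emit: Emit, trans: Trans) -> float:
    """Forward algorithm: log of the summed scores of all *allowed* sequences."""
    alpha = {y: emit(0, y) for y in allowed[0]}
    for i in range(1, len(allowed)):
        alpha = {
            y: emit(i, y) + log_sum_exp([alpha[p] + trans(p, y) for p in alpha])
            for y in allowed[i]
        }
    return log_sum_exp(list(alpha.values()))


# Toy data (hypothetical): three positions, each with its own permitted label set.
allowed = [{"N", "A"}, {"N", "V"}, {"V"}]
scores = [
    {"N": 0.0, "V": 2.0, "A": 1.0},
    {"N": 1.0, "V": 0.5, "A": 3.0},
    {"N": 0.2, "V": 0.1, "A": 0.0},
]


def emit(i: int, y: Label) -> float:
    return scores[i][y]


def trans(p: Label, y: Label) -> float:
    return 0.3 if p == y else 0.0  # small bonus for repeating a label


path = constrained_viterbi(allowed, emit, trans)
log_z = constrained_log_partition(allowed, emit, trans)
```

In this toy example the highest raw scores at the first two positions belong to labels outside the permitted sets ("V" and "A" respectively), so an unconstrained decoder would pick them, while the constrained one cannot. Restricting the forward pass in the same way means that probability mass is normalized over permitted sequences only, which is the sense in which the restrictions can enter parameter estimation and not just decoding.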

  • The Glasgow Haskell Compiler (version 7.0.4 or higher)