Modules

Reference Corpus of Slovene and Slovene Lexical Database, with Grammatical Annotation Tool

The goal of this module is to compile:

a reference corpus of 100 million words, which will include a spoken subcorpus;

a Slovene lexical database, which will contain information on lexical features such as frequency of occurrence, pronunciation, morphology and syntax, sense discrimination, and phraseology;

a grammatical annotation tool, which will consist of a lexicon of inflected forms, as well as a tagger and a parser for Slovene text analysis.