NLP Utils
Homemade
Verb Flexer (a small, fast verb inflector)
STLangId (Short Text Language Identifier: a small, fast language guesser/categorizer suited to short texts)
Off the shelf
LingPipe: An open source Named Entity Recognizer (Java)
LingPipe Basic Kit
FreeLing: An open source lexical/syntactic analyzer and POS tagger (EAGLES based, PAROLE tags) written in C++, for Spanish, Catalan and English. It uses ISO-8859-1 encoding; if your text is UTF-8, run iconv before and after calling FreeLing. (install how-to) (user manual)
FreeLing Basic Kit
TreeTagger: A multilingual POS tagger/lemmatizer (English, German, French, Spanish, …)
MINIPAR: An open source syntactic dependency analyzer (C++)
Minipar Basic Kit - (Mirror download)
WordNet: Lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations.
(Fellbaum book)
Wikipedia: An open source Web Encyclopedia
(How to use) (mirror download)
TextCat: An open source language categorizer (guesser/detector). Fast, written in Perl, about 70 languages, but not useful for similar languages or short texts. (SourceForge)
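
Encoding round trip for FreeLing: the FreeLing entry above recommends converting UTF-8 text to ISO-8859-1 before the call and back afterwards. Here is a minimal Python sketch of that round trip, doing in-process what iconv does on the command line. The names run_with_latin1_tool and fake_tagger are hypothetical, and fake_tagger is only a stand-in for the real analyzer, not FreeLing's API. Characters with no ISO-8859-1 equivalent (for example the euro sign) are replaced with "?" here, which is an assumption about how you want to handle them.

```python
from typing import Callable


def utf8_to_latin1(data: bytes) -> bytes:
    """Roughly what `iconv -f UTF-8 -t ISO-8859-1` does.

    Characters that do not exist in ISO-8859-1 are replaced by b"?".
    """
    return data.decode("utf-8").encode("iso-8859-1", errors="replace")


def latin1_to_utf8(data: bytes) -> bytes:
    """Roughly what `iconv -f ISO-8859-1 -t UTF-8` does."""
    return data.decode("iso-8859-1").encode("utf-8")


def run_with_latin1_tool(utf8_input: bytes,
                         tool: Callable[[bytes], bytes]) -> bytes:
    """Convert to ISO-8859-1, call the tool, convert the result back."""
    return latin1_to_utf8(tool(utf8_to_latin1(utf8_input)))


def fake_tagger(latin1_text: bytes) -> bytes:
    """Hypothetical stand-in for a Latin-1-only analyzer: one token per line."""
    return b"\n".join(latin1_text.split())


if __name__ == "__main__":
    result = run_with_latin1_tool("El niño comió".encode("utf-8"), fake_tagger)
    print(result.decode("utf-8"))
```

In a real setup the tool argument would wrap whatever call or subprocess actually runs the analyzer; only the two conversion steps around it are the point of the sketch.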