Libraries, frameworks and tools useful for developing applications.
Platforms and toolkits
- Haxe-linguistics – Early linguistic analysis and natural language processing library for Haxe.
- Natural – General natural language tools for Node.js.
- Natural Language Toolkit (NLTK) – A leading platform for building Python programs to work with human language data.
- Snowball – A language in which stemming algorithms can be easily represented.
- spaCy – Industrial-strength Natural Language Processing in Python.
- UralicNLP – An open source Python library for processing morphologically rich and, for the most part, endangered Uralic languages. It can do morphological analysis, generation, lemmatization, disambiguation and lexical lookup for a great many Uralic languages.
- Stemming algorithms for various European languages – Various stemming algorithms from Snowball.
- The Porter Stemmer Algorithm – The ‘official’ home page for distribution of the Porter Stemming Algorithm, written and maintained by its author, Martin Porter.
- EuroRomCom Data – JSON formatted Pan-Romance word lists.
- How To Label Data – Guide on managing large scale linguistic annotation projects.
- Low Resource Languages – A list of resources for conservation, development, and documentation of low resource (human) languages.
- Bag of words model
- Document classification
- Language models
- Naive Bayes classification
- Natural language processing
- Outline of natural language processing
- Part-of-speech tagging
- Sentiment analysis
- Term frequency – inverse document frequency
- Vector space model
- Computational Linguistics Lecture Playlist (YouTube) – Lectures from a University of Maryland class on computational linguistics.
- The Virtual Linguistics Campus – CC-licensed educational videos interconnected with Marburg University’s e-learning platform of the same name.
Some of the more interesting and complete books.
- Essentials of Linguistics – An introductory book.
- Introduction to Linguistics
- Natural Language Processing with Python – The book that accompanies the NLTK package.
- Text Mining with R
- Foundations of Computational Linguistics
- Foundations of Statistical Natural Language Processing
- Semisupervised Learning for Computational Linguistics
- Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition
- The Oxford Handbook of Computational Linguistics