Stanza is a Python natural language analysis library developed by the Stanford Natural Language Processing (NLP) Group. It is a collection of NLP tools for building neural network pipelines for text analysis. Stanza supports tokenization, multi-word token (MWT) expansion, lemmatization, part-of-speech (POS) and morphological feature tagging, dependency parsing, named entity recognition (NER), and sentiment analysis. Using the Universal Dependencies formalism, Stanza provides consistent grammatical annotations across more than 60 human languages.
Key Stanza features
- Native Python implementation with minimal setup requirements
- Complete neural network pipeline for robust text analytics, including tokenization, multi-word token (MWT) expansion, lemmatization, part-of-speech (POS) and morphological feature tagging, dependency parsing, and named entity recognition
- Neural models supporting 66 (human) languages
- A stable, officially maintained Python interface to CoreNLP
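
To make the pipeline features above concrete, here is a minimal sketch of running Stanza on an English sentence and reading the annotations back. The helper names `word_rows`, `entity_rows`, and `run_demo` are hypothetical and exist only for illustration. The sketch assumes that Stanza is installed (`pip install stanza`) and that the default English models can be downloaded on first use.

```python
def word_rows(doc):
    """Flatten an annotated document into (text, lemma, upos, head, deprel) tuples.

    `doc` is expected to look like a Stanza Document: it exposes `sentences`,
    and each sentence exposes `words` carrying these attributes.
    """
    return [
        (word.text, word.lemma, word.upos, word.head, word.deprel)
        for sentence in doc.sentences
        for word in sentence.words
    ]


def entity_rows(doc):
    """Collect (text, type) pairs for the named entities found in the document."""
    return [(ent.text, ent.type) for ent in doc.ents]


def run_demo(text="Stanford University is located in California."):
    """Hypothetical demo: annotate `text` with the default English pipeline."""
    import stanza  # imported here so the helpers above work without Stanza installed

    stanza.download("en")        # fetches the default English models (network needed)
    nlp = stanza.Pipeline("en")  # loads the default processors for English
    doc = nlp(text)

    for row in word_rows(doc):
        print(row)               # one tuple per syntactic word
    for row in entity_rows(doc):
        print(row)               # one tuple per named entity
    return doc
```

Calling `run_demo()` prints one tuple per word, then one per entity. The `head` value is the 1-based index of the word's governor within its sentence, with 0 marking the root. The exact tags and entities depend on the model version that is downloaded.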