
Annotator combining different NLP pipelines


Automated annotation of natural languages using selected toolchains

License: MIT

This project has just had its first release and is still under development.

Description

The nlpannotator package serves as a modular toolchain that combines different natural language processing (NLP) tools to annotate texts (sentencizing, tokenization, part-of-speech (POS) tagging, and lemmatization).

Options

All input options are provided in an input dictionary. Two pre-set toolchains can be used: fast, which uses spaCy for all annotations; and accurate, which uses SoMaJo for sentencizing and tokenization, and stanza for POS and lemma. A third option, manual, allows any combination of spaCy, stanza, SoMaJo, Flair, and Treetagger, provided the tool supports the selected annotation and language.
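The mapping from toolchain option to tools described above can be sketched in plain Python. This is a minimal illustration that does not use the nlpannotator API: the key names ("processing_option", "tools"), the step names, and the function resolve_toolchain are hypothetical and chosen only for this example. It also does not check whether a tool supports a given annotation or language.

```python
# Hypothetical sketch: key names, step names, and this function are
# illustrative assumptions, not the documented nlpannotator interface.
PROCESSING_STEPS = ("sentencize", "tokenize", "pos", "lemma")
KNOWN_TOOLS = {"spacy", "stanza", "somajo", "flair", "treetagger"}

# The two pre-set toolchains as described in the text above.
PRESETS = {
    "fast": {step: "spacy" for step in PROCESSING_STEPS},
    "accurate": {
        "sentencize": "somajo",
        "tokenize": "somajo",
        "pos": "stanza",
        "lemma": "stanza",
    },
}


def resolve_toolchain(options):
    """Return a dict mapping each annotation step to the tool that runs it."""
    mode = options.get("processing_option", "fast")
    if mode in PRESETS:
        return dict(PRESETS[mode])
    if mode != "manual":
        raise ValueError(f"unknown processing option: {mode!r}")

    # Manual mode: the caller picks a tool for each step they want.
    tools = options.get("tools", {})
    for step, tool in tools.items():
        if step not in PROCESSING_STEPS:
            raise ValueError(f"unknown annotation step: {step!r}")
        if tool not in KNOWN_TOOLS:
            raise ValueError(f"unknown tool: {tool!r}")
    return dict(tools)


if __name__ == "__main__":
    print(resolve_toolchain({"processing_option": "accurate"}))
    print(resolve_toolchain({
        "processing_option": "manual",
        "tools": {"tokenize": "somajo", "pos": "flair"},
    }))
```

The actual keys accepted by the package's input dictionary may differ; the DemoNotebook is the place to check them.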

Installation

Install the project and its dependencies from PyPI:

pip install nlpannotator

The language models need to be installed separately. You can use the project's convenience script, which installs all language models for all languages that have been implemented for spaCy and stanza.
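If you prefer to install only selected models by hand, the following is a rough sketch of what that could look like. It is not the project's convenience script. The choice of the small spaCy models en_core_web_sm and de_core_news_sm is an assumption made for illustration, as are the helper names model_install_commands and install_models; which models nlpannotator actually expects should be checked against the project itself.

```python
import subprocess
import sys

# Assumed model choice for illustration; nlpannotator may expect other models.
SPACY_MODELS = {"en": "en_core_web_sm", "de": "de_core_news_sm"}


def model_install_commands(languages):
    """Build the download commands for spaCy and stanza, without running them."""
    commands = []
    for lang in languages:
        if lang not in SPACY_MODELS:
            raise ValueError(f"no spaCy model listed for language {lang!r}")
        # spaCy models are downloaded through its command-line interface.
        commands.append(
            [sys.executable, "-m", "spacy", "download", SPACY_MODELS[lang]]
        )
        # stanza models are downloaded through stanza.download(<language code>).
        commands.append(
            [sys.executable, "-c", f"import stanza; stanza.download({lang!r})"]
        )
    return commands


def install_models(languages):
    """Run the download commands; needs spaCy, stanza, and network access."""
    for command in model_install_commands(languages):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them.
    for command in model_install_commands(["en", "de"]):
        print(" ".join(command))
```

Calling install_models(["en", "de"]) would perform the downloads, which can take a while and require a few hundred megabytes of disk space.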

Usage

Take a look at the DemoNotebook or run it on Binder.


