Annotator combining different NLP pipelines

Project description

Automated annotation of natural languages using selected toolchains

License: MIT

This project just had its first version release and is still under development.

Description

The nlpannotator package serves as a modular toolchain that combines different natural language processing (NLP) tools to annotate texts: sentencizing, tokenization, part-of-speech (POS) tagging, and lemmatization.

Options

All input options are provided in an input dictionary. Two pre-set toolchains are available: fast, which uses spaCy for all annotations, and accurate, which uses SoMaJo for sentencizing and tokenization and stanza for POS and lemma. A third option, manual, allows any combination of spaCy, stanza, SoMaJo, Flair, and TreeTagger, provided the chosen tool supports the selected annotation and language.
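The toolchain options described above can be pictured as a mapping from each annotation task to the tool that handles it. The following is a minimal sketch of that idea only. The key names ("toolchain", "tools"), the task identifiers, and the resolve_toolchain helper are hypothetical and are not taken from nlpannotator's documented options; check the DemoNotebook for the actual dictionary layout.

```python
# Hypothetical sketch: key names and tool identifiers are illustrative
# assumptions, not nlpannotator's documented input options.
TASKS = ("sentencize", "tokenize", "pos", "lemma")

# The two pre-set toolchains as described in the text above.
PRESETS = {
    "fast": {task: "spacy" for task in TASKS},
    "accurate": {
        "sentencize": "somajo",
        "tokenize": "somajo",
        "pos": "stanza",
        "lemma": "stanza",
    },
}


def resolve_toolchain(options):
    """Return a task -> tool mapping for the requested toolchain."""
    mode = options.get("toolchain", "fast")
    if mode in PRESETS:
        return dict(PRESETS[mode])
    if mode == "manual":
        tools = options.get("tools", {})
        unknown = [task for task in tools if task not in TASKS]
        if not tools or unknown:
            raise ValueError(f"manual toolchain needs valid tasks, got: {tools}")
        return dict(tools)
    raise ValueError(f"unknown toolchain: {mode!r}")


if __name__ == "__main__":
    print(resolve_toolchain({"toolchain": "accurate"}))
    print(resolve_toolchain({"toolchain": "manual",
                             "tools": {"tokenize": "somajo", "pos": "flair"}}))
```

A real implementation would also need to check that the chosen tool supports the requested annotation and language, which this sketch leaves out.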

Installation

Install the project and its dependencies from PyPI:

pip install nlpannotator

The language models must be installed separately. A convenience script is available that installs all language models for all languages that have been implemented for spaCy and stanza.
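If you prefer to install only the models you need, a small check like the one below may help. It is a sketch under stated assumptions, not the project's convenience script: spaCy models are installed as regular Python packages, so their presence can be tested with importlib, and "python -m spacy download" is spaCy's own download command. The model names en_core_web_sm and de_core_news_sm are examples only; which models nlpannotator expects is not stated here. stanza models are fetched differently (for example with stanza.download) and are not covered.

```python
import importlib.util
import sys


def missing_models(model_names):
    """Return the names that cannot be found as importable packages."""
    return [name for name in model_names
            if importlib.util.find_spec(name) is None]


def spacy_download_command(model_name):
    """Build (but do not run) the spaCy CLI call that installs a model."""
    return [sys.executable, "-m", "spacy", "download", model_name]


if __name__ == "__main__":
    # Example model names; adjust to the languages you actually need.
    wanted = ["en_core_web_sm", "de_core_news_sm"]
    for name in missing_models(wanted):
        print(" ".join(spacy_download_command(name)))
```

The script only prints the commands for models that are missing, so you can review them before running anything.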

Usage

Take a look at the DemoNotebook or run it on Binder.

Download files

Source Distribution

nlpannotator-1.0.2.tar.gz (22.2 kB)

Uploaded Source

Built Distribution

nlpannotator-1.0.2-py3-none-any.whl (26.1 kB)

Uploaded Python 3

File details

Details for the file nlpannotator-1.0.2.tar.gz.

File metadata

  • Download URL: nlpannotator-1.0.2.tar.gz
  • Size: 22.2 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.1 CPython/3.9.7

File hashes

Hashes for nlpannotator-1.0.2.tar.gz

Algorithm    Hash digest
SHA256       16406d8cfb63e0dd86de5dd17ee59074d797c2cf52825edc1853d56ab1c4f0ed
MD5          d6f2e98b5de34f78318f84f9e83cb405
BLAKE2b-256  f768f181c87def7b48b004718ecd15b79e089f83d10d0721baee2805726b6c53

File details

Details for the file nlpannotator-1.0.2-py3-none-any.whl.

File hashes

Hashes for nlpannotator-1.0.2-py3-none-any.whl

Algorithm    Hash digest
SHA256       8569344e1b6d0e7b305e4c145a76b0db51019dba25b794c5bd31f3017eedca8d
MD5          586eeb8563e3d9dfaaa3665d58329ded
BLAKE2b-256  a620ee2c1f21fc58c9e2d246f846fa1f27bcff51dbfab351d3dd521fb371ae79
