NLP, before and after spaCy

Project description

textacy: NLP, before and after spaCy

textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals (tokenization, part-of-speech tagging, dependency parsing, etc.) delegated to another library, textacy focuses primarily on the tasks that come before and after.

Features

  • Convenient entry points to working with one or many documents processed by spaCy, with functionality added via custom extensions
  • Variety of downloadable datasets with both text content and metadata, from Congressional speeches to historical literature to Reddit comments
  • Easy file I/O for streaming data to and from disk
  • Cleaning, normalization, and exploration of raw text before it is processed by spaCy
  • Flexible extraction of words, ngrams, noun chunks, entities, acronyms, key terms, and other elements of interest
  • Tokenization and vectorization of documents, with functionality for training, interpreting, and visualizing topic models
  • String, set, and document similarity comparison by a variety of metrics
  • Calculations for common text statistics, including Flesch-Kincaid Grade Level and multilingual Flesch Reading Ease

... and more!
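
To make a few of the listed features concrete, here is a minimal sketch in plain Python of n-gram extraction, set similarity, and the two readability statistics named above. It does not use textacy and is not its implementation: the function names are hypothetical, the readability coefficients are the standard English ones (the multilingual variants are not covered), and word, sentence, and syllable counts are assumed to be computed elsewhere.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token tuples from a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(a & b) / len(a | b)


def flesch_reading_ease(n_words, n_sents, n_syllables):
    """Flesch Reading Ease with the standard English coefficients."""
    return 206.835 - 1.015 * (n_words / n_sents) - 84.6 * (n_syllables / n_words)


def flesch_kincaid_grade_level(n_words, n_sents, n_syllables):
    """Flesch-Kincaid Grade Level with the standard English coefficients."""
    return 0.39 * (n_words / n_sents) + 11.8 * (n_syllables / n_words) - 15.59


if __name__ == "__main__":
    tokens = "the quick brown fox".split()
    print(ngrams(tokens, 2))
    print(jaccard({"a", "b", "c"}, {"b", "c", "d"}))
    # Counts below are made up purely for illustration.
    print(flesch_reading_ease(100, 5, 150))
    print(flesch_kincaid_grade_level(100, 5, 150))
```

In practice the token lists and counts would come from a spaCy-processed document rather than from `str.split()`, which is only a stand-in here.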

Maintainer

Howdy, y'all. 👋

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

textacy-0.8.0.tar.gz (188.6 kB view details)

Uploaded Source

Built Distribution

textacy-0.8.0-py2.py3-none-any.whl (186.1 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file textacy-0.8.0.tar.gz.

File metadata

  • Download URL: textacy-0.8.0.tar.gz
  • Upload date:
  • Size: 188.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.0

File hashes

Hashes for textacy-0.8.0.tar.gz
Algorithm Hash digest
SHA256 959db2b6c05ed7b7ebdc0b8e87675d43a85d3514dc0495605ffa6ac413f77bbc
MD5 27c9a7d3e86373395d69d43d5f1f5382
BLAKE2b-256 e0952d50dc12968a06c88ab803ea1f3f0e87f2e9a1674e54badeb19a467a92ab

See more details on using hashes here.
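
A published digest can be checked against a downloaded file before installing it. The sketch below uses only the standard library; the helper names and the local file path in the usage comment are assumptions for illustration, and the expected value is the SHA256 digest listed above for textacy-0.8.0.tar.gz.

```python
import hashlib
import hmac

# SHA256 digest listed above for textacy-0.8.0.tar.gz
EXPECTED_SHA256 = "959db2b6c05ed7b7ebdc0b8e87675d43a85d3514dc0495605ffa6ac413f77bbc"


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_hex=EXPECTED_SHA256):
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


# Usage (hypothetical local path to the downloaded archive):
# print(matches_expected("textacy-0.8.0.tar.gz"))
```

pip can perform the same check itself when requirements are pinned with `--hash` entries, which may be preferable to a hand-rolled script.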

File details

Details for the file textacy-0.8.0-py2.py3-none-any.whl.

File metadata

  • Download URL: textacy-0.8.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 186.1 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/1.13.0 pkginfo/1.5.0.1 requests/2.21.0 setuptools/41.0.1 requests-toolbelt/0.9.1 tqdm/4.31.1 CPython/3.7.0

File hashes

Hashes for textacy-0.8.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 54e447a33c9437cdc795e900e97406d4244d2034abe4b88d3b20b314d6178bb5
MD5 409abe07fc9f4396cd39af1b232d7a5e
BLAKE2b-256 9d67787a815029800ef125a0aaa4a441f489b5b958818a705862642d6791388a
