Micro-library to normalize text strings

normality

Normality is a Python micro-package that bundles a small set of text normalization functions for easy re-use. These functions accept a snippet of Unicode or UTF-8 encoded text and remove various classes of characters, such as diacritics and punctuation. This is useful as a preparation step for further text analysis.

WARNING: This library works much better when used in combination with pyicu, a Python binding for the International Components for Unicode (ICU) C library. ICU provides considerably better text transliteration than text-unidecode, the default used when pyicu is not installed.
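To see which situation applies in a given environment, one can check whether the ICU binding is importable. This is a minimal sketch using only the standard library; the helper name `icu_available` is hypothetical and not part of normality, and it relies only on the fact that pyicu installs a module named `icu`.

```python
import importlib.util


def icu_available() -> bool:
    """Return True if the `icu` module (installed by pyicu) can be found.

    Hypothetical helper for illustration; normality does not expose it.
    """
    return importlib.util.find_spec("icu") is not None


if __name__ == "__main__":
    if icu_available():
        print("pyicu found: ICU transliteration can be used.")
    else:
        print("pyicu not found: expect the text-unidecode default.")
```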

Example

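# coding: utf-8
from normality import normalize, slugify, collapse_spaces

# normalize() lowercases the text and strips diacritics and punctuation.
text = normalize('Nie wieder "Grüne Süppchen" kochen!')
assert text == 'nie wieder grune suppchen kochen'

# slugify() produces a hyphen-separated slug, e.g. for use in URLs.
slug = slugify('My first blog post!')
assert slug == 'my-first-blog-post'

# collapse_spaces() replaces each run of whitespace with a single space.
text = 'this \n\n\r\nhas\tlots of \nodd spacing.'
assert collapse_spaces(text) == 'this has lots of odd spacing.'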
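
For readers curious about what this kind of normalization involves, here is a rough standard-library sketch. It is not normality's implementation: the function names `sketch_collapse_spaces` and `sketch_normalize` are hypothetical, and the approach (Unicode decomposition plus regular expressions) only handles accented Latin characters. It does no transliteration of other scripts, which is where ICU comes in.

```python
import re
import unicodedata


def sketch_collapse_spaces(text: str) -> str:
    """Replace every run of whitespace with one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def sketch_normalize(text: str) -> str:
    """Lowercase, drop diacritics and punctuation, collapse whitespace."""
    # NFKD splits 'ü' into 'u' plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Replace anything that is not a word character or whitespace.
    # Note: \w also matches '_', so underscores survive in this sketch.
    no_punct = re.sub(r"[^\w\s]", " ", stripped.lower())
    return sketch_collapse_spaces(no_punct)


if __name__ == "__main__":
    print(sketch_normalize('Nie wieder "Grüne Süppchen" kochen!'))
    print(sketch_collapse_spaces("this \n\n\r\nhas\tlots of \nodd spacing."))
```

For real use, prefer the library functions shown in the example, which cover more character classes and can use ICU for transliteration.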

License

normality is open source, licensed under a standard MIT license (included in this repository as LICENSE).

Download files

Source Distribution

normality-2.3.2.tar.gz (10.7 kB view details)

Uploaded Source

Built Distribution

normality-2.3.2-py2.py3-none-any.whl (12.8 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file normality-2.3.2.tar.gz.

File metadata

  • Download URL: normality-2.3.2.tar.gz
  • Upload date:
  • Size: 10.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.9 tqdm/4.63.1 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.11

File hashes

Hashes for normality-2.3.2.tar.gz
Algorithm Hash digest
SHA256 e0a23c6caf1831fe790061be242b91b7b4f19404085215c1272656d9b02116c4
MD5 ae7e0ba4932459936b6d699ce7e83525
BLAKE2b-256 25f7bb33ae0e389b9eee41e77d86264304fc04bfefa9fec45d5d0a6fe6f07f2f

File details

Details for the file normality-2.3.2-py2.py3-none-any.whl.

File metadata

  • Download URL: normality-2.3.2-py2.py3-none-any.whl
  • Upload date:
  • Size: 12.8 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.8.0 pkginfo/1.8.2 readme-renderer/34.0 requests/2.27.1 requests-toolbelt/0.9.1 urllib3/1.26.9 tqdm/4.63.1 importlib-metadata/4.11.3 keyring/23.5.0 rfc3986/2.0.0 colorama/0.4.4 CPython/3.9.11

File hashes

Hashes for normality-2.3.2-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 111335a42ef8106ab0f8e769290bdc13d2c201926bc031cb6f0564c4877bf406
MD5 8054e21635e0dab952a5b55c6b2232df
BLAKE2b-256 05d3c9b1ab4d474947b06cbc4dfc3e9ed412ed9366a6a0d094b0012b30ac4cf6
