Micro-library to normalize text strings

Project description

normality text cleanup

Normality is a Python micro-package containing a small set of text normalization functions for easy re-use. These functions accept a snippet of Unicode or UTF-8 encoded text and remove various classes of characters, such as diacritics and punctuation. This is useful as preparation for further text analysis.

WARNING: This library works much better in combination with pyicu, a Python binding for the International Components for Unicode (ICU) C library. ICU provides much better text transliteration than the default fallback, text-unidecode.
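
Because the quality of transliteration depends on whether pyicu is installed, it can help to check the environment before comparing results. The following is a minimal sketch that assumes pyicu exposes its bindings as the `icu` module; the helper name `icu_available` is hypothetical and not part of normality, and the check says nothing about how normality itself selects a backend.

```python
import importlib.util


def icu_available():
    """Return True if the `icu` module (provided by pyicu) can be found.

    Hypothetical helper for illustration; not part of normality.
    """
    # find_spec looks the module up without importing it.
    return importlib.util.find_spec("icu") is not None


print("pyicu installed:", icu_available())
```

If this prints `False`, transliteration presumably falls back to text-unidecode, as described in the warning above.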

Example

# coding: utf-8
from normality import normalize, slugify, collapse_spaces

text = normalize('Nie wieder "Grüne Süppchen" kochen!')
assert text == 'nie wieder grune suppchen kochen'

slug = slugify('My first blog post!')
assert slug == 'my-first-blog-post'

text = 'this \n\n\r\nhas\tlots of \nodd spacing.'
assert collapse_spaces(text) == 'this has lots of odd spacing.'
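
To make the steps behind these calls more concrete, here is a rough, standard-library-only approximation of the kind of processing described above (removing diacritics, dropping punctuation, collapsing whitespace). It is a sketch under stated assumptions, not normality's actual implementation: the function names `strip_diacritics`, `rough_collapse_spaces` and `rough_normalize` are hypothetical, and plain Unicode decomposition handles far fewer scripts than ICU or text-unidecode transliteration.

```python
import re
import unicodedata


def strip_diacritics(text):
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def rough_collapse_spaces(text):
    """Replace every run of whitespace with a single space."""
    return re.sub(r"\s+", " ", text).strip()


def rough_normalize(text):
    """Lowercase, strip diacritics, and turn non-alphanumerics into spaces."""
    text = strip_diacritics(text).lower()
    # Keep letters (category L*) and numbers (category N*); blank out the rest.
    kept = [ch if unicodedata.category(ch)[0] in ("L", "N") else " " for ch in text]
    return rough_collapse_spaces("".join(kept))


print(rough_normalize('Nie wieder "Grüne Süppchen" kochen!'))
print(rough_collapse_spaces('this \n\n\r\nhas\tlots of \nodd spacing.'))
```

For these two inputs the sketch produces the same strings as the assertions in the example, but it should not be expected to match the library for arbitrary text.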

License

normality is open source, licensed under a standard MIT license (included in this repository as LICENSE).

Download files

Source Distribution

normality-2.5.0.tar.gz (17.9 kB)

Uploaded Source

Built Distribution

normality-2.5.0-py2.py3-none-any.whl (16.5 kB)

Uploaded Python 2 Python 3

File details

Details for the file normality-2.5.0.tar.gz.

File metadata

  • Download URL: normality-2.5.0.tar.gz
  • Upload date:
  • Size: 17.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for normality-2.5.0.tar.gz
Algorithm Hash digest
SHA256 a55133e972b81c4a3bf8b6dc419f262f94a4fd6f636297046f74d35c93abe153
MD5 12f8652756c93117af3c32e54d9747be
BLAKE2b-256 e0126452229afa2331de60fe93324dd9e2eb6034cb2e2faf6867419d9c51d356

File details

Details for the file normality-2.5.0-py2.py3-none-any.whl.

File metadata

  • Download URL: normality-2.5.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 16.5 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.9.18

File hashes

Hashes for normality-2.5.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 d9f48daf32e351e88b9e372787c1da437df9d0d818aec6e2834b02102378df62
MD5 b3cb05cd990e13d1ba497cfe8d6f85a8
BLAKE2b-256 ae29cdd620678624e76de4034d1d69eb978cae4a96983dde963586f711261196
