Collate textual sources with relaxed spelling.

Project description

Collates textual sources with relaxed spelling. Uses Gotoh’s variant of the Needleman-Wunsch sequence alignment algorithm.
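To illustrate the kind of algorithm named above, here is a minimal sketch of global alignment scoring with affine gap penalties in the style of Gotoh. It is not the package's implementation: the function name `gotoh_score` and the parameters `gap_open` and `gap_extend` are hypothetical and chosen only for this example.

```python
from math import inf


def gotoh_score(a, b, similarity, gap_open=-1.0, gap_extend=-0.5):
    """Return the best global alignment score of sequences a and b.

    Illustrative sketch only; names and defaults are assumptions.
    """
    n, m = len(a), len(b)
    # match[i][j]: best score where a[i-1] is aligned to b[j-1]
    # gap_b[i][j]: best score where a[i-1] is aligned to a gap
    # gap_a[i][j]: best score where b[j-1] is aligned to a gap
    match = [[-inf] * (m + 1) for _ in range(n + 1)]
    gap_b = [[-inf] * (m + 1) for _ in range(n + 1)]
    gap_a = [[-inf] * (m + 1) for _ in range(n + 1)]

    match[0][0] = 0.0
    for i in range(1, n + 1):
        gap_b[i][0] = gap_open + (i - 1) * gap_extend
    for j in range(1, m + 1):
        gap_a[0][j] = gap_open + (j - 1) * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match[i][j] = similarity(a[i - 1], b[j - 1]) + max(
                match[i - 1][j - 1], gap_b[i - 1][j - 1], gap_a[i - 1][j - 1]
            )
            # Opening a gap costs gap_open; continuing one costs gap_extend.
            gap_b[i][j] = max(
                match[i - 1][j] + gap_open,
                gap_b[i - 1][j] + gap_extend,
                gap_a[i - 1][j] + gap_open,
            )
            gap_a[i][j] = max(
                match[i][j - 1] + gap_open,
                gap_a[i][j - 1] + gap_extend,
                gap_b[i][j - 1] + gap_open,
            )

    return max(match[n][m], gap_b[n][m], gap_a[n][m])
```

The three tables are what distinguish the affine-gap variant from plain Needleman-Wunsch: a long gap is charged one opening penalty plus cheaper extensions, so inserted or omitted runs of words are penalized less than the same number of isolated gaps.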

$ pip install super-collator

>>> from super_collator.aligner import Aligner
>>> from super_collator.ngrams import NGrams
>>> from super_collator.super_collator import to_table

>>> aligner = Aligner(-0.5, -0.5, -0.5)
>>> a = "Lorem ipsum dollar amat adipiscing elit"
>>> b = "qui dolorem ipsum quia dolor sit amet consectetur adipisci velit"
>>>
>>> a = [NGrams(s).load(s, 3) for s in a.split()]
>>> b = [NGrams(s).load(s, 3) for s in b.split()]
>>>
>>> a, b, score = aligner.align(a, b, NGrams.similarity, lambda: NGrams("-"))
>>> print(to_table(list(map(str, a)), list(map(str, b))))  # doctest: +NORMALIZE_WHITESPACE
-   Lorem   ipsum -    dollar -   amat -           adipiscing elit
qui dolorem ipsum quia dolor  sit amet consectetur adipisci   velit
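The example above aligns words whose spellings differ ("dollar" with "dolor", "amat" with "amet") because the words are compared by their character trigrams. The following sketch shows one way such a comparison could work, using Jaccard overlap of trigram sets. This is an assumption made for illustration and not necessarily the formula behind `NGrams.similarity`; the helper names `trigrams` and `trigram_similarity` are hypothetical.

```python
def trigrams(word, n=3):
    """Return the set of character n-grams of a word, padded with spaces."""
    padded = " " * (n - 1) + word.lower() + " " * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def trigram_similarity(a, b):
    """Jaccard overlap of the two trigram sets, between 0.0 and 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)
```

Under this measure "dollar" and "dolor" share several trigrams and score above zero, while "dollar" and "quia" share none, which is the kind of graded signal an aligner needs to pair near-matches.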

Documentation: https://cceh.github.io/super-collator/

PyPI: https://pypi-hypernode.com/project/super-collator/

Download files

Source Distribution

super_collator-0.0.5.tar.gz (37.1 kB)

Uploaded Source

Built Distribution

super_collator-0.0.5-py3-none-any.whl (19.7 kB)

Uploaded Python 3

File details

Details for the file super_collator-0.0.5.tar.gz.

File metadata

  • Download URL: super_collator-0.0.5.tar.gz
  • Size: 37.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.6

File hashes

Hashes for super_collator-0.0.5.tar.gz
Algorithm    Hash digest
SHA256       e0306f48131d70ca7e26dff1b022a2737c0769352b0024e51a8e5c89d333b651
MD5          b0f2e0d8a278f374fb67453da7320291
BLAKE2b-256  817963e2dc885651154f9ecf165400655e749a3d9a208bd5fea006acdf74b0a3

File details

Details for the file super_collator-0.0.5-py3-none-any.whl.

File hashes

Hashes for super_collator-0.0.5-py3-none-any.whl
Algorithm    Hash digest
SHA256       50b476f0c7980078c5bdc78491b3bf0d9573fc24d57fc2b30e309e1435bdc334
MD5          089c12a8887931d66d06990fd723ecb5
BLAKE2b-256  9681ea6abace7b271a04a5c47561f62006cbd2c952f8703e3ade00290bdd9143
