A Generalized Suffix Tree for any iterable, with Lowest Common Ancestor retrieval

A Generalized Suffix Tree for any Python sequence, with Lowest Common Ancestor retrieval.

$ pip install suffix-tree

>>> from suffix_tree import Tree

>>> tree = Tree({"A": "xabxac"})
>>> tree.find("abx")
True
>>> tree.find("abc")
False

This suffix tree:

  • works with any Python sequence, not just strings, if the items are hashable,

  • is a generalized suffix tree for sets of sequences,

  • is implemented in pure Python,

  • builds the tree in time proportional to the length of the input,

  • does constant-time Lowest Common Ancestor retrieval.

Because it is implemented in pure Python, this tree is neither very fast nor memory efficient. Building the tree takes time proportional to the total length of the input sequences. A query takes time proportional to the length of the query sequence.
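To make the query behaviour concrete, here is a minimal sketch of an uncompressed generalized suffix trie. It is not this package's implementation: the class NaiveSuffixTrie and its internals are hypothetical, and it builds in quadratic time and space rather than linear. It only illustrates why a lookup costs one step per query item and why any sequence of hashable items works.

```python
class NaiveSuffixTrie:
    """Uncompressed generalized suffix trie (illustration only, quadratic build)."""

    def __init__(self, sequences=None):
        self.root = self._new_node()
        for seq_id, seq in (sequences or {}).items():
            self.add(seq_id, seq)

    @staticmethod
    def _new_node():
        # children: item -> node; ids: sequences whose suffixes pass through here
        return {"children": {}, "ids": set()}

    def add(self, seq_id, seq):
        """Insert every suffix of seq, tagging each visited node with seq_id."""
        self.root["ids"].add(seq_id)
        for start in range(len(seq)):
            node = self.root
            for item in seq[start:]:
                node = node["children"].setdefault(item, self._new_node())
                node["ids"].add(seq_id)

    def _walk(self, query):
        """Follow query from the root, one dict lookup per item."""
        node = self.root
        for item in query:
            node = node["children"].get(item)
            if node is None:
                return None
        return node

    def find(self, query):
        """True if query occurs in any stored sequence."""
        return self._walk(query) is not None

    def find_id(self, seq_id, query):
        """True if query occurs in the sequence stored under seq_id."""
        node = self._walk(query)
        return node is not None and seq_id in node["ids"]


trie = NaiveSuffixTrie({"A": "xabxac", "B": "awyawxawxz"})
print(trie.find("abx"))
print(trie.find_id("B", "abx"))

# Items only need to be hashable, so a list of words works as well.
words = NaiveSuffixTrie({"S": ["to", "be", "or", "not", "to", "be"]})
print(words.find(["not", "to"]))
```

A real suffix tree compresses chains of single-child nodes into edges, which is what makes linear-time construction and linear memory possible. The sketch above skips that step for brevity.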

For best performance, turn on the Python optimizer: python -O.
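As background, the -O flag makes the interpreter skip assert statements and set the built-in constant __debug__ to False. Presumably that is where the speedup comes from, although the text above does not say so. The following sketch only demonstrates the flag's effect on __debug__; the helper name debug_flag is made up for this example.

```python
import subprocess
import sys


def debug_flag(*interpreter_flags):
    """Run a child interpreter with the given flags and report its __debug__ value."""
    result = subprocess.run(
        [sys.executable, *interpreter_flags, "-c", "print(__debug__)"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Without flags this is normally "True" (unless PYTHONOPTIMIZE is set).
print(debug_flag())
# With -O, assertions are disabled and __debug__ is False.
print(debug_flag("-O"))
```

To apply this to your own program, start it as python -O your_script.py (the script name is a placeholder).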

Documentation: https://cceh.github.io/suffix-tree/

PyPI: https://pypi-hypernode.com/project/suffix-tree/

Usage examples:

>>> from suffix_tree import Tree
>>> tree = Tree()
>>> tree.add(1, "xabxac")
>>> tree.add(2, "awyawxawxz")
>>> tree.find("abx")
True
>>> tree.find("awx")
True
>>> tree.find("abc")
False

>>> tree = Tree({"A": "xabxac", "B": "awyawxawxz"})
>>> tree.find_id("A", "abx")
True
>>> tree.find_id("B", "abx")
False
>>> tree.find_id("B", "awx")
True

>>> tree = Tree(
...     {
...         "A": "sandollar",
...         "B": "sandlot",
...         "C": "handler",
...         "D": "grand",
...         "E": "pantry",
...     }
... )
>>> for k, length, path in tree.common_substrings():
...     print(k, length, path)
...
2 4 s a n d
3 3 a n d
4 3 a n d
5 2 a n

>>> tree = Tree({"A": "xabxac", "B": "awyawxawxz"})
>>> for C, path in sorted(tree.maximal_repeats()):
...     print(C, path)
...
1 a w
1 a w x
2 a
2 x
2 x a
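
As a cross-check on the common_substrings() output above, here is a brute-force sketch that finds, for each k, the longest substring shared by at least k of the sequences. The function naive_common_substrings is hypothetical and is not part of the package. It enumerates every substring, so it is only usable on tiny inputs.

```python
from collections import defaultdict


def naive_common_substrings(sequences):
    """Map each k >= 2 to one longest substring found in at least k sequences."""
    # owners: substring (as a tuple of items) -> ids of the sequences containing it
    owners = defaultdict(set)
    for seq_id, seq in sequences.items():
        for i in range(len(seq)):
            for j in range(i + 1, len(seq) + 1):
                owners[tuple(seq[i:j])].add(seq_id)

    best = {}
    for sub, ids in owners.items():
        # A substring found in len(ids) sequences qualifies for every k up to that.
        for k in range(2, len(ids) + 1):
            if k not in best or len(sub) > len(best[k]):
                best[k] = sub
    return best


best = naive_common_substrings(
    {
        "A": "sandollar",
        "B": "sandlot",
        "C": "handler",
        "D": "grand",
        "E": "pantry",
    }
)
for k in sorted(best):
    print(k, len(best[k]), " ".join(best[k]))
```

The lengths it reports for k = 2 to 5 are 4, 3, 3 and 2, matching the listing above. For k = 2 there is a tie between "sand" (in A and B) and "andl" (in B and C), so which of the two an implementation reports depends on how it breaks ties.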
