Skip to main content

Algorithms and data structures for my Python projects

Project description

ddalg

Algorithms and data structures for my Python projects.

Build Status

Interval tree

Interval tree is a data structure for holding intervals and to allow efficiently find intervals that overlap with given interval or point. Read more on Wikipedia.

Implementation note

This implementation uses half-open intervals, where begin coordinate is excluded. Half-open intervals are used in e.g. BED genomic format.

The current implementation needs to rebuild the tree after each insert, hence the tree is not efficient for using in read/write fashion.

Usage

  • implement your custom interval object while extending Interval. Two properties need to be overwritten:
    • begin - 0-based (excluded) begin coordinate of the interval
    • end - 0-based (included) end coordinate of the interval
      from ddalg.model import Interval
      
      class YourInterval(Interval):
      
        def __init__(self, begin: int, end: int):
          self._begin = begin
          self._end = end
          # anything else
      
        @property
        def begin(self):
          return self._begin
      
        @property
        def end(self):
          return self._end
      
  • create a collection of your intervals and store them in the interval tree:
    from ddalg.itree import IntervalTree
     
    itree = IntervalTree([YourInterval(0, 3), YourInterval(1, 4)])
    # ... 
    
  • query itree:
    • using 1-based position:
      itree.search(1)
      

      returns (0,3)

    • using half-open interval coordinates:
      itree.get_overlaps(0, 1) 
      

      returns (0,3), effectively the same query as above

    • for intervals with minimal required coverage
      itree.fuzzy_query(0, 1, coverage=.90)
      

      return intervals with >=.9 overlap with respect to query coordinates

    • for intervals with minimal jaccard index
      itree.jaccard_query(0, 1, min_jaccard=.90)
      

      return intervals having jaccard_index>=.9 with respect to query coordinates

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

ddalg-0.0.3.post0.tar.gz (8.6 kB view details)

Uploaded Source

Built Distribution

ddalg-0.0.3.post0-py3.6.egg (27.1 kB view details)

Uploaded Source

File details

Details for the file ddalg-0.0.3.post0.tar.gz.

File metadata

  • Download URL: ddalg-0.0.3.post0.tar.gz
  • Upload date:
  • Size: 8.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.1.post20200323 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.3

File hashes

Hashes for ddalg-0.0.3.post0.tar.gz
Algorithm Hash digest
SHA256 845b57d46225d583e830a8dcf43975047a2fc8eb383dd30950623d3c4cd9d4a3
MD5 7e3b06c84b4b6b80ecb4a3bb0581bfc8
BLAKE2b-256 c7a3eaea90e782baedb5be91839f97308bb2ea41591ca0d38d34537f92fce284

See more details on using hashes here.

File details

Details for the file ddalg-0.0.3.post0-py3.6.egg.

File metadata

  • Download URL: ddalg-0.0.3.post0-py3.6.egg
  • Upload date:
  • Size: 27.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.1.1 pkginfo/1.5.0.1 requests/2.23.0 setuptools/46.1.1.post20200323 requests-toolbelt/0.9.1 tqdm/4.43.0 CPython/3.7.3

File hashes

Hashes for ddalg-0.0.3.post0-py3.6.egg
Algorithm Hash digest
SHA256 f548a3600ef4a4924cdf59a27b0049639881af618f9b9349df5ecee1a3d3389a
MD5 cba08ae69b280f69454ed290b07a0cbc
BLAKE2b-256 a21041071dc7a9e8c93cfaf955cab698e804ad43aafae1a938f3968d795a3e9c

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page