Python bindings to the UCSC source for Big Binary Indexed (bigWig/bigBed) files.

pybbi

A Python interface, built with Cython, to Jim Kent's Big Binary Indexed (BBI) file library [1] from the UCSC Genome Browser source tree.

This provides read-level access to local and remote bigWig and bigBed files but no write capabilities. The main feature is fast retrieval of range queries into numpy arrays.

Installation

Wheels for pybbi are available on PyPI for Python 3.8, 3.9, 3.10, 3.11 on Linux (x86_64 and aarch64) and macOS (x86_64/Intel). Apple Silicon (arm64) wheels will be made available once M1 runners are available in GitHub Actions.

$ pip install pybbi

API

The bbi.open function returns a BBIFile object.

bbi.open(path) -> BBIFile

path can be a local file path (bigWig or bigBed) or a URL. BBIFile objects are context managers and can be used in a with statement to clean up resources without calling BBIFile.close().

>>> import bbi
>>> with bbi.open('bigWigExample.bw') as f:
...     x = f.fetch('chr21', 1000000, 2000000, bins=40)

Introspection

BBIFile.is_bigwig -> bool
BBIFile.is_bigbed -> bool
BBIFile.chromsizes -> OrderedDict
BBIFile.zooms -> list
BBIFile.info -> dict
BBIFile.schema -> dict
BBIFile.read_autosql() -> str

Note: BBIFile.schema['dtypes'] provides numpy data types for the fields in a bigWig or bigBed (matched from the autoSql definition).
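
As a rough illustration of the attributes listed above, the sketch below collects a few of them into a plain dictionary. The helper name summarize is hypothetical and not part of pybbi. It relies only on the attribute names documented in this section and assumes that chromsizes maps chromosome names to lengths. A stand-in object is used so the example runs without a real file.

```python
from collections import OrderedDict
from types import SimpleNamespace


def summarize(f):
    """Collect basic facts about an open BBIFile-like object.

    Only attributes listed in this section are used. `summarize` itself is
    a hypothetical helper, not a pybbi function.
    """
    if f.is_bigwig:
        kind = "bigWig"
    elif f.is_bigbed:
        kind = "bigBed"
    else:
        kind = "unknown"
    chromsizes = f.chromsizes  # assumed: chromosome name -> length in bp
    return {
        "kind": kind,
        "n_chroms": len(chromsizes),
        "genome_size": sum(chromsizes.values()),
        "n_zooms": len(f.zooms),
    }


# Stand-in object so the sketch runs without a real file; with pybbi
# installed you would pass the object returned by bbi.open(path) instead.
fake = SimpleNamespace(
    is_bigwig=True,
    is_bigbed=False,
    chromsizes=OrderedDict([("chr1", 1000), ("chr2", 500)]),
    zooms=[10, 40],  # placeholder entries; only the count is used
)
print(summarize(fake))
```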

Interval output

The actual interval records in a bigWig or bigBed can be retrieved as a pandas DataFrame or as an iterator over record tuples. The pandas output is parsed according to the file's schema.

BBIFile.fetch_intervals(chrom, start, end) -> pandas.DataFrame
BBIFile.fetch_intervals(chrom, start, end, iterator=True) -> interval iterator

Summary bin records at each zoom level are also accessible.

BBIFile.fetch_summaries(chrom, start, end, zoom) -> pandas.DataFrame
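
The iterator form is convenient when records are consumed once and a DataFrame is not needed. The sketch below sums interval lengths from such an iterator. It assumes each record is a tuple that begins with (chrom, start, end); the helper name covered_bases is hypothetical, and the records are made up so the example runs without a file.

```python
def covered_bases(records):
    """Sum the lengths of interval records.

    Assumes each record is a tuple starting with (chrom, start, end), with
    any remaining fields (e.g. a bigWig value) ignored. Overlapping
    intervals are counted more than once.
    """
    total = 0
    for _chrom, start, end, *_rest in records:
        total += end - start
    return total


# Stand-in records so the sketch runs without a file. With pybbi installed,
# the same function could consume
#     f.fetch_intervals('chr21', 1000000, 2000000, iterator=True)
records = [("chr21", 100, 150, 1.5), ("chr21", 150, 300, 0.25)]
print(covered_bases(records))  # 200
```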

Array output

Retrieve quantitative signal as an array. The signal of a bigWig file is obtained from its "value" field. The signal of a bigBed file is obtained from the genomic coverage of its intervals.

For a single range query:

BBIFile.fetch(chrom, start, end, [bins [, missing [, oob [, summary]]]]) -> 1D numpy array
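
When bins is given, the returned array holds one value per bin. The sketch below pairs each value with approximate bin coordinates. It assumes that bin edges are evenly spaced across the queried range, which may not match the exact rounding used internally. The helper binned_with_coords and the FakeFile stand-in are hypothetical and exist only so the example runs without pybbi or data.

```python
def binned_with_coords(f, chrom, start, end, bins):
    """Pair each binned value from f.fetch with approximate bin coordinates.

    Assumption: bin edges are spaced evenly across [start, end); the exact
    rounding used internally by pybbi may differ.
    """
    values = f.fetch(chrom, start, end, bins=bins)
    span = end - start
    edges = [start + (i * span) // bins for i in range(bins + 1)]
    return [(edges[i], edges[i + 1], float(values[i])) for i in range(bins)]


class FakeFile:
    """Stand-in for a BBIFile so the sketch runs without pybbi or data."""

    def fetch(self, chrom, start, end, bins):
        # Returns the bin index as a fake signal value.
        return [float(i) for i in range(bins)]


rows = binned_with_coords(FakeFile(), "chr21", 1000000, 2000000, bins=40)
print(rows[0])   # (1000000, 1025000, 0.0)
print(rows[-1])  # (1975000, 2000000, 39.0)
```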

To produce a stacked heatmap from a list of (1) equal-length intervals or (2) arbitrary-length intervals with bins specified:

BBIFile.stackup(chroms, starts, ends, [bins [, missing [, oob [, summary]]]]) -> 2D numpy array

  • Summary querying is supported by specifying the number of bins for coarsening. The summary statistic can be one of: 'mean', 'min', 'max', 'cov', 'std', or 'sum' (default = 'mean'). Intervals need not have the same length, in which case the data from each interval will be interpolated to the same number of bins (e.g., gene bodies).

  • Missing data can be filled with a custom fill value, missing (default = 0).

  • Out-of-bounds ranges (i.e., start less than zero or end greater than the chromosome length) are permitted because of their utility, e.g., for generating vertical heatmap stacks centered at specific genomic features. A separate custom fill value, oob, can be provided for out-of-bounds positions (default = NaN).
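
To make the fill rules above concrete, the sketch below builds equal-width windows centered on features and applies a toy, base-pair-level model of the missing and oob defaults. Both helpers (centered_windows and toy_fetch) are hypothetical illustrations of the behavior described in this section, not pybbi code, and the signal values are made up.

```python
import math


def centered_windows(centers, flank):
    """Return (starts, ends) for windows of width 2 * flank around centers.

    Starts may be negative and ends may exceed the chromosome length; the
    text above says such out-of-bounds ranges are permitted.
    """
    starts = [c - flank for c in centers]
    ends = [c + flank for c in centers]
    return starts, ends


def toy_fetch(signal, chrom_len, start, end, missing=0.0, oob=math.nan):
    """Toy base-pair model of the fill rules described above (not pybbi code).

    signal: dict mapping position -> value for positions that have data.
    Positions outside [0, chrom_len) get `oob`; in-bounds positions without
    data get `missing`.
    """
    out = []
    for pos in range(start, end):
        if pos < 0 or pos >= chrom_len:
            out.append(oob)
        else:
            out.append(signal.get(pos, missing))
    return out


starts, ends = centered_windows([2, 8], flank=3)
print(starts, ends)  # [-1, 5] [5, 11]

signal = {0: 1.0, 1: 2.0, 9: 5.0}  # made-up data on a 10 bp chromosome
row = toy_fetch(signal, 10, starts[0], ends[0])
print(row)  # [nan, 1.0, 2.0, 0.0, 0.0, 0.0]
```

With a real file, windows built this way could be passed to BBIFile.stackup together with a matching list of chromosome names, since all windows have the same length.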

Function API

The original function-based API is still available:

bbi.is_bbi(path: str) -> bool
bbi.is_bigwig(path: str) -> bool
bbi.is_bigbed(path: str) -> bool
bbi.chromsizes(path: str) -> OrderedDict
bbi.zooms(path: str) -> list
bbi.info(path: str) -> dict
bbi.fetch_intervals(path: str, chrom: str, start: int, end: int, iterator: bool) -> Union[Iterable, pd.DataFrame]
bbi.fetch(path: str, chrom: str, start: int, end: int, [bins: int [, missing: float [, oob: float [, summary: str]]]]) -> np.array[1, 'float64']
bbi.stackup(path: str, chroms: np.array, starts: np.array, ends: np.array, [bins: int [, missing: float [, oob: float [, summary: str]]]]) -> np.array[2, 'float64']

See the docstrings for complete documentation.
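
The is_bbi, is_bigwig and is_bigbed checks above classify a file by type. For local files, a similar check can be approximated by reading the 4-byte magic number at the start of the file. The sketch below is an illustration based on the magic numbers in the UCSC format description, not pybbi's own implementation; sniff_bbi is a hypothetical helper and does not handle URLs.

```python
import os
import struct
import tempfile

BIGWIG_MAGIC = 0x888FFC26  # from the UCSC bigWig format description
BIGBED_MAGIC = 0x8789F2EB  # from the UCSC bigBed format description


def sniff_bbi(path):
    """Return 'bigWig', 'bigBed', or None from the first four bytes of a file.

    Both byte orders are checked because BBI files may be written in
    either endianness. Illustration only; hypothetical helper.
    """
    with open(path, "rb") as fh:
        head = fh.read(4)
    if len(head) < 4:
        return None
    for fmt in ("<I", ">I"):
        (magic,) = struct.unpack(fmt, head)
        if magic == BIGWIG_MAGIC:
            return "bigWig"
        if magic == BIGBED_MAGIC:
            return "bigBed"
    return None


# Demonstrate on a throwaway file that carries only the bigWig magic number.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack("<I", BIGWIG_MAGIC))
demo_path = tmp.name
print(sniff_bbi(demo_path))  # bigWig
os.remove(demo_path)
```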

Related projects

  • libBigWig: Alternative C library for bigWig and bigBed files by Devon Ryan
  • pyBigWig: Python bindings for libBigWig by the same author
  • bw-python: Alternative Python wrapper to libBigWig by Brent Pedersen
  • bx-python: Python bioinformatics library from James Taylor's group that includes tools for bbi files.

This library provides bindings to the reference UCSC bbi library code. Check out @dpryan79's libBigWig for an alternative and dedicated C library for big binary files. pyBigWig also provides numpy-based retrieval and bigBed support.

References

[1]: http://bioinformatics.oxfordjournals.org/content/26/17/2204.full

From source

If wheels for your platform or Python version aren't available or you want to develop, you'll need to install pybbi from source. The source distribution on PyPI ships with (slightly modified) kent utils source, which will compile before the extension module is built.

Requires

  • Platform: Linux or Darwin (Windows Subsystem for Linux seems to work too)
  • pthreads, zlib, libpng, openssl, make, pkg-config
  • Python 3.6+
  • numpy and cython

For example, on a fresh Ubuntu instance, you'll need build-essential, make, pkg-config, zlib1g-dev, libssl-dev, libpng16-dev.

On a CentOS/RedHat (rpm) system you'll need gcc, make, pkg-config, zlib-devel, openssl-devel, libpng-devel.

On a Mac, you'll need Xcode and to brew install pkg-config openssl libpng.

For development, clone the repo and install in editable mode:

$ git clone https://github.com/nvictus/pybbi.git
$ cd pybbi
$ pip install -e .

You can use the ARCH environment variable to specify a target architecture or ARCHFLAGS on a Mac.

Notes

Unfortunately, Kent's C source is not well-behaved library code, as it is littered with error calls that call exit(). pybbi will catch and pre-empt common input errors, but if somehow an internal error does get raised, it will terminate your interpreter instance.
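
Because an internal error can terminate the interpreter, one defensive pattern (a general technique, not something pybbi provides) is to run risky queries in a child process so that only the child dies. The sketch below shows the idea with a simulated hard exit; run_isolated is a hypothetical helper, and the child snippet stands in for code that would open a file with bbi and print its results.

```python
import subprocess
import sys


def run_isolated(code, timeout=60):
    """Run a Python snippet in a child interpreter.

    Returns (returncode, stdout, stderr). If C code in the child calls
    exit(), only the child terminates; the caller sees a return code.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout, proc.stderr


# Simulate a hard exit in the child; the parent keeps running.
rc, out, err = run_isolated(
    "import os; print('before', flush=True); os._exit(3)"
)
print(rc, out.strip())  # 3 before
```

A nonzero return code then signals that the query failed, and the parent can log the captured stderr and continue.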

Download files

Source Distribution

  • pybbi-0.4.0.tar.gz (34.7 MB): Source

Built Distributions

  • pybbi-0.4.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.4 MB): CPython 3.11, manylinux (glibc 2.17+), x86-64
  • pybbi-0.4.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.1 MB): CPython 3.11, manylinux (glibc 2.17+), ARM64
  • pybbi-0.4.0-cp311-cp311-macosx_11_0_arm64.whl (3.0 MB): CPython 3.11, macOS 11.0+, ARM64
  • pybbi-0.4.0-cp311-cp311-macosx_10_9_x86_64.whl (2.7 MB): CPython 3.11, macOS 10.9+, x86-64
  • pybbi-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.4 MB): CPython 3.10, manylinux (glibc 2.17+), x86-64
  • pybbi-0.4.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.1 MB): CPython 3.10, manylinux (glibc 2.17+), ARM64
  • pybbi-0.4.0-cp310-cp310-macosx_11_0_arm64.whl (3.0 MB): CPython 3.10, macOS 11.0+, ARM64
  • pybbi-0.4.0-cp310-cp310-macosx_10_9_x86_64.whl (2.7 MB): CPython 3.10, macOS 10.9+, x86-64
  • pybbi-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.4 MB): CPython 3.9, manylinux (glibc 2.17+), x86-64
  • pybbi-0.4.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.1 MB): CPython 3.9, manylinux (glibc 2.17+), ARM64
  • pybbi-0.4.0-cp39-cp39-macosx_11_0_arm64.whl (3.0 MB): CPython 3.9, macOS 11.0+, ARM64
  • pybbi-0.4.0-cp39-cp39-macosx_10_9_x86_64.whl (2.7 MB): CPython 3.9, macOS 10.9+, x86-64

File details

Details for the file pybbi-0.4.0.tar.gz.

File metadata

  • Download URL: pybbi-0.4.0.tar.gz
  • Upload date:
  • Size: 34.7 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.0.0 CPython/3.9.19

File hashes

Hashes for pybbi-0.4.0.tar.gz
Algorithm Hash digest
SHA256 afafbc23f16f789344a454fce619af5556a67a2e11b7c36de8fc6f091e2478fc
MD5 a9ae4e734cbff584986b0d1e21368a3a
BLAKE2b-256 2dad88f09f16490fdd24c48693bac891e788a31d6f0aa2955f8021dfc8df8a48

Hashes for pybbi-0.4.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 c76c35064825285024f38e132ea712a6da6b77284cb887ca8a8778e6774e7676
MD5 13ad9484c70dbf15462ee56f75fa5dd5
BLAKE2b-256 37c1d01a39b9444b10c814f3b5f055210331f7f4a601dd54d5510664823fe064

Hashes for pybbi-0.4.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 7274cb6700e1575036bbebc4718043057f534dd391a51eaab377f32c5b130c94
MD5 8306e709e3776ee338478ac37518d5e1
BLAKE2b-256 34a9c7625a5e9b32ab92d66c8f76a4d3fb0af0f7314bc8d5d039993492e438d8

Hashes for pybbi-0.4.0-cp311-cp311-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 f9b5b4ed4a576aef050ec29bd4a482c13a0fd612cc97175fd1f2785b0ebe1d80
MD5 16379df50a9d9a0f47be9e39f1bdb214
BLAKE2b-256 c2d2c4507e6b84d4f3dbb7e7b50824c021fb309f5f91ab5cf81c3ec4a32facf4

Hashes for pybbi-0.4.0-cp311-cp311-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 2f763d0297965ca10d1b7844c83a9a844398c897da3fde1cc03e606cfa5aecc3
MD5 acc0239404ef5f8fe25118d28aa629f5
BLAKE2b-256 d556f1380ebc5fc4abd58ab499f0700a2380f3d6675ae511b6b8bc776bbfe453

Hashes for pybbi-0.4.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 9d1510879a23c88c9f462278861786d1fe23479bcf86c31216c9d46e1e342d68
MD5 a01d64a7705d5343001297ba34082250
BLAKE2b-256 665f4d281f439d41dfe2a035eb78e3a6957893525c4ce97ff48333a37c72d819

Hashes for pybbi-0.4.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 9eee59077696af1a58e7d87210cf354757b305f7b9fea1653d054bd245b382ed
MD5 498ed4b9425f1312cfac553173b121df
BLAKE2b-256 0e618f08b95d0f8c2dadd4f68d1be2d59fa219257c754bfb81220fc7d76b2438

Hashes for pybbi-0.4.0-cp310-cp310-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 a3d41cf57abe07cf881d8593388b2f80740797ee9289797cc6c436acb3f4500d
MD5 c3e9b4c8d0e8d6116ddde726f32bc154
BLAKE2b-256 3e5cab984a4ec769ec8fa850c2c22c290f4af7ba081f41eb6be959e942ebfb6a

Hashes for pybbi-0.4.0-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 b90be7899be17ee67ab976d3092e5f4f60ebfb3cc80350aeacda7fa27340aa2b
MD5 170ade73f087b8a93004b3277830788c
BLAKE2b-256 8b90caa02f2d97cff8707b33510058c3a168a53e73d217db52d15289c13f6315

Hashes for pybbi-0.4.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Algorithm Hash digest
SHA256 70d1d9ee0649bd73b2c5ee88351d28880189cdfdab1a56e8304d0ae05d03eb51
MD5 8fc48746053803979f59354aa6307bc2
BLAKE2b-256 4ccaae2468fb169825a55a28aa3c9271661116b5310c210a01dbdc0b9518b1bb

Hashes for pybbi-0.4.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 08c3f42f1e9632de3b5dc49c75ee630e1382ae2622be45e09485f5d0c8d1496a
MD5 f574286a43366e402396c2639c6f9683
BLAKE2b-256 60b93ee082c5215d2d397db565de99dbc6348fdc4bd3a3e7e15e593eb6c4849f

Hashes for pybbi-0.4.0-cp39-cp39-macosx_11_0_arm64.whl
Algorithm Hash digest
SHA256 64c8a5122eb6841736636c3a84d17183991201d200db405a79fd8da3aa898346
MD5 15c0d2a2fde8ac7deb3cb46ab91f8047
BLAKE2b-256 5cb845d06937a2e5d8e269f23d244a1dfffc789d499fc231f0cfaed12ea10910

Hashes for pybbi-0.4.0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 4eb3ccb77ea0f90db8d3c6322e7c79952468ef56d65e43270b6466b37b97b260
MD5 cf910809e6288965a9e1c816675a9ee2
BLAKE2b-256 fd974a606903bfb8af7a80dffcd7d49295998974ae42d3db23e4e2d775888948
