
Python bindings to the UCSC Big Binary (bigWig/bigBed) file library.


pybbi


Python interface to Jim Kent's Big Binary Indexed file (BBI) [1] library from the UCSC Genome Browser source tree using Cython.

This provides read-level access to local and remote bigWig and bigBed files but no write capabilities. The main feature is fast retrieval of range queries into numpy arrays.

Installation

Wheels for pybbi are available on PyPI for Python 3.8, 3.9, 3.10, and 3.11 on Linux (x86_64 and aarch64) and macOS (x86_64/Intel). Apple Silicon (arm64) wheels will be made available once M1 runners are available in GitHub Actions.

$ pip install pybbi

API

The bbi.open function returns a BBIFile object.

bbi.open(path) -> BBIFile

path can be a local file path (bigWig or bigBed) or a URL. BBIFile objects are context managers and can be used in a with statement to clean up resources without calling BBIFile.close().

>>> with bbi.open('bigWigExample.bw') as f:
...     x = f.fetch('chr21', 1000000, 2000000, bins=40)

Introspection

BBIFile.is_bigwig -> bool
BBIFile.is_bigbed -> bool
BBIFile.chromsizes -> OrderedDict
BBIFile.zooms -> list
BBIFile.info -> dict
BBIFile.schema -> dict
BBIFile.read_autosql() -> str

Note: BBIFile.schema['dtypes'] provides numpy data types for the fields in a bigWig or bigBed (matched from the autoSql definition).
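As a rough illustration of how the introspection attributes might be combined, the sketch below summarizes an open file. The helper name describe and the stand-in object are hypothetical and exist only so the snippet runs without a file on disk; only is_bigwig, is_bigbed, chromsizes, and zooms come from the list above.

```python
from collections import OrderedDict
from types import SimpleNamespace


def describe(f):
    """Summarize an open BBIFile-like object via its introspection attributes."""
    if f.is_bigwig:
        kind = "bigWig"
    elif f.is_bigbed:
        kind = "bigBed"
    else:
        kind = "unknown"
    sizes = f.chromsizes  # mapping of chromosome name -> length
    return {
        "kind": kind,
        "n_chroms": len(sizes),
        "total_bp": sum(sizes.values()),
        "n_zooms": len(f.zooms),
    }


# Hypothetical stand-in for a BBIFile, with made-up values.
fake = SimpleNamespace(
    is_bigwig=True,
    is_bigbed=False,
    chromsizes=OrderedDict([("chr1", 1000), ("chr2", 500)]),
    zooms=[None, None],  # only the number of zoom levels is used here
)
summary = describe(fake)
```

With a real file, the same helper would be called inside a with bbi.open(path) as f: block.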

Interval output

The actual intervals in a bigWig or bigBed can be retrieved as a pandas dataframe or as an iterator over records as tuples. The pandas output is parsed according to the file's schema.

BBIFile.fetch_intervals(chrom, start, end) -> pandas.DataFrame
BBIFile.fetch_intervals(chrom, start, end, iterator=True) -> interval iterator

Array output

Retrieve quantitative signal as an array. The signal of a bigWig file is obtained from its "value" field. The signal of a bigBed file is obtained from the genomic coverage of its intervals.

For a single range query:

BBIFile.fetch(chrom, start, end, [bins [, missing [, oob [, summary]]]]) -> 1D numpy array

To produce a stacked heatmap from a list of (1) equal-length intervals or (2) arbitrary-length intervals with bins specified:

BBIFile.stackup(chroms, starts, ends, [bins [, missing [, oob [, summary]]]]) -> 2D numpy array

  • Summary querying is supported by specifying the number of bins for coarsening. The summary statistic can be one of: 'mean', 'min', 'max', 'cov', 'std', or 'sum' (default = 'mean'). Intervals need not have the same length, in which case the data from each interval will be interpolated to the same number of bins (e.g., gene bodies).

  • Missing data can be filled with a custom fill value, missing (default = 0).

  • Out-of-bounds ranges (i.e., start less than zero or end greater than the chromosome length) are permitted because they are useful, e.g., for generating vertical heatmap stacks centered at specific genomic features. A separate custom fill value, oob, can be provided for out-of-bounds positions (default = NaN).
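The feature-centered stacking mentioned above could be set up roughly as follows. The helper names centered_windows and stack_around are hypothetical. The windows are allowed to extend past chromosome ends, relying on the oob fill described in the last bullet.

```python
import numpy as np


def centered_windows(centers, flank):
    """Return (starts, ends) for equal-length windows of 2 * flank around each center.

    Starts may be negative and ends may exceed the chromosome length;
    such positions are out of bounds and would receive the oob fill value.
    """
    centers = np.asarray(centers, dtype=np.int64)
    return centers - flank, centers + flank


def stack_around(f, chroms, centers, flank, bins):
    """Call stackup on an open BBIFile-like object for windows around centers."""
    starts, ends = centered_windows(centers, flank)
    return f.stackup(chroms, starts, ends, bins=bins, missing=0.0, oob=np.nan)


# The first window starts before position 0, so part of it is out of bounds.
starts, ends = centered_windows([500, 10_000], flank=1_000)
```

With an open file f, stack_around(f, ['chr21', 'chr21'], [500, 10_000], 1_000, bins=40) would be expected to return one row per window, with NaN where a window runs off the chromosome.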

Function API

The original function-based API is still available:

bbi.is_bbi(path: str) -> bool
bbi.is_bigwig(path: str) -> bool
bbi.is_bigbed(path: str) -> bool
bbi.chromsizes(path: str) -> OrderedDict
bbi.zooms(path: str) -> list
bbi.info(path: str) -> dict
bbi.fetch_intervals(path: str, chrom: str, start: int, end: int, iterator: bool) -> Union[Iterable, pd.DataFrame]
bbi.fetch(path: str, chrom: str, start: int, end: int, [bins: int [, missing: float [, oob: float [, summary: str]]]]) -> np.array[1, 'float64']
bbi.stackup(path: str, chroms: np.array, starts: np.array, ends: np.array, [bins: int [, missing: float [, oob: float [, summary: str]]]]) -> np.array[2, 'float64']

See the docstrings for complete documentation.

Related projects

  • libBigWig: Alternative C library for bigWig and bigBed files by Devon Ryan
  • pyBigWig: Python bindings for libBigWig by the same author
  • bw-python: Alternative Python wrapper to libBigWig by Brent Pedersen
  • bx-python: Python bioinformatics library from James Taylor's group that includes tools for bbi files.

This library provides bindings to the reference UCSC bbi library code. Check out @dpryan79's libBigWig for an alternative and dedicated C library for big binary files. pyBigWig also provides numpy-based retrieval and bigBed support.

References

[1]: http://bioinformatics.oxfordjournals.org/content/26/17/2204.full

From source

If wheels for your platform or Python version aren't available or you want to develop, you'll need to install pybbi from source. The source distribution on PyPI ships with (slightly modified) kent utils source, which will compile before the extension module is built.

Requires

  • Platform: Linux or Darwin (Windows Subsystem for Linux seems to work too)
  • pthreads, zlib, libpng, openssl, make, pkg-config
  • Python 3.6+
  • numpy and cython

For example, on a fresh Ubuntu instance, you'll need build-essential, make, pkg-config, zlib1g-dev, libssl-dev, libpng16-dev.

On a CentOS/RedHat (rpm) system you'll need gcc, make, pkg-config, zlib-devel, openssl-devel, libpng-devel.

On a Mac, you'll need Xcode and to brew install pkg-config openssl libpng.

For development, clone the repo and install in editable mode:

$ git clone https://github.com/nvictus/pybbi.git
$ cd pybbi
$ pip install -e .

You can use the ARCH environment variable to specify a target architecture or ARCHFLAGS on a Mac.

Notes

Unfortunately, Kent's C source is not well-behaved library code, as it is littered with error calls that call exit(). pybbi will catch and pre-empt common input errors, but if somehow an internal error does get raised, it will terminate your interpreter instance.
