vcfnp
=====

Load NumPy arrays from a VCF (Variant Call Format) file.

Installation
------------

Installation requires NumPy and Cython to be installed beforehand.

```
$ pip install vcfnp
```

...or build from source:

```
$ git clone --recursive https://github.com/alimanfoo/vcfnp.git
$ cd vcfnp
$ python setup.py build_ext --inplace
```
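
Since the build needs NumPy and Cython to be present first, a quick check such as the sketch below can confirm that both are importable before running either install route. The helper name `missing_build_deps` is hypothetical and is not part of vcfnp.

```python
import importlib.util


def missing_build_deps(names=("numpy", "Cython")):
    """Return the subset of `names` that cannot be found on the import path."""
    # find_spec returns None for a top-level module that is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_build_deps()
if missing:
    print("install these first: " + ", ".join(missing))
else:
    print("build prerequisites found")
```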

Usage
-----

```python
import vcfnp
import numpy as np
import matplotlib.pyplot as plt

filename = '/path/to/my.vcf'

# load data from fixed fields (except INFO)
v = vcfnp.variants(filename).view(np.recarray)

# print some simple variant metrics
print('found %s variants (%s SNPs)' % (v.size, np.count_nonzero(v.is_snp)))
print('QUAL mean (std): %s (%s)' % (np.mean(v.QUAL), np.std(v.QUAL)))

# load data from INFO field
i = vcfnp.info(filename).view(np.recarray)

# plot a histogram of variant depth
fig = plt.figure(1)
ax = fig.add_subplot(111)
ax.hist(i.DP)
ax.set_title('DP histogram')
ax.set_xlabel('DP')
plt.show()

# load data from sample columns
c = vcfnp.calldata(filename).view(np.recarray)
c = vcfnp.view2d(c)

# print some simple genotype metrics
count_phased = np.count_nonzero(c.is_phased)
count_variant = np.count_nonzero(np.any(c.genotype > 0, axis=2))
count_missing = np.count_nonzero(~c.is_called)
print('calls (phased, variant, missing): %s (%s, %s, %s)'
      % (c.flatten().size, count_phased, count_variant, count_missing))

# plot a histogram of genotype quality
fig = plt.figure(2)
ax = fig.add_subplot(111)
ax.hist(c.GQ.flatten())
ax.set_title('GQ histogram')
ax.set_xlabel('GQ')
plt.show()
```
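
To make the genotype metrics above easier to follow without a VCF at hand, here is a minimal sketch that builds a small synthetic array by hand and applies the same expressions. It assumes the 2D call data has shape (variants, samples) with a `genotype` field holding one allele index per ploidy; the values, the dtype, and the use of -1 for a missing allele are illustrative assumptions, not output from vcfnp.

```python
import numpy as np

# Hypothetical stand-in for the 2D call data: 3 variants x 2 samples, diploid.
# Field names mirror those used in the usage example; all values are made up.
dtype = [('is_called', '?'), ('is_phased', '?'),
         ('genotype', 'i1', (2,)), ('GQ', 'u1')]
calls = np.zeros((3, 2), dtype=dtype)

# Allele indices per call; -1 marks a missing allele (an assumption here).
calls['genotype'] = [[[0, 0], [0, 1]],
                     [[1, 1], [-1, -1]],
                     [[0, 0], [0, 0]]]
calls['is_called'] = np.all(calls['genotype'] >= 0, axis=2)
calls['is_phased'] = [[False, True], [True, False], [False, False]]
calls['GQ'] = [[30, 45], [60, 0], [25, 50]]

c = calls.view(np.recarray)

# Same metrics as in the usage example.
count_phased = np.count_nonzero(c.is_phased)
# A call counts as variant if any of its alleles is non-reference (> 0).
count_variant = np.count_nonzero(np.any(c.genotype > 0, axis=2))
count_missing = np.count_nonzero(~c.is_called)
print('calls (phased, variant, missing): %s (%s, %s, %s)'
      % (c.size, count_phased, count_variant, count_missing))
```

The reduction over `axis=2` collapses the ploidy dimension, so the result has one boolean per (variant, sample) call.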

Acknowledgments
---------------

Based on the vcflib library (https://github.com/ekg/vcflib).
