tools to support genome and metagenome analysis

genome-grist - map Illumina metagenomes to GenBank genomes

License: 3-Clause BSD

  1. download a metagenome
  2. process it into trimmed reads, and make a sourmash signature
  3. search the sourmash signature with 'gather' against sourmash databases, e.g. all of GenBank
  4. download the matching genomes from GenBank
  5. map all metagenome reads to the genomes using minimap - map_reads and extract_mapped_reads
  6. extract matching reads iteratively in gather order, successively eliminating reads that matched previous gather matches - extract_gather
  7. map the “leftover” reads to the genomes - map_gather
  8. summarize all mapping results for comparison and graphing - summarize_gather
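
The iterative extraction in step 6 can be illustrated with a minimal sketch. This is not genome-grist's own code: the function name `assign_reads_by_gather_rank`, the genome labels, and the read names are all hypothetical, and the real workflow operates on mapping files rather than in-memory sets. The sketch only shows the idea that each read is credited to the first genome, in gather order, that it maps to.

```python
from typing import Dict, Iterable, Set


def assign_reads_by_gather_rank(
    gather_order: Iterable[str],
    mapped_reads: Dict[str, Set[str]],
) -> Dict[str, Set[str]]:
    """Assign each read to the first genome (in gather order) it maps to.

    Hypothetical helper for illustration only; not part of genome-grist.
    """
    claimed: Set[str] = set()              # reads already taken by earlier matches
    assigned: Dict[str, Set[str]] = {}
    for genome in gather_order:
        reads = mapped_reads.get(genome, set())
        new_reads = reads - claimed        # drop reads matched by previous genomes
        assigned[genome] = new_reads
        claimed |= new_reads
    return assigned


# Toy data (made up): genomes listed in gather rank order.
gather_order = ["genomeA", "genomeB", "genomeC"]
mapped_reads = {
    "genomeA": {"r1", "r2", "r3"},
    "genomeB": {"r2", "r3", "r4"},
    "genomeC": {"r4", "r5"},
}

assigned = assign_reads_by_gather_rank(gather_order, mapped_reads)
for genome in gather_order:
    print(genome, sorted(assigned[genome]))
```

In this toy case genomeB keeps only `r4`, because `r2` and `r3` were already claimed by genomeA, which ranks higher in the gather results. The reads left for each lower-ranked genome are the "leftover" reads that step 7 maps.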

Why the name grist?

The name follows the sourmash family of names (sourmash, wort, distillerycats, etc.).

NOT: https://en.wikipedia.org/wiki/Grist_(computing)

THIS: https://en.wikipedia.org/wiki/Grist

Leftover text

podar ref genomes

Snakefile based on @luizirber code

Genome URL generation code

download SRA code
