Miscellaneous Python-based bioinformatics utils

Project description

blindschleiche

A collection of bioinformatics / sequence utilities needed for my research, and hopefully useful for yours.

DOI: 10.5281/zenodo.10049825

Install

pip install blindschleiche
# or for the current main branch:
# pip install git+https://github.com/kdm9/blindschleiche.git

Usage

USAGE: blsl <subtool> [options...]


Where <subtool> is one of:

  telogrep:             Search contigs for known telomere repeats
  n50:                  Calculate N50 and total length of a set of contigs
  falen:                Tabulate the lengths of sequences in a FASTA file
  mask2bed:             The inverse of bedtools maskfasta: softmasked fasta -> unmasked fasta + mask.bed
  pansn-rename:         Add, remove, or modify PanSN-style prefixes to contig/chromosome names in references
  genigvjs:             Generate a simple IGV.js visualisation of some bioinf files.
  ildemux:              Demultiplex modern Illumina reads from read headers.
  ilsample:             Sample a fraction of read pairs from an interleaved fastq file
  regionbed:            Make a bed/region file of genome windows
  uniref-acc2taxid:     Make an NCBI-style acc2taxid.map file for a uniref fasta
  nstitch:              Combine R1 + R2 into single sequences, with an N in the middle
  gg2k:                 Summarise a table with GreenGenes-style lineages into a kraken-style report.
  equalbestblast:       Output only the best blast hits.
  tabcat:               Concatenate table (c/tsv) files, adding the filename as a column
  esearchandfetch:      Use the Entrez API to search for and download something. A CLI companion to the NCBI search box
  deepclust2fa:         Split a .faa by the clusters diamond deepclust finds
  farename:             Rename sequences in a fasta file sequentially
  gffcat:               Concatenate GFF3 files, respecting header lines and FASTA sections
  gffparse:             Format a GFF sanely
  gffcsqify:            Format a reasonably compliant GFF for use with bcftools csq
  gfftagsane:           Sanitise a messy gff attribute column to just simple tags 
  liftoff-gff3:         Obtain an actually-useful GFF3 from Liftoff by fixing basic GFF3 format errors
  ebiosra2rl2s:         INTERNAL: MPI Tübingen tool. Make a runlib-to-sample map table from ebio sra files
  galhist:              Make a summary histogram of git-annex-list output
  pairslash:            Add an old-style /1 /2 pair indicator to paired-end fastq files
  vcfstats:             Use bcftools to calculate various statistics, outputting an R-ready table
  vcfparallel:          Parallelise a bcf processing pipeline across regions
  shannon-entropy:      Calculate Shannon's entropy (in bits) at each column of one or more alignments
  fastasanitiser:       Sanitise fasta IDs to something sane, then back again
  tidyqc:               What if MultiQC was in the tidyverse? (and much worse)
  jsonl2csv:            Parse jsonlines into a C/TSV
  help:                 Print this help message


Use blsl <subtool> --help to get help about a specific tool
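
Two of the subtools above (n50 and shannon-entropy) compute statistics with standard definitions, so a short Python sketch may help show what kind of numbers to expect. This is not blsl's own code: the function names n50, column_entropy and alignment_entropies are hypothetical, and counting gap characters as ordinary symbols is an assumption that may differ from what the real tool does.

```python
import math
from collections import Counter


def n50(lengths):
    """Return (N50, total length) for an iterable of contig lengths.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half the total length.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length, total
    return 0, 0  # no contigs given


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols in one alignment column.

    Assumption: every character, including '-', counts as a symbol.
    """
    counts = Counter(column)
    n = sum(counts.values())
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def alignment_entropies(sequences):
    """Per-column entropy for equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*sequences)]
```

For example, n50([2, 3, 4, 5, 6, 7, 8, 9, 10]) gives an N50 of 8 and a total length of 54, and a column holding A, C, G and T once each has an entropy of 2 bits.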

Why the name Blindschleiche?

  1. They're awesome animals
  2. Their English name is Slow Worm, which is appropriate for this set of low-performance tools in Python.
  3. All tools implemented in Python must be named with a snake pun, and they're kinda a snake (not really, they're legless lizards)

Download files

Source Distribution

blindschleiche-0.3.0.tar.gz (32.9 kB)

Uploaded Source

Built Distribution

blindschleiche-0.3.0-py3-none-any.whl (46.8 kB)

Uploaded Python 3

File details

Details for the file blindschleiche-0.3.0.tar.gz.

File metadata

  • Download URL: blindschleiche-0.3.0.tar.gz
  • Size: 32.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/5.1.1 CPython/3.9.19

File hashes

Hashes for blindschleiche-0.3.0.tar.gz
Algorithm    Hash digest
SHA256       0c3b714035f969dade2a6e72759e207814cb1ab12de89626bc4036d6d217e2a9
MD5          ef51af4bb48f60566c2cbdff556deb3d
BLAKE2b-256  9871b412c0d3ba5cce4952bc25f403324535c6c3f078c2e1a88835ba308e38fa
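
If you download the source distribution by hand, a digest like the SHA256 above can be checked with Python's standard hashlib module. The following is a minimal sketch, not part of blindschleiche: the helper names sha256_of and matches_expected are hypothetical, and the path you pass in is assumed to be wherever you saved the file.

```python
import hashlib

# SHA256 digest listed above for blindschleiche-0.3.0.tar.gz
EXPECTED_SHA256 = "0c3b714035f969dade2a6e72759e207814cb1ab12de89626bc4036d6d217e2a9"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read until handle.read() returns an empty bytes object (EOF).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """True if the file's SHA256 digest equals the expected hex string."""
    return sha256_of(path) == expected.lower()
```

Calling matches_expected("blindschleiche-0.3.0.tar.gz") on the downloaded file should return True if the download is intact.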

File details

Details for the file blindschleiche-0.3.0-py3-none-any.whl.

File hashes

Hashes for blindschleiche-0.3.0-py3-none-any.whl
Algorithm    Hash digest
SHA256       b18cb419996e66e71d8900dc69bfb35342cb4e68e9250fe318c1ce4e6d1a248b
MD5          7052b04b2a09c7f11324e664ad8c596e
BLAKE2b-256  f16371821c5605c5bd91b6b1f00ab6d0022c4c2dd8530f66e36e1d1b73c73944
