SeqLike

SeqLike - flexible biological sequence objects in Python

Introduction

SeqLike provides a single object API that makes working with biological sequences in Python more ergonomic. It handles anything that looks like a sequence.

Built around the Biopython SeqRecord class, SeqLikes abstract over the semantics of molecular biology (DNA -> RNA -> AA) and data structures (strings, Seqs, SeqRecords, numerical encodings) to allow manipulation of a biological sequence at the level which is most computationally convenient.

Code samples and examples

Build data-type agnostic functions

def f(seq: SeqLikeType, *args):
    seq = SeqLike(seq, seq_type="nt").to_seqrecord()
    # ...
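
The pattern above, coercing whatever the caller passes into one canonical form at the top of the function, can be illustrated without SeqLike at all. The following is a minimal standard-library sketch of that idea only; `Record`, `as_sequence_string`, and `gc_content` are hypothetical names invented for this illustration and are not part of SeqLike or Biopython.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """Hypothetical stand-in for a record object that wraps a sequence."""
    seq: str
    name: str = "<unknown>"


def as_sequence_string(obj) -> str:
    """Coerce a str, bytes, list/tuple of letters, or Record to an uppercase str."""
    if isinstance(obj, Record):
        obj = obj.seq
    if isinstance(obj, bytes):
        obj = obj.decode("ascii")
    if isinstance(obj, (list, tuple)):
        obj = "".join(obj)
    if not isinstance(obj, str):
        raise TypeError(f"cannot interpret {type(obj).__name__} as a sequence")
    return obj.upper()


def gc_content(seq) -> float:
    """Fraction of G/C letters; accepts any input the coercer understands."""
    s = as_sequence_string(seq)
    return (s.count("G") + s.count("C")) / len(s) if s else 0.0
```

Because the coercion happens once at the entry point, `gc_content` can be called with a plain string, bytes, a list of letters, or a `Record` without the caller converting anything first.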

Streamline conversion to/from ML friendly representations

prediction = model(aaSeqLike('MSKGEELFTG').to_onehot())
new_seq = ntSeqLike(generative_model.sample(), alphabet="-ACGTUN")
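
To make the round trip concrete, here is a minimal pure-Python sketch of one-hot encoding and decoding against the `-ACGTUN` alphabet used above. It is an illustration of the representation, not SeqLike's implementation; the helper names `to_onehot` and `from_onehot` below are standalone functions written for this example.

```python
ALPHABET = "-ACGTUN"


def to_onehot(seq, alphabet=ALPHABET):
    """Encode each letter as a row with a single 1 at the letter's alphabet index."""
    index = {letter: i for i, letter in enumerate(alphabet)}
    rows = []
    for letter in seq:
        row = [0] * len(alphabet)
        row[index[letter]] = 1  # raises KeyError for letters outside the alphabet
        rows.append(row)
    return rows


def from_onehot(rows, alphabet=ALPHABET):
    """Decode rows by taking the position of the largest value in each row."""
    return "".join(
        alphabet[max(range(len(row)), key=row.__getitem__)] for row in rows
    )
```

Decoding by arg-max means the same function also accepts rows of per-letter scores, such as the output of a generative model, which is why an explicit alphabet has to accompany the array.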

Interconvert between AA and NT forms of a sequence

Back-translation is conveniently built-in!

s_nt = ntSeqLike("ATGTCTAAAGGTGAA")
s_nt[0:3] # ATG
s_nt.aa()[0:3] # MSK, nt->aa is well defined
s_nt.aa()[0:3].nt() # ATGTCTAAA, works because SeqLike now has both reps
s_nt[:-1].aa() # TypeError, len(s_nt[:-1]) is 14, not a multiple of 3

s_aa = aaSeqLike("MSKGE")
s_aa.nt() # AttributeError, aa->nt is undefined w/o codon map
s_aa = aaSeqLike(s_aa, codon_map=random_codon_map)
s_aa.nt() # now works, backtranslated to e.g. ATGTCTAAAGGTGAA
s_aa[:1].nt() # ATG, codon_map is maintained
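
The asymmetry in the comments above (nt->aa is well defined, aa->nt needs a codon map) follows from the genetic code being many-to-one. The sketch below shows this with the standard genetic code and the standard library only. It is not how SeqLike is implemented: `translate`, `first_codon_map`, and `back_translate` are hypothetical helpers, the "first codon per residue" map is an arbitrary choice made for the example, and the sketch raises `ValueError` rather than the `TypeError` mentioned above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nt):
    """Translate a nucleotide string codon by codon; length must be a multiple of 3."""
    nt = nt.upper().replace("U", "T")
    if len(nt) % 3:
        raise ValueError(f"length {len(nt)} is not a multiple of 3")
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


def first_codon_map():
    """Pick one codon per residue: the first one encountered in table order."""
    codon_map = {}
    for codon, aa in CODON_TABLE.items():
        codon_map.setdefault(aa, codon)
    return codon_map


def back_translate(aa, codon_map):
    """Back-translate by looking each residue up in the supplied codon map."""
    return "".join(codon_map[residue] for residue in aa.upper())
```

Any other map from residue to codon (for example one weighted by organism-specific codon usage) would be an equally valid argument to `back_translate`, which is why the choice has to be supplied explicitly.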

Easily plot multiple sequence alignments

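import pandas as pd
from Bio import SeqIO
from seqlike import aaSeqLike

# Read every record from the FASTA file into a list
seqs = list(SeqIO.parse("file.fasta", "fasta"))
df = pd.DataFrame(
    {
        "names": [s.name for s in seqs],
        "seqs": [aaSeqLike(s) for s in seqs],
    }
)
# Align the whole column, then plot the alignment
df["aligned"] = df["seqs"].seq.align()
df["aligned"].seq.plot()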

Flexibly build and parse numerical sequence representations

# Assume you have a dataframe with a column of 10 SeqLikes of length 90
df["seqs"].seq.to_onehot().shape # (10, 90, 23), padded if needed
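
The `(10, 90, 23)` shape reads as (number of sequences, padded length, alphabet size). A minimal pure-Python sketch of how padding produces a rectangular batch is shown below. It is an illustration under stated assumptions, not SeqLike's implementation: the 23-symbol alphabet is a hypothetical one chosen only to match the shape in the example, and `pad`, `batch_onehot`, and `shape` are helper names invented here.

```python
# Hypothetical 23-symbol alphabet: 20 standard residues plus "X", "*" and gap "-".
AA_ALPHABET = "ACDEFGHIKLMNPQRSTVWY" + "X*-"


def pad(seqs, gap="-"):
    """Right-pad every sequence with the gap letter to the longest length."""
    width = max(map(len, seqs), default=0)
    return [s.ljust(width, gap) for s in seqs]


def batch_onehot(seqs, alphabet=AA_ALPHABET):
    """One-hot encode a batch of sequences after padding them to equal length."""
    index = {letter: i for i, letter in enumerate(alphabet)}
    batch = []
    for seq in pad(seqs):
        rows = []
        for letter in seq:
            row = [0] * len(alphabet)
            row[index[letter]] = 1
            rows.append(row)
        batch.append(rows)
    return batch


def shape(batch):
    """Shape of a non-empty nested-list batch: (sequences, length, alphabet size)."""
    return (len(batch), len(batch[0]), len(batch[0][0]))


# Ten sequences, the longest of length 90; shorter ones are padded with "-".
sequences = ["M" * 90] + ["MSKGE"] * 9
encoded = batch_onehot(sequences)
```

Here `shape(encoded)` evaluates to `(10, 90, 23)`, and every padded position is encoded as the gap letter rather than left as an all-zero row.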

To see more in action, please check out the docs!

Getting Started

Install the library with pip or conda.

With pip

pip install seqlike

With conda

conda install -c conda-forge seqlike

Authors

Support

Contributors ✨

Thanks go to these wonderful people (emoji key):

  • Nasos Dousis 💻
  • andrew giessel 💻
  • Max Wall 💻 📖
  • Eric Ma 💻 📖
  • Mihir Metkar 🤔 💻
  • Marcus Caron 📖
  • pagpires 📖
  • Sugato Ray 🚇 🚧
  • Damien Farrell 💻
  • Farbod Mahmoudinobar 💻
  • Jacob Hayes 🚇

This project follows the all-contributors specification. Contributions of any kind welcome!
