Skip to main content

A python library to work with molecules. Built on top of RDKit.

Project description

Molecular Manipulation Made Easy


Binder PyPI Conda PyPI - Downloads Conda PyPI - Python Version license GitHub Repo stars GitHub Repo stars

Datamol is a python library to work with molecules. It's a layer built on top of RDKit and aims to be as light as possible.

  • 🐍 Simple pythonic API
  • ⚗️ RDKit first: all you manipulate are rdkit.Chem.Mol objects.
  • ✅ Manipulating molecules often rely on many options; Datamol provides good defaults by design.
  • 🧠 Performance matters: built-in efficient parallelization when possible with optional progress bar.
  • 🕹️ Modern IO: out-of-the-box support for remote paths using fsspec to read and write multiple formats (sdf, xlsx, csv, etc).

Try Online

Visit Binder and try Datamol online.

Documentation

Visit https://doc.datamol.io.

Installation

Use conda:

mamba install -c conda-forge datamol

Quick API Tour

import datamol as dm

# Common functions
mol = dm.to_mol("O=C(C)Oc1ccccc1C(=O)O", sanitize=True)
fp = dm.to_fp(mol)
selfies = dm.to_selfies(mol)
inchi = dm.to_inchi(mol)

# Standardize and sanitize
mol = dm.to_mol("O=C(C)Oc1ccccc1C(=O)O")
mol = dm.fix_mol(mol)
mol = dm.sanitize_mol(mol)
mol = dm.standardize_mol(mol)

# Dataframe manipulation
df = dm.data.freesolv()
mols = dm.from_df(df)

# 2D viz
legends = [dm.to_smiles(mol) for mol in mols[:10]]
dm.viz.to_image(mols[:10], legends=legends)

# Generate conformers
smiles = "O=C(C)Oc1ccccc1C(=O)O"
mol = dm.to_mol(smiles)
mol_with_conformers = dm.conformers.generate(mol)

# 3D viz (using nglview)
dm.viz.conformers(mol, n_confs=10)

# Compute SASA from conformers
sasa = dm.conformers.sasa(mol_with_conformers)

# Easy IO
mols = dm.read_sdf("s3://my-awesome-data-lake/smiles.sdf", as_df=False)
dm.to_sdf(mols, "gs://data-bucket/smiles.sdf")

Compatibilities

Version compatibilities are an essential topic for production-software stacks. We are cautious about documenting compatibility between datamol, python and rdkit.

datamol python rdkit
0.3 >=3.7,<=3.9 >=2020.09,<=2021.03

CI Status

master
Lib build & Testing GitHub Workflow Status
Code Sanity (linting and type analysis) GitHub Workflow Status
Documentation Build GitHub Workflow Status

Changelogs

See the latest changelogs at CHANGELOG.rst.

License

Under the Apache-2.0 license. See LICENSE.

Authors

See AUTHORS.rst.

Release history Release notifications | RSS feed

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datamol-0.3.7.tar.gz (82.0 kB view details)

Uploaded Source

File details

Details for the file datamol-0.3.7.tar.gz.

File metadata

  • Download URL: datamol-0.3.7.tar.gz
  • Upload date:
  • Size: 82.0 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/3.10.1 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.60.0 CPython/3.9.4

File hashes

Hashes for datamol-0.3.7.tar.gz
Algorithm Hash digest
SHA256 4cbe228d67d3d3e4aac5825682b2c2e7a6146e99c68984de740384a0d131a744
MD5 d0b0f473521b4b7e9cbc7ae72b21e362
BLAKE2b-256 27f7f76dcddd78e8a4e024f06ca6ad02c43de6aeb44d4422387194d246bad04d

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page