
A Python library to work with molecules. Built on top of RDKit.


datamol - molecular processing made easy

Docs | Homepage



Datamol is a Python library for working with molecules. It is a thin layer built on top of RDKit and aims to be as lightweight as possible.

  • 🐍 Simple pythonic API
  • ⚗️ RDKit first: everything you manipulate is an rdkit.Chem.Mol object.
  • ✅ Manipulating molecules often involves many options; Datamol provides good defaults by design.
  • 🧠 Performance matters: built-in efficient parallelization where possible, with an optional progress bar.
  • 🕹️ Modern IO: out-of-the-box support for remote paths using fsspec to read and write multiple formats (sdf, xlsx, csv, etc.).

Try Online

Visit Binder and try Datamol online.

Documentation

Visit https://docs.datamol.io.

Installation

Use mamba (or conda) with the conda-forge channel:

mamba install -c conda-forge datamol

Quick API Tour

import datamol as dm

# Common functions
mol = dm.to_mol("O=C(C)Oc1ccccc1C(=O)O", sanitize=True)
fp = dm.to_fp(mol)
selfies = dm.to_selfies(mol)
inchi = dm.to_inchi(mol)

# Standardize and sanitize
mol = dm.to_mol("O=C(C)Oc1ccccc1C(=O)O")
mol = dm.fix_mol(mol)
mol = dm.sanitize_mol(mol)
mol = dm.standardize_mol(mol)

# Dataframe manipulation
df = dm.data.freesolv()
mols = dm.from_df(df)

# 2D viz
legends = [dm.to_smiles(mol) for mol in mols[:10]]
dm.viz.to_image(mols[:10], legends=legends)

# Generate conformers
smiles = "O=C(C)Oc1ccccc1C(=O)O"
mol = dm.to_mol(smiles)
mol_with_conformers = dm.conformers.generate(mol)

# 3D viz (using nglview); pass the molecule that actually holds the conformers
dm.viz.conformers(mol_with_conformers, n_confs=10)

# Compute SASA from conformers
sasa = dm.conformers.sasa(mol_with_conformers)

# Easy IO
mols = dm.read_sdf("s3://my-awesome-data-lake/smiles.sdf", as_df=False)
dm.to_sdf(mols, "gs://data-bucket/smiles.sdf")

How to cite

Please cite Datamol if you use it in your research: DOI.

Compatibilities

Version compatibility is an essential topic for production software stacks. We take care to document the compatibility between Datamol, Python and RDKit.

The table below lists the Python and RDKit versions against which each minor version of Datamol has been tested during its whole lifecycle. This does not mean that other combinations do not work, only that they are not tested.

datamol   python               rdkit
0.12.x    [3.10, 3.11]         [2023.03, 2023.09]
0.11.x    [3.9, 3.10, 3.11]    [2022.09, 2023.03]
0.10.x    [3.9, 3.10, 3.11]    [2022.03, 2022.09]
0.9.x     [3.9, 3.10, 3.11]    [2022.03, 2022.09]
0.8.x     [3.8, 3.9, 3.10]     [2021.09, 2022.03, 2022.09]
0.7.x     [3.8, 3.9]           [2021.09, 2022.03]
0.6.x     [3.8, 3.9]           [2021.09]
0.5.x     [3.8, 3.9]           [2021.03, 2021.09]
0.4.x     [3.8, 3.9]           [2020.09, 2021.03]
0.3.x     [3.8, 3.9]           [2020.09, 2021.03]
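
The table above can be encoded as a small lookup to check whether a given combination was tested. This is only an illustrative sketch: the `TESTED` dictionary copies three rows from the table, and `minor` and `is_tested` are hypothetical helpers, not part of Datamol.

```python
# Rows copied from the compatibility table above (a subset, for illustration).
TESTED = {
    "0.12": {"python": {"3.10", "3.11"}, "rdkit": {"2023.03", "2023.09"}},
    "0.11": {"python": {"3.9", "3.10", "3.11"}, "rdkit": {"2022.09", "2023.03"}},
    "0.10": {"python": {"3.9", "3.10", "3.11"}, "rdkit": {"2022.03", "2022.09"}},
}


def minor(version: str) -> str:
    """Reduce a version such as '0.12.3' to its minor series '0.12'."""
    return ".".join(version.split(".")[:2])


def is_tested(datamol: str, python: str, rdkit: str) -> bool:
    """Hypothetical helper: True if the combination appears in TESTED."""
    row = TESTED.get(minor(datamol))
    if row is None:
        return False
    return minor(python) in row["python"] and minor(rdkit) in row["rdkit"]


print(is_tested("0.12.3", "3.11.7", "2023.09.4"))  # True
print(is_tested("0.12.3", "3.9.18", "2023.09.4"))  # False
```

A `False` result only means the combination is not listed as tested, not that it is known to be broken.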

CI Status

The CI runs tests and performs code quality checks for the following combinations:

  • The three major platforms: Windows, OSX and Linux.
  • The two latest Python versions.
  • The two latest RDKit versions.

Check                                      Workflow (main branch)
Lib build & Testing                        test
Code Sanity (linting and type analysis)    code-check
Documentation Build                        doc
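
To make the size of the test matrix concrete, the sketch below enumerates the combinations described in this section. It assumes the CI runs the full cross product of platforms and versions, and it takes the two latest Python and RDKit versions from the 0.12.x row of the compatibility table; the actual workflow configuration may differ.

```python
from itertools import product

platforms = ["Windows", "OSX", "Linux"]
# Assumption: the "two latest" versions, taken from the 0.12.x row of the table.
python_versions = ["3.10", "3.11"]
rdkit_versions = ["2023.03", "2023.09"]

# Full cross product: 3 platforms x 2 Python versions x 2 RDKit versions.
matrix = list(product(platforms, python_versions, rdkit_versions))
print(len(matrix))  # 12

for platform, python, rdkit in matrix[:3]:
    print(f"{platform}: python={python}, rdkit={rdkit}")
```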

License

Under the Apache-2.0 license. See LICENSE.


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

datamol-0.12.3.tar.gz (3.9 MB)

Uploaded Source

Built Distribution

datamol-0.12.3-py3-none-any.whl (494.7 kB)

Uploaded Python 3

File details

Details for the file datamol-0.12.3.tar.gz.

File metadata

  • Download URL: datamol-0.12.3.tar.gz
  • Upload date:
  • Size: 3.9 MB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for datamol-0.12.3.tar.gz
Algorithm Hash digest
SHA256 3411403fb889d381f91f75d5ea074ce9e9ac789aac2424abb547c6066193b38b
MD5 2f9d2fec58c7808628dc08d5e11e413b
BLAKE2b-256 4cefdcca7fd4fd7b2717e9506e15a74b745a8bd6e1e46d25dfa5c5177012138a


File details

Details for the file datamol-0.12.3-py3-none-any.whl.

File metadata

  • Download URL: datamol-0.12.3-py3-none-any.whl
  • Upload date:
  • Size: 494.7 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/4.0.2 CPython/3.11.7

File hashes

Hashes for datamol-0.12.3-py3-none-any.whl
Algorithm Hash digest
SHA256 aacbb273729cfa4cd887baa56b61bd7dcf1f6b77facccb2b416b40b2dc8c5b78
MD5 848d4dcd72f8f4ca208ff208fcda9a24
BLAKE2b-256 a0523e12decb3486bb90f0fffea6bba9a6f50d0ea9d3eeb08c86450e29338099
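
A downloaded file can be checked against the SHA256 digests listed above using only the standard library. This is a minimal sketch: `sha256_of` and `verify` are hypothetical helper names, and the file is assumed to have been saved locally under its original name.

```python
import hashlib
from pathlib import Path

# SHA256 digests copied from the file listings above.
EXPECTED = {
    "datamol-0.12.3.tar.gz": (
        "3411403fb889d381f91f75d5ea074ce9e9ac789aac2424abb547c6066193b38b"
    ),
    "datamol-0.12.3-py3-none-any.whl": (
        "aacbb273729cfa4cd887baa56b61bd7dcf1f6b77facccb2b416b40b2dc8c5b78"
    ),
}


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str) -> bool:
    """Hypothetical helper: compare a local file with EXPECTED by file name."""
    return sha256_of(path) == EXPECTED[Path(path).name]
```

For example, `verify("datamol-0.12.3.tar.gz")` returns `True` only if the local copy matches the published digest.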

