A Python library to work with molecules. Built on top of RDKit.
Molecular Manipulation Made Easy
Datamol is a Python library for working with molecules. It is a thin layer built on top of RDKit and aims to be as lightweight as possible.
- 🐍 Simple pythonic API
- ⚗️ RDKit first: all you manipulate are rdkit.Chem.Mol objects.
- ✅ Manipulating molecules often relies on many options; Datamol provides good defaults by design.
- 🧠 Performance matters: built-in efficient parallelization when possible with optional progress bar.
- 🕹️ Modern IO: out-of-the-box support for remote paths using fsspec to read and write multiple formats (sdf, xlsx, csv, etc.).
Documentation
Visit https://doc.datamol.io.
Installation
Install from conda-forge with mamba (conda accepts the same arguments):
mamba install -c conda-forge datamol
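After installing, a quick sanity check can confirm that both packages are visible to the active environment. The following is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of datamol.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if the import system can locate the top-level `package`."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # datamol depends on RDKit, so both should be reported as found.
    for name in ("rdkit", "datamol"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If either package is reported as missing, the environment used to run the script is probably not the one the install command targeted.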
Quick API Tour
import datamol as dm
# Common functions
mol = dm.to_mol("O=C(C)Oc1ccccc1C(=O)O", sanitize=True)
fp = dm.to_fp(mol)
selfies = dm.to_selfies(mol)
inchi = dm.to_inchi(mol)
# Standardize and sanitize
mol = dm.to_mol("O=C(C)Oc1ccccc1C(=O)O")
mol = dm.fix_mol(mol)
mol = dm.sanitize_mol(mol)
mol = dm.standardize_mol(mol)
# Dataframe manipulation
df = dm.data.freesolv()
mols = dm.from_df(df)
# 2D viz
legends = [dm.to_smiles(mol) for mol in mols[:10]]
dm.viz.to_image(mols[:10], legends=legends)
# Generate conformers
smiles = "O=C(C)Oc1ccccc1C(=O)O"
mol = dm.to_mol(smiles)
mol_with_conformers = dm.conformers.generate(mol)
# 3D viz (using nglview)
dm.viz.conformers(mol_with_conformers, n_confs=10)
# Compute SASA from conformers
sasa = dm.conformers.sasa(mol_with_conformers)
# Easy IO
mols = dm.read_sdf("s3://my-awesome-data-lake/smiles.sdf", as_df=False)
dm.to_sdf(mols, "gs://data-bucket/smiles.sdf")
Compatibilities
Version compatibility is essential for production software stacks. We carefully document compatibility between datamol, python, and rdkit.
datamol | python | rdkit |
---|---|---|
0.3 | >=3.7,<=3.9 | >=2020.09,<=2021.03 |
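The bounds in the table can be checked programmatically before deploying. The sketch below is one possible approach, not something datamol provides: it compares only the major and minor components of each version, which is an assumption about how the bounds should be read, and the helper names `parse_major_minor` and `in_range` are hypothetical.

```python
import sys
from typing import Tuple

Version = Tuple[int, int]
Bounds = Tuple[Version, Version]

# Inclusive bounds copied from the datamol 0.3 row of the table.
PYTHON_RANGE: Bounds = ((3, 7), (3, 9))
RDKIT_RANGE: Bounds = ((2020, 9), (2021, 3))


def parse_major_minor(version: str) -> Version:
    """Parse a string such as '2021.03.4' or '3.8.10' into (major, minor)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def in_range(version: Version, bounds: Bounds) -> bool:
    """Return True if `version` lies within the inclusive `bounds`."""
    low, high = bounds
    return low <= version <= high


if __name__ == "__main__":
    # Check the running interpreter against the documented Python range.
    current = (sys.version_info.major, sys.version_info.minor)
    print("python supported:", in_range(current, PYTHON_RANGE))
```

An RDKit version string can be checked the same way, for example `in_range(parse_major_minor("2020.09.1"), RDKIT_RANGE)`.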
Changelogs
See the latest changelogs at CHANGELOG.rst.
License
Under the Apache-2.0 license. See LICENSE.
Authors
See AUTHORS.rst.
Source Distribution
datamol-0.4.0.tar.gz (87.2 kB)
File details
Details for the file datamol-0.4.0.tar.gz.
File metadata
- Download URL: datamol-0.4.0.tar.gz
- Upload date:
- Size: 87.2 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/3.4.1 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.8.10
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | 44976010b956bc4989bcbd1e363d14fabf9028ba92f08048864b0309368751f9 |
MD5 | 848401f6f0ba2c112f6bdc03adb0b806 |
BLAKE2b-256 | 62e6f36cc5ef17d693944ffab08f704aa1c3203671dead053576fbad5aab6ca3 |