LaminDB: Manage R&D data & analyses.
Curate, store, track, query, integrate, and learn from biological data.
LaminDB is an open-source data lake for R&D in biology. It manages indexed object storage (local directories, S3, GCP) with a mapped SQL database (SQLite, Postgres, and soon, BigQuery).
You can create distributed LaminDB instances at any scale: get started on your laptop, deploy in the cloud, or work with a mesh of instances for different teams and purposes.
Public beta: Currently only recommended for collaborators as we still make breaking changes.
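To make the idea of object storage with a mapped SQL database concrete, here is a minimal sketch that uses only the Python standard library. It is not LaminDB's implementation: the table layout, the helpers `init_index`, `add_object`, and `load_object`, and the hash-derived storage key are all assumptions made for illustration.

```python
import hashlib
import sqlite3
from pathlib import Path


def init_index(conn: sqlite3.Connection) -> None:
    """Create the (hypothetical) table that indexes stored objects."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dobject ("
        "id TEXT PRIMARY KEY, name TEXT NOT NULL, suffix TEXT, size INTEGER)"
    )


def add_object(
    conn: sqlite3.Connection,
    storage_root: Path,
    name: str,
    payload: bytes,
    suffix: str = ".bin",
) -> str:
    """Write the payload to storage and record a matching row in the database."""
    # Assumption: the storage key is derived from the content hash.
    object_id = hashlib.sha256(payload).hexdigest()[:20]
    storage_root.mkdir(parents=True, exist_ok=True)
    (storage_root / f"{object_id}{suffix}").write_bytes(payload)
    conn.execute(
        "INSERT OR REPLACE INTO dobject (id, name, suffix, size) VALUES (?, ?, ?, ?)",
        (object_id, name, suffix, len(payload)),
    )
    conn.commit()
    return object_id


def load_object(conn: sqlite3.Connection, storage_root: Path, name: str) -> bytes:
    """Look the object up by name in SQL, then read its bytes from storage."""
    row = conn.execute(
        "SELECT id, suffix FROM dobject WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return (storage_root / f"{row[0]}{row[1]}").read_bytes()
```

The point of the split is that the SQL side answers "what exists and what is it called", while the storage side only holds bytes under opaque keys. In this sketch, swapping the local directory for a cloud bucket would change `add_object` and `load_object` but not the queries.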
Installation
LaminDB is a Python package available for Python versions 3.8+.
pip install lamindb
Import
In your Python script, import LaminDB as:
import lamindb as ln
Quick setup
Quick setup on the command line:
- Sign up via
lamin signup <email>
- Log in via
lamin login <handle>
- Set up an instance via
lamin init --storage <storage> --schema <schema_modules>
:::{dropdown} Example code
lamin signup testuser1@lamin.ai
lamin login testuser1
lamin init --storage ./mydata --schema bionty,wetlab
:::
See {doc}`/guide/setup` for more.
Track & query data
Track data source & data
::::{tab-set}
:::{tab-item} Within a notebook
import lamindb as ln
import pandas as pd

ln.nb.header()  # data source is created and linked
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
# create a data object with SQL metadata record
dobject = ln.DObject(df, name="My dataframe")
# upload the data file to the configured storage
# and commit a DObject record to the SQL database
ln.add(dobject)
:::
:::{tab-item} Within a pipeline
import lamindb as ln
import lamindb.schema as lns  # assumption: `lns` aliases the schema module
import pandas as pd

# create a pipeline record
pipeline = lns.Pipeline(name="my pipeline", version="1")
# create a run from the above pipeline as the data source
run = lns.Run(pipeline=pipeline, name="my run")
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
# create a data object with SQL metadata record
dobject = ln.DObject(df, name="My dataframe", source=run)
# upload the data file to the configured storage
# and commit a DObject record to the SQL database
ln.add(dobject)
:::
::::
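The pipeline example links each data object to the run that produced it, and each run to its pipeline. A rough way to picture that chain is as foreign keys in a relational schema. The sketch below is an assumption for illustration only: the table and column names and the helpers `record_provenance` and `trace_source` are hypothetical and do not reflect LaminDB's actual schema.

```python
import sqlite3

# Hypothetical three-table layout: pipeline -> run -> dobject.
SCHEMA = """
CREATE TABLE pipeline (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    version TEXT NOT NULL
);
CREATE TABLE run (
    id INTEGER PRIMARY KEY,
    name TEXT,
    pipeline_id INTEGER NOT NULL REFERENCES pipeline(id)
);
CREATE TABLE dobject (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    run_id INTEGER NOT NULL REFERENCES run(id)
);
"""


def record_provenance(conn, pipeline_name, version, run_name, dobject_name):
    """Insert a pipeline, a run of it, and a data object produced by that run."""
    cur = conn.execute(
        "INSERT INTO pipeline (name, version) VALUES (?, ?)",
        (pipeline_name, version),
    )
    pipeline_id = cur.lastrowid
    cur = conn.execute(
        "INSERT INTO run (name, pipeline_id) VALUES (?, ?)",
        (run_name, pipeline_id),
    )
    run_id = cur.lastrowid
    cur = conn.execute(
        "INSERT INTO dobject (name, run_id) VALUES (?, ?)",
        (dobject_name, run_id),
    )
    conn.commit()
    return cur.lastrowid


def trace_source(conn, dobject_name):
    """Return (pipeline name, pipeline version, run name) for a data object."""
    return conn.execute(
        "SELECT pipeline.name, pipeline.version, run.name "
        "FROM dobject "
        "JOIN run ON dobject.run_id = run.id "
        "JOIN pipeline ON run.pipeline_id = pipeline.id "
        "WHERE dobject.name = ?",
        (dobject_name,),
    ).fetchone()
```

With a layout like this, "which pipeline version produced this dataframe?" becomes a single join rather than something reconstructed from file names.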
Query & load data
dobject = ln.select(ln.DObject, name="My dataframe").one()
df = dobject.load()
See {doc}`/guide/ingest` for more.
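The query above ends in `.one()`. Assuming this follows the common SQLAlchemy-style convention (exactly one match is returned, anything else raises), its behavior can be sketched with the standard library alone. The exception classes and the `one` and `select_one` helpers below are hypothetical stand-ins, not LaminDB's API.

```python
import sqlite3


class NoResultFound(LookupError):
    """Raised when a query that must match exactly one row matches none."""


class MultipleResultsFound(LookupError):
    """Raised when a query that must match exactly one row matches several."""


def one(rows):
    """Return the single element of rows, or raise if there are 0 or >1."""
    rows = list(rows)
    if not rows:
        raise NoResultFound("no row matched the query")
    if len(rows) > 1:
        raise MultipleResultsFound(f"{len(rows)} rows matched the query")
    return rows[0]


def select_one(conn: sqlite3.Connection, name: str):
    """Select the data object record with the given name, requiring uniqueness."""
    cursor = conn.execute("SELECT id, name FROM dobject WHERE name = ?", (name,))
    return one(cursor.fetchall())
```

Failing loudly on zero or multiple matches is useful when a name is expected to identify one dataset: a silent "first match" could load the wrong file.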
Track biological features
import bionty as bt
import lamindb as ln

# A sample single-cell RNA-seq dataset
adata = ln.dev.datasets.anndata_mouse_sc_lymph_node()
# Start to track genes mapped to a Bionty Entity
# - Ensembl ID as the standardized ID
# - mouse as the species
reference = bt.Gene(id=bt.gene_id.ensembl_gene_id, species=bt.Species().lookup.mouse)
# Create a data object with features
dobject = ln.DObject(adata, name="Mouse Lymph Node scRNA-seq", features_ref=reference)
# upload the data file to the configured storage
# and commit a DObject record to the SQL database
ln.add(dobject)
See {doc}`/guide/link-features` for more.
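Conceptually, tracking features against a reference means checking each feature identifier in a dataset against a standardized table and noting which ones map. The sketch below illustrates that idea only: `link_features` is a hypothetical helper, and the `REFERENCE` table pairs Ensembl-style IDs with placeholder labels rather than real gene annotations. It does not show how LaminDB or Bionty perform the mapping.

```python
from typing import Dict, Iterable, List, Tuple

# Placeholder reference: Ensembl-style mouse gene IDs -> illustrative labels.
REFERENCE: Dict[str, str] = {
    "ENSMUSG00000000001": "gene_a",
    "ENSMUSG00000000003": "gene_b",
}


def link_features(
    feature_ids: Iterable[str], reference: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Split feature IDs into those found in the reference and those not found."""
    mapped: Dict[str, str] = {}
    unmapped: List[str] = []
    for feature_id in feature_ids:
        # Ensembl IDs may carry a version suffix such as ".4"; compare without it.
        key = feature_id.split(".", 1)[0]
        if key in reference:
            mapped[feature_id] = reference[key]
        else:
            unmapped.append(feature_id)
    return mapped, unmapped


if __name__ == "__main__":
    mapped, unmapped = link_features(
        ["ENSMUSG00000000001.4", "ENSMUSG00000000003", "not_a_gene"], REFERENCE
    )
    print(f"{len(mapped)} mapped, {len(unmapped)} unmapped")
```

Keeping the unmapped identifiers in a separate list makes it easy to review them before committing a dataset, instead of dropping them silently.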
- Each page in this guide is a Jupyter Notebook, which you can download [here](https://github.com/laminlabs/lamindb/tree/main/docs/guide).
- You can run these notebooks in hosted versions of JupyterLab, e.g., [Saturn Cloud](https://github.com/laminlabs/run-lamin-on-saturn), Google Vertex AI, and others.
- We recommend using [JupyterLab](https://jupyterlab.readthedocs.io/) for the best notebook-tracking experience.
📬 Reach out to report issues or to learn about the data modules that connect your assays, pipelines & workflows within our data platform enterprise plan.
File details
Details for the file lamindb-0.31.1.tar.gz.
File metadata
- Download URL: lamindb-0.31.1.tar.gz
- Upload date:
- Size: 85.8 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: python-requests/2.28.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 0c5a6c13d7012f6333b285ba7c6455f844ea27a2edc8c681ce4ea6c5058e63ff
MD5 | 32f02069338c4b603ed4b64bb23db69c
BLAKE2b-256 | fae02657d81e54af7398c25e19dd45c9ca21f8d97b58820cbcc82df1800c5d80
File details
Details for the file lamindb-0.31.1-py2.py3-none-any.whl.
File metadata
- Download URL: lamindb-0.31.1-py2.py3-none-any.whl
- Upload date:
- Size: 47.2 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: python-requests/2.28.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | f2bf002efdcfb588341d8ae039368cf2d022e61716b12899a14e664a4732faac
MD5 | 5b46fb7c2f8f8626793973824d7e2547
BLAKE2b-256 | ca1b45c76be49f3111502b5387bc30fa271c9679fb31fc5f0216d3d9252c4af0