Open-source data platform for biology.
LaminDB - Open-source data platform for biology
Public beta: The API is close to stable, but some breaking changes might still occur.
Here is an [intro video](https://www.youtube.com/watch?v=DtJ9KnqWA8Q) to guide beta testing.
LaminDB is a Python library to manage data & analyses related to biology:
- Query, validate & link data batches using biological registries & ontologies.
- Track & query data lineage across pipelines, notebooks & app uploads.
- Manage features & labels schema-less or schema-full.
- Collaborate across a mesh of LaminDB instances.
If you want a UI: LaminApp is built on LaminDB. If LaminDB ~ git, LaminApp ~ GitHub.
(Enterprise features for LaminApp, support, integration tests & schemas are available on a paid plan - in your or our infrastructure.)
Quickstart
Run pip install 'lamindb[jupyter]' and lamin signup <email> on the command line.
Init a LaminDB instance with local or cloud default storage like you'd init a git repository:
$ lamin init --storage ./mydata # or s3://my-bucket, gs://my-bucket
Validate & register a DataFrame that comes with basic metadata:
import lamindb as ln
import pandas as pd
ln.track() # track run context in a notebook
# save target feature names in Feature registry
features = ln.Feature.from_values(["feature1", "feature2", "perturbation"])
ln.save(features)
# receive a batch of data
df = pd.DataFrame(
    {"feature1": [1, 2, 3], "feature2": [3, 4, 5], "perturbation": ["pert1", "pert2", "pert1"]}
)
# validate features & create a Dataset object
dataset = ln.Dataset.from_df(df, name="Dataset 1")
dataset.save() # save/upload dataset
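To make the validation step more concrete, here is a minimal, library-free sketch of the idea: compare the column names of an incoming batch against a set of registered feature names. It does not use LaminDB and is not how LaminDB implements validation. The names `registered_features` and `split_columns`, the extra `batch_id` column, and the plain-dict batch are hypothetical stand-ins for the Feature registry and the DataFrame.

```python
# Conceptual sketch only: plain Python, not LaminDB's implementation.
# `registered_features` stands in for names saved in the Feature registry.
registered_features = {"feature1", "feature2", "perturbation"}

# A batch in column-oriented form, like the dict passed to pd.DataFrame,
# plus one hypothetical column that was never registered.
batch = {
    "feature1": [1, 2, 3],
    "feature2": [3, 4, 5],
    "perturbation": ["pert1", "pert2", "pert1"],
    "batch_id": ["a", "a", "b"],
}


def split_columns(columns, registry):
    """Return (validated, unvalidated) column names, preserving order."""
    validated = [c for c in columns if c in registry]
    unvalidated = [c for c in columns if c not in registry]
    return validated, unvalidated


validated, unvalidated = split_columns(batch, registered_features)
print("validated:", validated)
print("not in registry:", unvalidated)
```

Registering the target features first, as in the snippet above, is what gives a later batch something to be checked against.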
Search, query, and load a DataFrame:
ln.Dataset.search("dataset 1") # run a search
# run a query (under the hood, you have the full power of SQL to query)
dataset = ln.Dataset.filter(name__contains="set 1").one()
df = dataset.load()
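Since the query comment mentions SQL under the hood, the following sketch shows roughly what a substring filter followed by a single-result accessor amounts to, using only Python's built-in sqlite3 module. The `dataset` table, its columns, the sample rows, and the `one` helper are hypothetical and do not reflect LaminDB's actual schema or code. The sketch also assumes that `.one()` means "exactly one match, otherwise an error".

```python
import sqlite3

# Hypothetical stand-in for a dataset registry; the real LaminDB schema differs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO dataset (name) VALUES (?)",
    [("Dataset 1",), ("Dataset 2",), ("Other table",)],
)


def one(rows):
    """Return the single row, or raise if there are zero or several."""
    if len(rows) != 1:
        raise ValueError(f"expected exactly one result, got {len(rows)}")
    return rows[0]


# A substring match such as name__contains="set 1" is roughly a LIKE pattern.
rows = conn.execute(
    "SELECT id, name FROM dataset WHERE name LIKE ?", ("%set 1%",)
).fetchall()
match = one(rows)
print(match)
```

Requiring exactly one row makes an ambiguous filter fail loudly instead of silently returning an arbitrary dataset.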
Documentation
Read the docs.
Download files
Source Distribution: lamindb-0.51.1.tar.gz (253.4 kB)
Built Distribution: lamindb-0.51.1-py3-none-any.whl (84.1 kB)