
LINDI - Linked Data Interface


⚠️ Please note: LINDI is currently under development and should not yet be used in practice.

HDF5 as Zarr as JSON for NWB

LINDI provides a JSON representation of NWB (Neurodata Without Borders) data where the large data chunks are stored separately from the main metadata. This enables efficient storage, composition, and sharing of NWB files on cloud systems such as DANDI without duplicating the large data blobs.
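Concretely, a .lindi.json file is a reference file system: small metadata is stored inline as JSON, while each large chunk is recorded as a pointer into an external file. A simplified sketch of that layout, written as a Python dict (the kerchunk-style keys and values here are an illustration, not the authoritative spec):

# Illustrative layout of a .lindi.json reference file system
# (kerchunk-style; the exact keys shown here are an assumption)
rfs = {
    "refs": {
        # Zarr metadata is stored inline as JSON...
        ".zgroup": {"zarr_format": 2},
        # ...while each large chunk is a pointer into an external file:
        # [URL, byte offset, byte length]
        "acquisition/data/0.0": ["https://example.org/data.h5", 102400, 65536],
    }
}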

LINDI provides:

  • A specification for representing arbitrary HDF5 files as Zarr stores. This handles scalar datasets, references, soft links, and compound data types for datasets.
  • A Zarr wrapper for remote or local HDF5 files (LindiH5ZarrStore).
  • A mechanism for creating .lindi.json (or .nwb.lindi.json) files that reference data chunks in external files, inspired by kerchunk.
  • An h5py-like interface for reading from and writing to these data sources that can be used with pynwb.
  • A mechanism for uploading and downloading these data sources to and from cloud storage, including DANDI.

This project was inspired by kerchunk and hdmf-zarr, and depends on zarr, h5py, and numcodecs.
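For a quick taste of the Zarr wrapper, a LindiH5ZarrStore can wrap a remote HDF5 file and be browsed with zarr directly. A minimal sketch, where the from_file constructor is an assumption mirroring the from_hdf5_file pattern used in the examples below:

import zarr
import lindi

# Wrap a remote HDF5 file as a read-only Zarr store
# (LindiH5ZarrStore.from_file is assumed here)
h5_url = "https://api.dandiarchive.org/api/assets/11f512ba-5bcf-4230-a8cb-dc8d36db38cb/download/"
store = lindi.LindiH5ZarrStore.from_file(h5_url)

# Browse the store with zarr
root = zarr.open(store, mode="r")
print(root.tree())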

Installation

pip install lindi

Or, from a local clone of the source repository:

cd lindi
pip install -e .

Use cases

  • Lazy-load a remote NWB/HDF5 file for efficient access to metadata and data.
  • Represent a remote NWB/HDF5 file as a .nwb.lindi.json file.
  • Read a local or remote .nwb.lindi.json file using pynwb or other tools.
  • Edit a .nwb.lindi.json file using pynwb or other tools.
  • Add datasets to a .nwb.lindi.json file using a local staging area.
  • Upload a .nwb.lindi.json file with staged datasets to a cloud storage service such as DANDI.

Lazy-load a remote NWB/HDF5 file for efficient access to metadata and data

import pynwb
import lindi

# URL of the remote NWB file
h5_url = "https://api.dandiarchive.org/api/assets/11f512ba-5bcf-4230-a8cb-dc8d36db38cb/download/"

# Set up a local cache
local_cache = lindi.LocalCache(cache_dir='lindi_cache')

# Create the h5py-like client
client = lindi.LindiH5pyFile.from_hdf5_file(h5_url, local_cache=local_cache)

# Open using pynwb
with pynwb.NWBHDF5IO(file=client, mode="r") as io:
    nwbfile = io.read()
    print(nwbfile)

# The downloaded data will be cached locally, so subsequent reads will be faster
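Because the client is h5py-like, you can also bypass pynwb and index datasets directly; only the chunks covering the requested slice are downloaded. A short sketch continuing from the client above (the dataset path is illustrative for a typical NWB file):

# Direct, h5py-style access (the path below is illustrative)
dset = client['acquisition/ElectricalSeries/data']
print(dset.shape, dset.dtype)
first_rows = dset[:100]  # fetches only the chunks covering these rows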

Represent a remote NWB/HDF5 file as a .nwb.lindi.json file

import lindi

# URL of the remote NWB file
h5_url = "https://api.dandiarchive.org/api/assets/11f512ba-5bcf-4230-a8cb-dc8d36db38cb/download/"

# Create the h5py-like client
client = lindi.LindiH5pyFile.from_hdf5_file(h5_url)

client.write_lindi_file('example.lindi.json')

# See the next example for how to read this file

Read a local or remote .nwb.lindi.json file using pynwb or other tools

import pynwb
import lindi

# URL of the remote .nwb.lindi.json file
url = 'https://lindi.neurosift.org/dandi/dandisets/000939/assets/56d875d6-a705-48d3-944c-53394a389c85/nwb.lindi.json'

# Load the h5py-like client
client = lindi.LindiH5pyFile.from_lindi_file(url)

# Open using pynwb
with pynwb.NWBHDF5IO(file=client, mode="r") as io:
    nwbfile = io.read()
    print(nwbfile)

Edit a .nwb.lindi.json file using pynwb or other tools

import lindi

# URL of the remote .nwb.lindi.json file
url = 'https://lindi.neurosift.org/dandi/dandisets/000939/assets/56d875d6-a705-48d3-944c-53394a389c85/nwb.lindi.json'

# Load the h5py-like client for the reference file system
# in read-write mode
client = lindi.LindiH5pyFile.from_lindi_file(url, mode="r+")

# Edit an attribute
client.attrs['new_attribute'] = 'new_value'

# Save the changes to a new .nwb.lindi.json file
client.write_lindi_file('new.nwb.lindi.json')

Add datasets to a .nwb.lindi.json file using a local staging area

import numpy as np
import lindi

# URL of the remote .nwb.lindi.json file
url = 'https://lindi.neurosift.org/dandi/dandisets/000939/assets/56d875d6-a705-48d3-944c-53394a389c85/nwb.lindi.json'

# Load the h5py-like client for the reference file system
# in read-write mode with a staging area
with lindi.StagingArea.create(base_dir='lindi_staging') as staging_area:
    client = lindi.LindiH5pyFile.from_lindi_file(
        url,
        mode="r+",
        staging_area=staging_area
    )
    # Add datasets to the client using pynwb or other tools; a minimal
    # sketch via the h5py-like interface (name and data are illustrative)
    client.create_dataset('large_array', data=np.zeros((1000, 1000)))

    # Save the result; the staged chunks can then be uploaded to the
    # remote storage (see the next section)
    client.write_lindi_file('new.nwb.lindi.json')

Upload a .nwb.lindi.json file with staged datasets to a cloud storage service such as DANDI

See this example.
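In outline, the workflow looks like the sketch below. The final upload steps are placeholders for your storage backend (for DANDI, its API or client library), not functions provided by lindi:

import lindi

with lindi.StagingArea.create(base_dir='lindi_staging') as staging_area:
    client = lindi.LindiH5pyFile.from_lindi_file(
        'example.nwb.lindi.json',
        mode='r+',
        staging_area=staging_area
    )
    # ... add datasets as in the previous section; the new chunks are
    # written to files under lindi_staging ...
    client.write_lindi_file('example.nwb.lindi.json')

# Placeholder steps (not lindi API):
# 1. upload each staged chunk file to cloud storage
# 2. rewrite the chunk references in example.nwb.lindi.json to the new URLs
# 3. upload the updated example.nwb.lindi.json itself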

For developers

Special Zarr annotations used by LINDI

License

See LICENSE.
