multiscale-spatial-image

Generate a multiscale, chunked, multi-dimensional spatial image data structure that can be serialized to OME-NGFF.

Each scale is a scientific Python Xarray spatial-image Dataset, organized into nodes of an Xarray Datatree.

Installation

pip install multiscale_spatial_image

Usage

import numpy as np
from spatial_image import to_spatial_image
from multiscale_spatial_image import to_multiscale
import zarr

# Image pixels
array = np.random.randint(0, 256, size=(128,128), dtype=np.uint8)

image = to_spatial_image(array)
print(image)

An Xarray spatial-image DataArray. Spatial metadata can also be passed during construction.

<xarray.SpatialImage 'image' (y: 128, x: 128)>
array([[114,  47, 215, ..., 245,  14, 175],
       [ 94, 186, 112, ...,  42,  96,  30],
       [133, 170, 193, ..., 176,  47,   8],
       ...,
       [202, 218, 237, ...,  19, 108, 135],
       [ 99,  94, 207, ..., 233,  83, 112],
       [157, 110, 186, ..., 142, 153,  42]], dtype=uint8)
Coordinates:
  * y        (y) float64 0.0 1.0 2.0 3.0 4.0 ... 123.0 124.0 125.0 126.0 127.0
  * x        (x) float64 0.0 1.0 2.0 3.0 4.0 ... 123.0 124.0 125.0 126.0 127.0

# Create multiscale pyramid, downscaling by a factor of 2, then 4
multiscale = to_multiscale(image, [2, 4])
print(multiscale)

A chunked Dask Array MultiscaleSpatialImage Xarray Datatree.

DataTree('multiscales', parent=None)
├── DataTree('scale0')
│   Dimensions:  (y: 128, x: 128)
│   Coordinates:
│     * y        (y) float64 0.0 1.0 2.0 3.0 4.0 ... 123.0 124.0 125.0 126.0 127.0
│     * x        (x) float64 0.0 1.0 2.0 3.0 4.0 ... 123.0 124.0 125.0 126.0 127.0
│   Data variables:
│       image    (y, x) uint8 dask.array<chunksize=(128, 128), meta=np.ndarray>
├── DataTree('scale1')
│   Dimensions:  (y: 64, x: 64)
│   Coordinates:
│     * y        (y) float64 0.5 2.5 4.5 6.5 8.5 ... 118.5 120.5 122.5 124.5 126.5
│     * x        (x) float64 0.5 2.5 4.5 6.5 8.5 ... 118.5 120.5 122.5 124.5 126.5
│   Data variables:
│       image    (y, x) uint8 dask.array<chunksize=(64, 64), meta=np.ndarray>
└── DataTree('scale2')
    Dimensions:  (y: 16, x: 16)
    Coordinates:
      * y        (y) float64 3.5 11.5 19.5 27.5 35.5 ... 91.5 99.5 107.5 115.5 123.5
      * x        (x) float64 3.5 11.5 19.5 27.5 35.5 ... 91.5 99.5 107.5 115.5 123.5
    Data variables:
        image    (y, x) uint8 dask.array<chunksize=(16, 16), meta=np.ndarray>
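
The sizes and coordinates printed above follow a simple pattern: each scale factor is applied relative to the previous scale, and each downsampled pixel sits at the center of the block of pixels it replaces. The sketch below reproduces that arithmetic in plain Python. It is inferred from the printed output only, and the helper name pyramid_geometry is hypothetical, not part of the library.

```python
def pyramid_geometry(size, scale_factors, spacing=1.0, origin=0.0):
    """Return (size, first_coordinate, spacing) for every scale level.

    Assumption: each factor is relative to the previous level, and a
    downsampled pixel is centered on the block of pixels it summarizes.
    """
    levels = [(size, origin, spacing)]
    for factor in scale_factors:
        prev_size, prev_origin, prev_spacing = levels[-1]
        # Center of the first block of `factor` pixels at the previous level
        new_origin = prev_origin + prev_spacing * (factor - 1) / 2
        levels.append((prev_size // factor, new_origin, prev_spacing * factor))
    return levels


levels = pyramid_geometry(128, [2, 4])
for index, (n, first, step) in enumerate(levels):
    last = first + (n - 1) * step
    print(f"scale{index}: size={n}, first={first}, last={last}, spacing={step}")
```

For the 128 pixel axis and the factors [2, 4] this gives sizes 128, 64, and 16, with first coordinates 0.0, 0.5, and 3.5, which matches the tree shown above.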

Map a function over datasets while skipping nodes that do not contain dimensions

import numpy as np
from spatial_image import to_spatial_image
from multiscale_spatial_image import skip_non_dimension_nodes, to_multiscale

data = np.zeros((2, 200, 200))
dims = ("c", "y", "x")
scale_factors = [2, 2]
image = to_spatial_image(array_like=data, dims=dims)
multiscale = to_multiscale(image, scale_factors=scale_factors)

@skip_non_dimension_nodes
def transpose(ds, *args, **kwargs):
    return ds.transpose(*args, **kwargs)

multiscale = multiscale.map_over_datasets(transpose, "y", "x", "c")
print(multiscale)

A transposed MultiscaleSpatialImage.

<xarray.DataTree>
Group: /
├── Group: /scale0
│       Dimensions:  (c: 2, y: 200, x: 200)
│       Coordinates:
│         * c        (c) int32 8B 0 1
│         * y        (y) float64 2kB 0.0 1.0 2.0 3.0 4.0 ... 196.0 197.0 198.0 199.0
│         * x        (x) float64 2kB 0.0 1.0 2.0 3.0 4.0 ... 196.0 197.0 198.0 199.0
│       Data variables:
│           image    (y, x, c) float64 640kB dask.array<chunksize=(200, 200, 2), meta=np.ndarray>
├── Group: /scale1
│       Dimensions:  (c: 2, y: 100, x: 100)
│       Coordinates:
│         * c        (c) int32 8B 0 1
│         * y        (y) float64 800B 0.5 2.5 4.5 6.5 8.5 ... 192.5 194.5 196.5 198.5
│         * x        (x) float64 800B 0.5 2.5 4.5 6.5 8.5 ... 192.5 194.5 196.5 198.5
│       Data variables:
│           image    (y, x, c) float64 160kB dask.array<chunksize=(100, 100, 2), meta=np.ndarray>
└── Group: /scale2
        Dimensions:  (c: 2, y: 50, x: 50)
        Coordinates:
          * c        (c) int32 8B 0 1
          * y        (y) float64 400B 1.5 5.5 9.5 13.5 17.5 ... 185.5 189.5 193.5 197.5
          * x        (x) float64 400B 1.5 5.5 9.5 13.5 17.5 ... 185.5 189.5 193.5 197.5
        Data variables:
            image    (y, x, c) float64 40kB dask.array<chunksize=(50, 50, 2), meta=np.ndarray>
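
The root group of the tree holds no dimensions, so a function such as transpose that names "y", "x", and "c" would presumably fail on it without the decorator. The sketch below illustrates the general idea of such a guard in plain Python. It is not the library's implementation: FakeDataset and skip_if_no_dims are hypothetical stand-ins so the example runs without Xarray.

```python
import functools


class FakeDataset:
    """Hypothetical stand-in for a dataset: it only tracks dimension names."""

    def __init__(self, dims):
        self.dims = tuple(dims)

    def transpose(self, *order):
        return FakeDataset(order)


def skip_if_no_dims(func):
    """Call `func` only for datasets with dimensions; pass others through."""

    @functools.wraps(func)
    def wrapper(ds, *args, **kwargs):
        if len(ds.dims) == 0:
            return ds  # e.g. the root node, which carries no image data
        return func(ds, *args, **kwargs)

    return wrapper


@skip_if_no_dims
def transpose_sketch(ds, *args, **kwargs):
    return ds.transpose(*args, **kwargs)


root = FakeDataset(())
scale0 = FakeDataset(("c", "y", "x"))
print(transpose_sketch(root, "y", "x", "c").dims)    # ()
print(transpose_sketch(scale0, "y", "x", "c").dims)  # ('y', 'x', 'c')
```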

Store as an Open Microscopy Environment-Next Generation File Format (OME-NGFF) / netCDF Zarr store.

It is highly recommended to use dimension_separator='/' in the construction of the Zarr stores.

store = zarr.storage.DirectoryStore('multiscale.zarr', dimension_separator='/')
multiscale.to_zarr(store)
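
To see what the dimension separator changes, recall that a Zarr version 2 store names each chunk by joining its grid indices with the separator: '.' produces flat names such as 0.1, while '/' produces nested paths such as 0/1. The sketch below lists the chunk keys for a small array in plain Python. The helper chunk_keys is hypothetical and does not call zarr.

```python
from itertools import product
from math import ceil


def chunk_keys(shape, chunks, separator):
    """List chunk keys for an array of `shape` split into `chunks`."""
    grid = [range(ceil(size / chunk)) for size, chunk in zip(shape, chunks)]
    return [separator.join(str(i) for i in index) for index in product(*grid)]


flat_keys = chunk_keys((128, 128), (64, 64), ".")
nested_keys = chunk_keys((128, 128), (64, 64), "/")
print(flat_keys)    # ['0.0', '0.1', '1.0', '1.1']
print(nested_keys)  # ['0/0', '0/1', '1/0', '1/1']
```

With '/' the chunks of each array end up in nested directories rather than one flat directory of dotted file names.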

Note: The API is under development, and it may change until 1.0.0 is released. We mean it :-).

Examples

Development

Contributions are welcome and appreciated.

Get the source code

git clone https://github.com/spatial-image/multiscale-spatial-image
cd multiscale-spatial-image

Install dependencies

First install pixi. Then, install project dependencies:

pixi install -a
pixi run pre-commit-install

Run the test suite

The unit tests:

pixi run -e test test

The notebook tests:

pixi run test-notebooks

Update test data

To add new testing data or update existing testing data, such as a new baseline for this block:

dataset_name = "cthead1"
image = input_images[dataset_name]
baseline_name = "2_4/XARRAY_COARSEN"
multiscale = to_multiscale(image, [2, 4], method=Methods.XARRAY_COARSEN)
verify_against_baseline(dataset_name, baseline_name, multiscale)

Add a store_new_image call in your test block:

dataset_name = "cthead1"
image = input_images[dataset_name]
baseline_name = "2_4/XARRAY_COARSEN"
multiscale = to_multiscale(image, [2, 4], method=Methods.XARRAY_COARSEN)

store_new_image(dataset_name, baseline_name, multiscale)

verify_against_baseline(dataset_name, baseline_name, multiscale)

Run the tests to generate the output. Remove the store_new_image call.

Then, create a tarball of the current testing data and print its hash:

cd test/data
tar cvf ../data.tar *
gzip -9 ../data.tar
python3 -c 'import pooch; print(pooch.file_hash("../data.tar.gz"))'

Update the test_data_sha256 variable in the test/_data.py file with the printed hash. Upload the data to web3.storage, then update the test_data_ipfs_cid Content Identifier (CID) variable with the CID shown in the web3.storage web page interface.
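
The hash for test_data_sha256 can also be cross-checked with the Python standard library. The sketch below assumes that pooch.file_hash is using its default SHA256 algorithm. The helper sha256_of_file is hypothetical and is demonstrated on a throwaway file rather than the real tarball.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, block_size=65536):
    """Return the hex SHA256 digest of a file, read in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Demonstrate on a temporary file; point the helper at data.tar.gz in practice
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example test data")
    tmp_path = tmp.name

demo_hash = sha256_of_file(tmp_path)
os.remove(tmp_path)
print(demo_hash)
```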

Submit the patch

We use the standard GitHub flow.

Create a release

This section is relevant only for maintainers.

  1. Pull git's main branch.
  2. pixi install -a
  3. pixi run pre-commit-install
  4. pixi run -e test test
  5. pixi shell
  6. hatch version <new-version>
  7. git add .
  8. git commit -m "ENH: Bump version to <version>"
  9. hatch build
  10. hatch publish
  11. git push upstream main
  12. Create a new tag and Release via the GitHub UI. Auto-generate release notes and add additional notes as needed.
