Windowed multiprocessing wrapper for rasterio

rio-mucho
=========

Parallel processing wrapper for rasterio

|PyPI| |Build Status| |Coverage Status|

Install
-------

From PyPI:

``pip install rio-mucho``

From GitHub (usually for a branch / dev):

``pip install git+ssh://git@github.com/mapbox/rio-mucho.git@<branch>#egg=riomucho``

Development:

::

    git clone git@github.com:mapbox/rio-mucho.git
    cd rio-mucho
    pip install -e .

Usage
-----

.. code:: python

    with riomucho.RioMucho([{inputs}], {output}, {run function},
                           windows={windows},
                           global_args={global arguments},
                           options={options to write}) as rios:

        rios.run({processes})

Arguments
~~~~~~~~~

``inputs``
^^^^^^^^^^

A list of file paths to open and read.

``output``
^^^^^^^^^^

The path of the file to write to.

``run_function``
^^^^^^^^^^^^^^^^

A function to be applied to each window chunk. It should accept the
following arguments:

1. A data input, which depends on the read mode:

   - ``mode="simple_read"`` (default): a list of numpy arrays of shape
     ({bands}, {window rows}, {window cols}), one for each file in the
     input file list
   - ``mode="array_read"``: a numpy array of shape ({*n* input files x *n*
     band count}, {window rows}, {window cols})
   - ``mode="manual_read"``: a list of open sources for reading

2. A ``rasterio`` window tuple
3. A ``rasterio`` window index (``ij``)
4. A global arguments object that you can use to pass in global
   arguments

This should return:

1. An output array of ({count}, {window rows}, {window cols}) shape, and
   of the correct data type for writing
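
To make these shapes concrete, the sketch below builds synthetic arrays with NumPy only. It does not call rio-mucho: the file count, band count, and window size are made-up values, and both the (bands, rows, cols) layout per file and the use of concatenation along the first axis to mimic ``array_read`` are assumptions based on the shapes described above.

```python
import numpy as np

# hypothetical sizes: 2 input files, 3 bands each, a 4 x 5 pixel window
n_files, n_bands, rows, cols = 2, 3, 4, 5

# "simple_read": one (bands, rows, cols) array per input file
simple_data = [np.ones((n_bands, rows, cols), dtype="uint8") for _ in range(n_files)]

# "array_read": a single array with every band of every file on the first axis
array_data = np.concatenate(simple_data, axis=0)

# a valid return value: (count, rows, cols), here the first band of each file
out = np.array([d[0] for d in simple_data])

print(array_data.shape)  # (6, 4, 5)
print(out.shape)         # (2, 4, 5)
```

The returned array keeps the ``uint8`` type of the inputs, so it could be written to an output opened with ``count=2`` and ``dtype='uint8'``.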

.. code:: python

    def basic_run({data}, {window}, {ij}, {global args}):
        # do something
        return {out}

Keyword arguments
~~~~~~~~~~~~~~~~~

``windows={windows}``
^^^^^^^^^^^^^^^^^^^^^

A list of ``rasterio`` (window, ij) tuples to operate on.
``[Default = srcs[0].block_windows()]``
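
When the default block windows are not suitable, a custom list can be built by hand. The sketch below is one possible way to tile a raster into (window, ij) pairs. ``make_windows`` is a hypothetical helper, not part of rio-mucho or rasterio, and it assumes the ((row_start, row_stop), (col_start, col_stop)) tuple form of a window; newer rasterio releases use ``Window`` objects instead.

```python
def make_windows(height, width, block_rows, block_cols):
    """Hypothetical helper: tile a height x width raster into (window, ij) pairs."""
    windows = []
    for i, row_start in enumerate(range(0, height, block_rows)):
        for j, col_start in enumerate(range(0, width, block_cols)):
            # clip the last row/column of blocks to the raster edge
            row_stop = min(row_start + block_rows, height)
            col_stop = min(col_start + block_cols, width)
            window = ((row_start, row_stop), (col_start, col_stop))
            windows.append((window, (i, j)))
    return windows


windows = make_windows(height=5, width=4, block_rows=2, block_cols=2)
print(len(windows))   # 6
print(windows[0])     # (((0, 2), (0, 2)), (0, 0))
print(windows[-1])    # (((4, 5), (2, 4)), (2, 1))
```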

``global_args={global arguments}``
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Because the work runs in parallel, pass any other objects or values that
the ``run_function`` needs to access here. ``[Default = {}]``

.. code:: python

    global_args = {
        'divide_value': 2
    }
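
As a minimal sketch of how a run function might consume these values, the example below calls a function directly on synthetic arrays rather than through rio-mucho. ``divide_run`` and the sample data are illustrative names and values, and the data layout assumes ``simple_read`` mode.

```python
import numpy as np


def divide_run(data, window, ij, g_args):
    # data: list of (bands, rows, cols) arrays, one per input file
    # g_args: the dictionary passed as global_args
    return np.array([d[0] // g_args['divide_value'] for d in data])


global_args = {'divide_value': 2}

# stand-in for two single-band inputs read over a 2 x 2 window
data = [
    np.full((1, 2, 2), 10, dtype='uint8'),
    np.full((1, 2, 2), 4, dtype='uint8'),
]
out = divide_run(data, ((0, 2), (0, 2)), (0, 0), global_args)
print(out.shape)  # (2, 2, 2)
```

Floor division is used here so that the result stays in the integer type of the input.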

``options={keyword args}``
^^^^^^^^^^^^^^^^^^^^^^^^^^

The options to pass when opening the output for writing. ``[Default = srcs[0].meta]``
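
A common pattern is to start from the source metadata and override only what differs in the output. The sketch below uses a plain dictionary as a stand-in for ``srcs[0].meta``; the keys mirror typical rasterio metadata fields, but the values are made up for illustration.

```python
# stand-in for srcs[0].meta (values are illustrative only)
src_meta = {
    'driver': 'GTiff',
    'dtype': 'uint8',
    'count': 3,
    'width': 256,
    'height': 256,
}

# copy first so the source metadata stays untouched
options = dict(src_meta)
# the output here has one band of floating point values
options.update(count=1, dtype='float32')

print(options['count'], options['dtype'])  # 1 float32
```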

Example
-------

.. code:: python

    import numpy as np
    import rasterio
    import riomucho

    def basic_run(data, window, ij, g_args):
        # divide the first band of each input by the global 'divide' value
        # (true division yields floats; cast if the output dtype requires it)
        out = np.array(
            [d[0] / g_args['divide'] for d in data]
        )
        return out

    # get windows from an input
    with rasterio.open('/tmp/test_1.tif') as src:
        # grabbing the windows as an example. Default behavior is identical.
        windows = [[window, ij] for ij, window in src.block_windows()]
        options = src.meta
        # since we are only writing to 2 bands
        options.update(count=2)

    global_args = {
        'divide': 2
    }

    processes = 4

    # run it
    with riomucho.RioMucho(['input1.tif', 'input2.tif'], 'output.tif', basic_run,
                           windows=windows,
                           global_args=global_args,
                           options=options) as rm:

        rm.run(processes)
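
Conceptually, a windowed run applies the run function to each window independently and writes each result into the matching region of the output, which is what makes the work easy to split across processes. The serial, NumPy-only sketch below illustrates that pattern. It is not rio-mucho's implementation: it uses in-memory arrays in place of files, ``apply_windowed`` and ``halve`` are hypothetical names, and windows are assumed to be ((row_start, row_stop), (col_start, col_stop)) tuples.

```python
import numpy as np


def halve(data, window, ij, g_args):
    # same signature as a rio-mucho run function
    return np.array([d[0] // g_args['divide'] for d in data])


def apply_windowed(arrays, windows, run_function, g_args, count):
    """Hypothetical serial stand-in for a windowed run over in-memory arrays."""
    _, height, width = arrays[0].shape
    out = np.zeros((count, height, width), dtype=arrays[0].dtype)
    for window, ij in windows:
        (r0, r1), (c0, c1) = window
        # read the same window from every input
        data = [a[:, r0:r1, c0:c1] for a in arrays]
        # write the result into the matching region of the output
        out[:, r0:r1, c0:c1] = run_function(data, window, ij, g_args)
    return out


arrays = [
    np.arange(16, dtype='uint8').reshape(1, 4, 4),
    np.full((1, 4, 4), 8, dtype='uint8'),
]
# four 2 x 2 windows covering the 4 x 4 arrays
windows = [
    (((r, r + 2), (c, c + 2)), (r // 2, c // 2))
    for r in (0, 2) for c in (0, 2)
]
result = apply_windowed(arrays, windows, halve, {'divide': 2}, count=2)
print(result.shape)  # (2, 4, 4)
```

Because no window depends on another, processing them one by one gives the same result as applying the operation to the whole array at once.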

Utility functions
-----------------

``riomucho.utils.array_stack([array, array, array,...])``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Given a list of ({depth}, {rows}, {cols}) numpy arrays, stack them into a
single ({list length \* each image depth}, {rows}, {cols}) array. This
is useful when ``rgb`` inputs may arrive either as a single three-band
file or as a separate file for each band.
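
The shape behavior described above can be reproduced with plain NumPy. The sketch below uses ``np.concatenate`` along the first axis as an assumed equivalent; it is not taken from rio-mucho's source, and the arrays are synthetic.

```python
import numpy as np

# three single-band (1, rows, cols) arrays, e.g. separate r, g, b files
r, g, b = (np.full((1, 2, 3), v, dtype='uint8') for v in (10, 20, 30))
stacked_separate = np.concatenate([r, g, b], axis=0)

# one three-band (3, rows, cols) array, e.g. a single rgb file
rgb = np.concatenate([r, g, b], axis=0)
stacked_single = np.concatenate([rgb], axis=0)

print(stacked_separate.shape)  # (3, 2, 3)
print(stacked_single.shape)    # (3, 2, 3)
```

Either input layout ends up as the same (3, {rows}, {cols}) array, so downstream code does not need to know how the bands were stored.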

One RGB file
^^^^^^^^^^^^

.. code:: python

    import rasterio
    import riomucho

    files = ['rgb.tif']
    open_files = [rasterio.open(f) for f in files]
    rgb = riomucho.utils.array_stack([src.read() for src in open_files])

Separate RGB files
^^^^^^^^^^^^^^^^^^

.. code:: python

    import rasterio
    import riomucho

    files = ['r.tif', 'g.tif', 'b.tif']
    open_files = [rasterio.open(f) for f in files]
    rgb = riomucho.utils.array_stack([src.read() for src in open_files])

.. |PyPI| image:: https://img.shields.io/pypi/v/rio-mucho.svg?maxAge=2592000?style=plastic
   :target:
.. |Build Status| image:: https://travis-ci.org/mapbox/rio-mucho.svg?branch=master
   :target: https://travis-ci.org/mapbox/rio-mucho
.. |Coverage Status| image:: https://coveralls.io/repos/mapbox/rio-mucho/badge.svg?branch=master&service=github
   :target: https://coveralls.io/github/mapbox/rio-mucho?branch=master

