Windowed multiprocessing wrapper for rasterio
Parallel processing wrapper for rasterio
Install
From pypi:
pip install rio-mucho --pre
From github (usually for a branch / dev):
pip install git+ssh://git@github.com/mapbox/rio-mucho.git@<branch>
Development:
git clone git@github.com:mapbox/rio-mucho.git
cd rio-mucho
pip install -e .
Usage
with riomucho.RioMucho([{inputs}], {output}, {run function},
    windows={windows},
    global_args={global arguments},
    kwargs={kwargs to write}) as rios:
    rios.run({processes})
Arguments
inputs
A list of file paths to open and read.
output
What file to write to.
run_function
A function to be applied to each window chunk. This should have input arguments of:
A data input, which can be one of:
A list of numpy arrays of shape ({band count}, {window rows}, {window cols}), one for each file in the input file list: mode="simple_read" [default]
A single numpy array of shape ({n input files x n band count}, {window rows}, {window cols}): mode="array_read"
A list of open sources for reading mode="manual_read"
A rasterio window tuple
A rasterio window index (ij)
A global arguments object that you can use to pass in global arguments
This should return:
An output array of ({count}, {window rows}, {window cols}) shape, and of the correct data type for writing
def basic_run({data}, {window}, {ij}, {global args}):
    ## do something
    return {out}
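As a concrete sketch of the contract above, the function below assumes the default simple_read mode, where data is a list holding one (bands, rows, cols) array per input file. The function name mean_first_band and the "dtype" key in the global arguments are hypothetical, and the arrays are synthetic stand-ins so the sketch runs without rasterio or rio-mucho installed.

```python
import numpy as np

def mean_first_band(data, window, ij, g_args):
    """Average band 1 across all inputs, then scale it (hypothetical run function)."""
    # data: list of (bands, rows, cols) arrays, one per input file
    first_bands = np.stack([d[0] for d in data]).astype("float32")
    out = first_bands.mean(axis=0) / g_args["divide_value"]
    # Return shape (count, rows, cols); count is 1 here.
    # Cast to the dtype the output file expects ("dtype" is an assumed key).
    return out[np.newaxis, ...].astype(g_args["dtype"])

# Synthetic stand-ins for two 3-band inputs read over a 4 x 5 window
data = [
    np.full((3, 4, 5), 10, dtype="uint8"),
    np.full((3, 4, 5), 30, dtype="uint8"),
]
window = ((0, 4), (0, 5))
result = mean_first_band(data, window, (0, 0), {"divide_value": 2, "dtype": "uint8"})
print(result.shape, result.dtype)
```

The window and ij arguments are unused here, but they stay in the signature because the wrapper passes them to every call.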
Keyword arguments
windows={windows}
A list of rasterio (window, ij) tuples to operate on. [Default = src[0].block_windows()]
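To illustrate the shape of that list, here is a minimal sketch that tiles a raster into (window, ij) pairs without opening a file. It assumes the tuple form of a window, ((row_start, row_stop), (col_start, col_stop)); the helper name make_windows and the block sizes are hypothetical and not part of rio-mucho or rasterio.

```python
def make_windows(height, width, block_rows, block_cols):
    """Tile a height x width raster into (window, ij) pairs (hypothetical helper)."""
    windows = []
    for i, row_off in enumerate(range(0, height, block_rows)):
        for j, col_off in enumerate(range(0, width, block_cols)):
            # Clip the last block in each direction to the raster edge
            row_stop = min(row_off + block_rows, height)
            col_stop = min(col_off + block_cols, width)
            window = ((row_off, row_stop), (col_off, col_stop))
            windows.append((window, (i, j)))
    return windows

windows = make_windows(height=5, width=6, block_rows=4, block_cols=4)
for window, ij in windows:
    print(ij, window)
```

Custom windows like these can be useful when the native block layout of the first input is too small or too large for the work done per call.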
global_args={global arguments}
Because the run function executes in parallel worker processes, use this to pass in any other objects or values that it needs to access. [Default = {}]
global_args = {
    'divide_value': 2
}
kwargs={keyword args}
The kwargs to pass to the output. [Default = srcs[0].kwargs]
Example
import riomucho
import rasterio
import numpy as np

def basic_run(data, window, ij, g_args):
    ## do something: divide the first band of each input
    ## (true division returns floats; cast to the output dtype if needed)
    out = np.array(
        [d[0] / g_args['divide'] for d in data]
    )
    return out

# get windows from an input
with rasterio.open('/tmp/test_1.tif') as src:
    ## grabbing the windows as an example. Default behavior is identical.
    windows = [[window, ij] for ij, window in src.block_windows()]
    kwargs = src.meta
    # since we are only writing to 2 bands
    kwargs.update(count=2)

global_args = {
    'divide': 2
}

processes = 4

# run it
with riomucho.RioMucho(['input1.tif','input2.tif'], 'output.tif', basic_run,
    windows=windows,
    global_args=global_args,
    kwargs=kwargs) as rm:
    rm.run(processes)
Utility functions
riomucho.utils.array_stack([array, array, array,…])
Given a list of ({depth}, {rows}, {cols}) numpy arrays, stack them into a single ({list length * each image depth}, {rows}, {cols}) array. This is useful for handling variation between inputs, where RGB may come from a single file or from a separate file for each band.
One RGB file
files = ['rgb.tif']
open_files = [rasterio.open(f) for f in files]
rgb = riomucho.utils.array_stack([src.read() for src in open_files])
Separate RGB files
files = ['r.tif', 'g.tif', 'b.tif']
open_files = [rasterio.open(f) for f in files]
rgb = riomucho.utils.array_stack([src.read() for src in open_files])
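The stacking behaviour described above can be sketched with plain numpy. This is an illustrative stand-in, not the library's implementation: the name stack_bands is hypothetical, and it assumes the stacking amounts to concatenating along the band (first) axis. Synthetic arrays replace the file reads so the sketch runs on its own.

```python
import numpy as np

def stack_bands(arrays):
    """Concatenate (depth, rows, cols) arrays along the band axis (illustrative stand-in)."""
    return np.concatenate(arrays, axis=0)

# One 3-band file versus three single-band files, all over a 4 x 5 window
one_rgb = [np.zeros((3, 4, 5), dtype="uint8")]
separate = [np.zeros((1, 4, 5), dtype="uint8") for _ in range(3)]

rgb_a = stack_bands(one_rgb)
rgb_b = stack_bands(separate)
print(rgb_a.shape, rgb_b.shape)
```

Both cases produce an array with the same (3, rows, cols) layout, so downstream code does not need to know how the bands were stored on disk.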