CloudOptimized GeoTIFF (COGEO) creation plugin for rasterio
rio-cogeo
Cloud Optimized GeoTIFF (COG) creation and validation plugin for Rasterio
Cloud Optimized GeoTIFF
This plugin aims to facilitate the creation and validation of Cloud Optimized GeoTIFFs (COG or COGEO). While it respects the COG specifications, this plugin also enforces several features:
- Internal overviews (users can remove overviews with the option --overview-level 0)
- Internal tiles (default profiles have 512x512 internal tiles)
Important: Starting from GDAL 3.1 a new COG generator driver will be added (doc, discussion) and will make rio-cogeo kinda obsolete.
Install
$ pip install -U pip
$ pip install rio-cogeo --pre # Version 2.0 currently in development
Or install from source:
$ git clone https://github.com/cogeotiff/rio-cogeo.git
$ cd rio-cogeo
$ pip install -U pip
$ pip install -e .
CLI
$ rio cogeo --help
Usage: rio cogeo [OPTIONS] COMMAND [ARGS]...
Rasterio cogeo subcommands.
Options:
--version Show the version and exit.
--help Show this message and exit.
Commands:
create Create COGEO
info Lists information about a raster dataset.
validate Validate COGEO
Create
$ rio cogeo create --help
Usage: rio cogeo create [OPTIONS] INPUT OUTPUT
Create Cloud Optimized Geotiff.
Options:
-b, --bidx BIDX Band indexes to copy.
-p, --cog-profile [jpeg|webp|zstd|lzw|deflate|packbits|lzma|lerc|lerc_deflate|lerc_zstd|raw]
CloudOptimized GeoTIFF profile (default: deflate).
--nodata NUMBER|nan Set nodata masking values for input dataset.
--add-mask Force output dataset creation with an internal mask (convert alpha band or nodata to mask).
--blocksize INTEGER Overwrite profile's tile size.
-t, --dtype [ubyte|uint8|uint16|int16|uint32|int32|float32|float64] Output data type.
--overview-level INTEGER Overview level (if not provided, appropriate overview level will be selected until the
smallest overview is smaller than the value of the internal blocksize)
--overview-resampling [nearest|bilinear|cubic|cubic_spline|lanczos|average|mode|gauss]
Overview creation resampling algorithm (default: nearest).
--overview-blocksize TEXT Overview's internal tile size (default defined by GDAL_TIFF_OVR_BLOCKSIZE env or 128)
-w, --web-optimized Create COGEO optimized for Web.
--latitude-adjustment / --global-maxzoom
Use dataset native mercator resolution for MAX_ZOOM calculation (linked to dataset
center latitude, default) or ensure MAX_ZOOM equality for multiple dataset accross latitudes.
-r, --resampling [nearest|bilinear|cubic|cubic_spline|lanczos|average|mode|gauss]
Resampling algorithm (default: nearest). Will only be applied with the `--web-optimized` option.
--in-memory / --no-in-memory Force processing raster in memory / not in memory (default: process in memory if smaller than 120 million pixels)
--allow-intermediate-compression Allow intermediate file compression to reduce memory/disk footprint.
--forward-band-tags Forward band tags to output bands.
--threads THREADS Number of worker threads for multi-threaded compression (default: ALL_CPUS)
--co, --profile NAME=VALUE Driver specific creation options. See the documentation for the selected output driver for more information.
--config NAME=VALUE GDAL configuration options.
-q, --quiet Remove progressbar and other non-error output.
--help Show this message and exit.
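As a rough illustration of the --in-memory default described in the help text above, the rule can be read as a simple pixel-count check. This is only a sketch: the helper name is hypothetical and the threshold constant is assumed from the "120 million pixels" wording, not taken from rio-cogeo's source.

```python
# Hypothetical helper: mirrors the wording of the help text, not rio-cogeo's code.
IN_MEMORY_PIXEL_THRESHOLD = 120_000_000  # assumed from "120 million pixels"


def default_in_memory(width, height, threshold=IN_MEMORY_PIXEL_THRESHOLD):
    """Return True when a width x height raster is under the pixel threshold."""
    return width * height < threshold


print(default_in_memory(10_000, 10_000))  # 100 million pixels
print(default_in_memory(20_000, 20_000))  # 400 million pixels
```

Passing --in-memory or --no-in-memory explicitly overrides whatever the default rule would choose.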
Validate
$ rio cogeo validate --help
Usage: rio cogeo validate [OPTIONS] INPUT
Validate Cloud Optimized Geotiff.
Options:
--strict Treat warnings as errors.
--help Show this message and exit.
The --strict option will treat warnings (e.g. missing overviews) as errors.
Info
(extended version of rio info).
$ rio cogeo info --help
Usage: rio cogeo info [OPTIONS] INPUT
Dataset info.
Options:
--json Print as JSON.
--help Show this message and exit.
Examples
# Create a COGEO with DEFLATE compression (Using default `Deflate` profile)
$ rio cogeo create mydataset.tif mydataset_jpeg.tif
# Validate COGEO
$ rio cogeo validate mydataset_jpeg.tif
# Create a COGEO with JPEG profile and the first 3 bands of the data and add internal mask
$ rio cogeo create mydataset.tif mydataset_jpeg.tif -b 1,2,3 --add-mask --cog-profile jpeg
# List Raster info
$ rio cogeo info mydataset_jpeg.tif
Driver: GTiff
File: mydataset_jpeg.tif
COG: True
Compression: DEFLATE
ColorSpace: None
Profile
Width: 10980
Height: 10980
Bands: 1
Tiled: True
Dtype: uint16
NoData: 0.0
Alpha Band: False
Internal Mask: False
Interleave: BAND
Colormap: False
Geo
Crs: EPSG:32634
Origin: (699960.0, 3600000.0)
Resolution: (10.0, -10.0)
BoundingBox: (699960.0, 3490200.0, 809760.0, 3600000.0)
IFD
Id Size BlockSize Decimation
0 10980x10980 1024x1024 0
1 5490x5490 128x128 2
2 2745x2745 128x128 4
3 1373x1373 128x128 8
4 687x687 128x128 16
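The commands above can also be driven from a Python script. The sketch below only assembles the argument list for `rio cogeo create` from the options shown in the help text; `cogeo_create_command` and `run` are hypothetical helper names, and running the command assumes the `rio` executable is on the PATH.

```python
import subprocess


def cogeo_create_command(src, dst, profile=None, bidx=None, add_mask=False):
    """Build the argument list for `rio cogeo create` (hypothetical helper)."""
    cmd = ["rio", "cogeo", "create", src, dst]
    if bidx:
        # --bidx takes comma-separated band indexes, as in the example above
        cmd += ["--bidx", ",".join(str(b) for b in bidx)]
    if add_mask:
        cmd.append("--add-mask")
    if profile:
        cmd += ["--cog-profile", profile]
    return cmd


def run(cmd):
    """Run the command, raising CalledProcessError on a non-zero exit status."""
    subprocess.run(cmd, check=True)


cmd = cogeo_create_command(
    "mydataset.tif", "mydataset_jpeg.tif", profile="jpeg", bidx=[1, 2, 3], add_mask=True
)
print(" ".join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.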
Default COGEO profiles
Default profiles are tiled with 512x512 blocksizes.
JPEG
- JPEG compression
- PIXEL interleave
- YCbCr (3 bands) colorspace or MINISBLACK (1 band)
- limited to uint8 datatype and 3 bands data
WEBP
- WEBP compression
- PIXEL interleave
- limited to uint8 datatype and 3 or 4 bands data
- Non-Standard, might not be supported by software not built against GDAL + internal libtiff + libwebp
- Available for GDAL>=2.4.0
ZSTD
- ZSTD compression
- PIXEL interleave
- Non-Standard, might not be supported by software not built against GDAL + internal libtiff + libzstd
- Available for GDAL>=2.3.0
Note: in Nov 2018, there was a change in libtiff's ZSTD tags which created an incompatibility with older ZSTD-compressed GeoTIFFs (link)
LZW
- LZW compression
- PIXEL interleave
DEFLATE
- DEFLATE compression
- PIXEL interleave
PACKBITS
- PACKBITS compression
- PIXEL interleave
LZMA
- LZMA compression
- PIXEL interleave
LERC
- LERC compression
- PIXEL interleave
- Default MAX_Z_ERROR=0 (lossless)
- Non-Standard, might not be supported by software not built against GDAL + internal libtiff
- Available for GDAL>=2.4.0
LERC_DEFLATE
- LERC_DEFLATE compression
- PIXEL interleave
- Default MAX_Z_ERROR=0 (lossless)
- Non-Standard, might not be supported by software not built against GDAL + internal libtiff + libzstd
- Available for GDAL>=2.4.0
LERC_ZSTD
- LERC_ZSTD compression
- PIXEL interleave
- Default MAX_Z_ERROR=0 (lossless)
- Non-Standard, might not be supported by software not built against GDAL + internal libtiff + libzstd
- Available for GDAL>=2.4.0
RAW
- NO compression
- PIXEL interleave
Profiles can be extended by providing the --co option on the command line.
# Create a COGEO without compression and with 1024x1024 block size and 256 overview blocksize
$ rio cogeo create mydataset.tif mydataset_raw.tif --co BLOCKXSIZE=1024 --co BLOCKYSIZE=1024 --cog-profile raw --overview-blocksize 256
See https://gdal.org/drivers/raster/gtiff.html#creation-options for full details of creation options.
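Conceptually, extending a profile amounts to layering NAME=VALUE overrides on top of a dictionary of creation options. The sketch below assumes a profile shaped like the "raw" description above (512x512 tiles, PIXEL interleave, no compression); both the dictionary contents and the `apply_creation_options` helper are illustrative assumptions, not rio-cogeo's actual data or code.

```python
# Assumed shape of a default profile, based on the description above
# (512x512 tiles, PIXEL interleave, no compression for "raw").
RAW_PROFILE = {
    "driver": "GTiff",
    "interleave": "pixel",
    "tiled": True,
    "blockxsize": 512,
    "blockysize": 512,
}


def apply_creation_options(profile, options):
    """Return a copy of `profile` updated with NAME=VALUE strings."""
    merged = dict(profile)
    for option in options:
        name, _, value = option.partition("=")
        # Keys are lower-cased in this sketch so that BLOCKXSIZE overrides
        # the existing "blockxsize" entry instead of adding a second one.
        merged[name.lower()] = value
    return merged


profile = apply_creation_options(RAW_PROFILE, ["BLOCKXSIZE=1024", "BLOCKYSIZE=1024"])
print(profile["blockxsize"], profile["blockysize"])
```

The override values stay as strings here, just as they arrive from the command line; the original profile dictionary is left untouched.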
API
rio-cogeo can also be integrated directly into your custom script. See the rio_cogeo.cogeo.cog_translate function.
e.g.:
from rio_cogeo.cogeo import cog_translate
from rio_cogeo.profiles import cog_profiles

def _translate(src_path, dst_path, profile="webp", profile_options=None, **options):
    """Convert image to COG."""
    # Format creation option (see gdalwarp `-co` option)
    output_profile = cog_profiles.get(profile)
    output_profile.update(dict(BIGTIFF="IF_SAFER"))
    output_profile.update(profile_options or {})

    # GDAL configuration options (see gdalwarp `--config` option)
    config = dict(
        GDAL_NUM_THREADS="ALL_CPUS",
        GDAL_TIFF_INTERNAL_MASK=True,
        GDAL_TIFF_OVR_BLOCKSIZE="128",
    )

    cog_translate(
        src_path,
        dst_path,
        output_profile,
        config=config,
        in_memory=False,
        quiet=True,
        **options,
    )
    return True
Using the API with in-memory files (MemoryFile)
- Create COG from numpy array
import numpy
import mercantile
from rasterio.io import MemoryFile
from rasterio.transform import from_bounds
from rio_cogeo.cogeo import cog_translate
from rio_cogeo.profiles import cog_profiles

# Create GeoTIFF profile
bounds = mercantile.bounds(mercantile.Tile(0, 0, 0))
src_transform = from_bounds(*bounds, 1024, 1024)
src_profile = dict(
    driver="GTiff",
    dtype="float32",
    count=3,
    height=1024,
    width=1024,
    crs="epsg:4326",
    transform=src_transform,
)
# Cast to float32 to match the dtype declared in the profile
img_array = numpy.random.rand(3, 1024, 1024).astype("float32")

with MemoryFile() as memfile:
    with memfile.open(**src_profile) as mem:
        # Populate the input file with numpy array
        mem.write(img_array)

        dst_profile = cog_profiles.get("deflate")
        cog_translate(
            mem,
            "my-output-cog.tif",
            dst_profile,
            in_memory=True,
            quiet=True,
        )
- Create output COG in Memory
from boto3.session import Session as boto3_session
from rasterio.io import MemoryFile
from rio_cogeo.cogeo import cog_translate
from rio_cogeo.profiles import cog_profiles

dst_profile = cog_profiles.get("deflate")

with MemoryFile() as mem_dst:
    # Important, we pass `mem_dst.name` as output dataset path
    cog_translate("my-input-file.tif", mem_dst.name, dst_profile, in_memory=True)

    # You can then use the memoryfile to do something else like
    # upload to AWS S3
    client = boto3_session().client("s3")
    client.upload_fileobj(mem_dst, "my-bucket", "my-key")
Web-Optimized COG
rio-cogeo provides a --web-optimized option which aims to create a web-tiling friendly COG.
Output dataset features:
- bounds and internal tiles aligned with web-mercator grid.
- raw data and overviews resolution match mercator zoom level resolution.
Important
Because the mercator projection does not preserve distances, when working with multiple images covering different latitudes you may want to use the --global-maxzoom option to create output datasets that share the same MAX_ZOOM (raw data resolution).
Because the web-optimized output will almost certainly be a larger file, a nodata value or alpha band should be present in the input dataset. If not, the original data will be surrounded by black (0) data.
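The latitude effect mentioned above can be illustrated with the standard web-mercator resolution formula. This is a minimal sketch, not rio-cogeo's MAX_ZOOM algorithm: the function names are hypothetical, and 256px tiles with a spherical Earth radius of 6378137 m are assumed.

```python
import math

EARTH_RADIUS = 6378137.0  # metres, spherical web-mercator radius
TILE_SIZE = 256  # assumed web map tile size in pixels


def mercator_resolution(zoom, tile_size=TILE_SIZE):
    """Metres per pixel at the equator for a web-mercator zoom level."""
    return 2 * math.pi * EARTH_RADIUS / (tile_size * 2 ** zoom)


def zoom_for_resolution(resolution, latitude=0.0, tile_size=TILE_SIZE, max_zoom=30):
    """Lowest zoom whose ground resolution at `latitude` is at least as fine."""
    for zoom in range(max_zoom + 1):
        # Ground distance covered by one mercator pixel shrinks with cos(latitude)
        ground = mercator_resolution(zoom, tile_size) * math.cos(math.radians(latitude))
        if ground <= resolution:
            return zoom
    return max_zoom


print(zoom_for_resolution(10.0, latitude=0.0))   # 14
print(zoom_for_resolution(10.0, latitude=60.0))  # 13
```

Two datasets with the same 10 m ground resolution end up at different zoom levels depending on latitude, which is the situation --global-maxzoom is meant to avoid.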
Internal tile size
By default rio cogeo will create a dataset with a 512x512 internal tile size.
This can be changed by passing the --co BLOCKXSIZE=64 --co BLOCKYSIZE=64 options.
Web tiling optimization
If the input dataset is aligned to the web mercator grid, the internal tile size should be equal to the web map tile size (256 or 512px). The dataset should be compressed.
If the input dataset is not aligned to the web mercator grid, the tiler will need to fetch multiple internal tiles. Because GDAL can merge range requests, using small internal tiles (e.g. 128) will reduce the number of bytes transferred and minimize the useless bytes transferred.
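To make the trade-off concrete, the sketch below counts how many internal tiles a 512x512 pixel window intersects and how many pixels those tiles contain. It is a simplified model (uncompressed pixel counts, a hypothetical `blocks_touched` helper, an arbitrary 100px misalignment), not a measurement of real HTTP traffic.

```python
def blocks_touched(col_off, row_off, width, height, blocksize):
    """Count internal tiles intersecting a pixel window, and the pixels they hold."""
    first_col = col_off // blocksize
    last_col = (col_off + width - 1) // blocksize
    first_row = row_off // blocksize
    last_row = (row_off + height - 1) // blocksize
    count = (last_col - first_col + 1) * (last_row - first_row + 1)
    return count, count * blocksize * blocksize


cases = [
    ("aligned, 512px tiles", 0, 512),
    ("unaligned, 512px tiles", 100, 512),
    ("unaligned, 128px tiles", 100, 128),
]
for label, offset, blocksize in cases:
    count, pixels = blocks_touched(offset, offset, 512, 512, blocksize)
    print(f"{label}: {count} tiles, {pixels} pixels read")
```

In this model the unaligned request reads four 512px tiles (1048576 pixels) but only twenty-five 128px tiles (409600 pixels), at the cost of more individual tiles to fetch, which is where merged range requests help.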
GDAL configuration to merge consecutive range requests
GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES
GDAL_HTTP_MULTIPLEX=YES
GDAL_HTTP_VERSION=2
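One way to apply these options from a Python script is to export them as environment variables, which GDAL consults when looking up configuration options. This is a sketch under that assumption; the `RANGE_REQUEST_OPTIONS` name is only illustrative.

```python
import os

# Options copied from the list above; GDAL looks up configuration options
# in environment variables of the same name.
RANGE_REQUEST_OPTIONS = {
    "GDAL_HTTP_MERGE_CONSECUTIVE_RANGES": "YES",
    "GDAL_HTTP_MULTIPLEX": "YES",
    "GDAL_HTTP_VERSION": "2",
}

# Setting them before any dataset is opened in this process is the safe choice.
os.environ.update(RANGE_REQUEST_OPTIONS)
```

With Rasterio, the same mapping could instead be passed as keyword arguments to rasterio.Env so the options apply only inside a `with` block.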
Overview levels
By default rio cogeo will calculate the optimal overview level based on the dataset size and internal tile size (an overview should not be smaller than the internal tile size, e.g. 512px). The overview level is then translated into power-of-two decimation levels:
overview_level = 3
overviews = [2 ** j for j in range(1, overview_level + 1)]
print(overviews)
[2, 4, 8]
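The default selection rule described above can be sketched as a loop that keeps halving the dataset until its smallest side no longer exceeds the block size. The helper names are hypothetical and the actual logic in rio-cogeo may differ in its details; the numbers use a 10980x10980 dataset with a 1024 block size.

```python
import math


def default_overview_level(width, height, blocksize=512):
    """Count halvings until the smallest overview side is no larger than blocksize."""
    level = 0
    factor = 1
    while min(width // factor, height // factor) > blocksize:
        factor *= 2
        level += 1
    return level


def overview_sizes(width, height, level):
    """Sizes of each overview for power-of-two decimations (rounded up)."""
    return [
        (math.ceil(width / 2 ** j), math.ceil(height / 2 ** j))
        for j in range(1, level + 1)
    ]


level = default_overview_level(10980, 10980, blocksize=1024)
print(level)
print(overview_sizes(10980, 10980, level))
```

For these inputs the sketch yields 4 levels with sizes 5490, 2745, 1373 and 687 pixels per side.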
Band metadata
By default rio cogeo does not forward band metadata (e.g. statistics) to the output dataset.
$ gdalinfo my_file.tif
...
Band 1 Block=576x1 Type=Float64, ColorInterp=Gray
NoData Value=999999986991104
Unit Type: mol mol-1
Metadata:
long_name=CO2 Dry-Air Column Average
missing_value=9.9999999e+14
NETCDF_DIM_time=0
NETCDF_VARNAME=XCO2MEAN
units=mol mol-1
_FillValue=9.9999999e+14
$ rio cogeo create my_file.tif my_cog.tif --blocksize 256
$ gdalinfo my_cog.tif
...
Band 1 Block=256x256 Type=Float64, ColorInterp=Gray
NoData Value=999999986991104
Overviews: 288x181
You can use --forward-band-tags to forward the band metadata to the output dataset.
$ rio cogeo create my_file.tif my_cog.tif --blocksize 256 --forward-band-tags
$ gdalinfo my_cog.tif
...
Band 1 Block=256x256 Type=Float64, ColorInterp=Gray
NoData Value=999999986991104
Overviews: 288x181
Metadata:
long_name=CO2 Dry-Air Column Average
missing_value=9.9999999e+14
NETCDF_DIM_time=0
NETCDF_VARNAME=XCO2MEAN
units=mol mol-1
_FillValue=9.9999999e+14
Nodata, Alpha and Mask
By default rio-cogeo will forward any nodata value or alpha channel to the output COG.
If your dataset type is Byte or UInt16, you can use an internal bit mask (with the --add-mask option) to replace the nodata value or alpha band in the output dataset (supported by most GDAL-based backends).
Note: when adding a mask to an input dataset that has an alpha band, you'll need to use the --bidx option to remove the alpha band from the output dataset.
# Replace the alpha band by an internal mask
$ rio cogeo create mydataset_withalpha.tif mydataset_withmask.tif --cog-profile raw --add-mask --bidx 1,2,3
Important
Using an internal nodata value with lossy compression (webp, jpeg) is not recommended. Please use internal masking (or an alpha band if using webp).
GDAL Version
It is recommended to use GDAL > 2.3.2. Earlier versions might not be able to create proper COGs (ref: https://github.com/OSGeo/gdal/issues/754).
More info in https://github.com/cogeotiff/rio-cogeo/issues/55
Contribution & Development
The rio-cogeo project was started at Mapbox and was transferred to the CogeoTIFF organization in January 2019.
Issues and pull requests are more than welcome.
dev install
$ git clone https://github.com/cogeotiff/rio-cogeo.git
$ cd rio-cogeo
$ pip install -e .[dev]
Python >=3.7 only
This repo is set up to use pre-commit to run isort, flake8, pydocstyle, black ("uncompromising Python code formatter") and mypy when committing new code.
$ pre-commit install
$ git add .
$ git commit -m'my change'
isort....................................................................Passed
black....................................................................Passed
Flake8...................................................................Passed
Verifying PEP257 Compliance..............................................Passed
mypy.....................................................................Passed
Extras
Blog post on good and bad COG formats: https://medium.com/@_VincentS_/do-you-really-want-people-using-your-data-ec94cd94dc3f
Check out the rio-glui or rio-viz Rasterio plugins to explore COGs locally in your web browser.