TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "http://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP file system, so there is no need to download the datasets and save them to a local directory first.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
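
To illustrate what automatic detection of a compressed file can look like, here is a minimal pure-Python sketch. It is not the tensorflow-io implementation; the helper names maybe_decompress and read_idx_shape are hypothetical, and the sketch only assumes the standard gzip magic bytes and the IDX header layout used by the MNIST files.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def maybe_decompress(data: bytes) -> bytes:
    """Return the raw payload, decompressing first if the data is gzip-compressed."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_shape(data: bytes) -> tuple:
    """Parse the IDX header used by the MNIST files and return the array shape."""
    raw = maybe_decompress(data)
    # Header: two zero bytes, a dtype code, and the number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension size is a big-endian unsigned 32-bit integer.
    return struct.unpack(">" + "I" * ndim, raw[4 : 4 + 4 * ndim])


# Synthetic "image" file: 2 images of 28x28 uint8 pixels (dtype code 0x08).
header = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

plain_shape = read_idx_shape(payload)
gzipped_shape = read_idx_shape(gzip.compress(payload))
print(plain_shape, gzipped_shape)
```

The same function accepts both the plain and the gzip-compressed bytes, which is the behavior the note describes from the caller's point of view.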

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
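
As a rough illustration of how the compatibility table can be used, the sketch below maps a TensorFlow version string to the newest TensorFlow I/O release listed for that series and formats a pip requirement. The dictionary is copied from the table above; the function names matching_tfio_version and pip_requirement are hypothetical helpers, not part of tensorflow-io.

```python
# Newest TensorFlow I/O release listed for each TensorFlow series (from the table above).
LATEST_TFIO_FOR_TF = {
    "2.4": "0.17.0",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def matching_tfio_version(tf_version: str) -> str:
    """Return the newest listed tensorflow-io release for a TensorFlow version."""
    # Compare on the "major.minor" series only, e.g. "2.3.1" -> "2.3".
    major_minor = ".".join(tf_version.split(".")[:2])
    try:
        return LATEST_TFIO_FOR_TF[major_minor]
    except KeyError:
        raise ValueError(
            f"no tensorflow-io release listed for TensorFlow {tf_version}"
        ) from None


def pip_requirement(tf_version: str) -> str:
    """Format a pinned pip requirement for the matching tensorflow-io release."""
    return f"tensorflow-io=={matching_tfio_version(tf_version)}"


print(pip_requirement("2.3.1"))
```

The printed string can be passed to pip install to pin a release that matches the installed TensorFlow series.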

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

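#!/usr/bin/env bash

# List the built wheels, then repair each one inside the manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container are owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*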

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
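
To check which Python versions the resulting packages target, the tags can be read from the wheel filenames. The sketch below is only an illustration: wheel_tags is a hypothetical helper, the example filename is made up for demonstration, and the parsing assumes the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag).

```python
def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # A wheel name has 5 fields, or 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# Illustrative filename only; the wheels in wheelhouse/ carry their own versions.
example = "tensorflow_io-0.17.0-cp37-cp37m-manylinux2010_x86_64.whl"
tags = wheel_tags(example)
print(tags["python"], tags["platform"])
```

Running the helper over every file in the wheelhouse directory gives a quick overview of the Python versions and platforms that were built.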

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
