TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which removes the need to download the datasets and save them to a local directory first.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
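
To make the note above more concrete, here is a minimal standard-library sketch of what "detect and uncompress if needed" can look like for an MNIST-style IDX file. It is not tensorflow-io's implementation; the helper names `maybe_gunzip` and `parse_idx_header` are hypothetical, and the tiny in-memory "image file" is synthetic so that the example runs without any download.

```python
import gzip
import struct

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(payload: bytes) -> bytes:
    """Decompress the payload if it looks like gzip, otherwise return it unchanged."""
    if payload[:2] == GZIP_MAGIC:
        return gzip.decompress(payload)
    return payload


def parse_idx_header(payload: bytes):
    """Parse an IDX header and return (dtype_code, shape, data_offset).

    The header is: two zero bytes, one data type byte (0x08 means uint8),
    one byte with the number of dimensions, then one big-endian uint32
    per dimension.
    """
    zero, dtype_code, ndim = struct.unpack(">HBB", payload[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, payload[4 : 4 + 4 * ndim])
    return dtype_code, shape, 4 + 4 * ndim


# Synthetic stand-in for an images file: 2 images of 3x3 uint8 pixels.
raw = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 3, 3) + bytes(range(18))
compressed = gzip.compress(raw)

# The same call path handles both the compressed and the plain payload.
dtype_code, shape, offset = parse_idx_header(maybe_gunzip(compressed))
print(dtype_code, shape, offset)
```

The check on the first two bytes is what lets a reader accept either form of the file without the caller saying which one it is.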

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
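
After installing, one way to confirm which versions actually ended up in the environment is to query the package metadata from Python. This is a small sketch using only the standard library (Python 3.8 or later); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions that matter for compatibility.
for name in ("tensorflow", "tensorflow-io"):
    print(name, installed_version(name) or "not installed")
```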

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version    TensorFlow Compatibility    Release Date
0.25.0                    2.8.x                       Apr 19, 2022
0.24.0                    2.8.x                       Feb 04, 2022
0.23.1                    2.7.x                       Dec 15, 2021
0.23.0                    2.7.x                       Dec 14, 2021
0.22.0                    2.7.x                       Nov 10, 2021
0.21.0                    2.6.x                       Sep 12, 2021
0.20.0                    2.6.x                       Aug 11, 2021
0.19.1                    2.5.x                       Jul 25, 2021
0.19.0                    2.5.x                       Jun 25, 2021
0.18.0                    2.5.x                       May 13, 2021
0.17.1                    2.4.x                       Apr 16, 2021
0.17.0                    2.4.x                       Dec 14, 2020
0.16.0                    2.3.x                       Oct 23, 2020
0.15.0                    2.3.x                       Aug 03, 2020
0.14.0                    2.2.x                       Jul 08, 2020
0.13.0                    2.2.x                       May 10, 2020
0.12.0                    2.1.x                       Feb 28, 2020
0.11.0                    2.1.x                       Jan 10, 2020
0.10.0                    2.0.x                       Dec 05, 2019
0.9.1                     2.0.x                       Nov 15, 2019
0.9.0                     2.0.x                       Oct 18, 2019
0.8.1                     1.15.x                      Nov 15, 2019
0.8.0                     1.15.x                      Oct 17, 2019
0.7.2                     1.14.x                      Nov 15, 2019
0.7.1                     1.14.x                      Oct 18, 2019
0.7.0                     1.14.x                      Jul 14, 2019
0.6.0                     1.13.x                      May 29, 2019
0.5.0                     1.13.x                      Apr 12, 2019
0.4.0                     1.13.x                      Mar 01, 2019
0.3.0                     1.12.0                      Feb 15, 2019
0.2.0                     1.12.0                      Jan 29, 2019
0.1.0                     1.12.0                      Dec 16, 2018
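
The table above can also be used programmatically, for example to pin a matching release in a setup script. The sketch below copies the newest TensorFlow I/O version for each TensorFlow series from the table; the names `LATEST_TFIO_FOR_TF`, `matching_tfio_version`, and `pip_requirement` are hypothetical helpers, not part of tensorflow-io.

```python
# Newest tensorflow-io release per TensorFlow minor series, copied from the table above.
LATEST_TFIO_FOR_TF = {
    "2.8": "0.25.0",
    "2.7": "0.23.1",
    "2.6": "0.21.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def matching_tfio_version(tf_version: str) -> str:
    """Map a TensorFlow version such as '2.6.2' to a tensorflow-io version."""
    major_minor = ".".join(tf_version.split(".")[:2])
    if major_minor not in LATEST_TFIO_FOR_TF:
        raise ValueError(f"no tensorflow-io release listed for TensorFlow {tf_version}")
    return LATEST_TFIO_FOR_TF[major_minor]


def pip_requirement(tf_version: str) -> str:
    """Build a pinned pip requirement string for the given TensorFlow version."""
    return f"tensorflow-io=={matching_tfio_version(tf_version)}"


print(pip_requirement("2.6.2"))
```

Versions that do not appear in the table raise an error rather than falling back to a guess, since an unlisted pairing has no stated compatibility.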

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance with respect to individual commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following script automatically builds a manylinux2010-compatible whl package:

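#!/usr/bin/env bash

# List the wheels that were built into dist/.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files created inside the container may be owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
# List the repaired wheels in wheelhouse/.
ls wheelhouse/*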

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
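
To check which Python version and platform a built whl package targets, the file name itself can be read, since wheel file names follow the pattern name-version(-build)-python-abi-platform.whl. The sketch below is a simplified parser under that assumption; `parse_wheel_filename` is a hypothetical helper, and the example file name is taken from a tensorflow-io-nightly release.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
    elif len(parts) == 6:
        # The optional third field is a build tag, which is ignored here.
        name, version, _build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        # A wheel may declare several platform tags separated by dots.
        "platforms": platform_tag.split("."),
    }


example = (
    "tensorflow_io_nightly-0.25.0.dev20220512204216-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
info = parse_wheel_filename(example)
print(info["python"], info["platforms"])
```

A wheel whose platform tags include manylinux2010_x86_64 is one that the repair step has marked as manylinux2010 compatible.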

On macOS, the same command can be used. However, the script expects python to be available in the shell and only generates a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python    Ubuntu 18.04    Ubuntu 20.04    macOS + osx9    Windows-2019
2.7       ✓               ✓               ✓               N/A
3.7       ✓               ✓               ✓               ✓
3.8       ✓               ✓               ✓               ✓

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka ✓ ✓
Apache Ignite ✓ ✓
Prometheus ✓ ✓
Google PubSub ✓ ✓
Azure Storage ✓ ✓
AWS Kinesis ✓ ✓
Alibaba Cloud OSS ✓
Google BigTable/BigQuery to be added
Elasticsearch (experimental) ✓ ✓
MongoDB (experimental) ✓ ✓
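
The difference between a live-system test and an offline test can be pictured as a reachability check that decides whether a test may talk to a running service. The following is a minimal sketch of that idea, not the project's CI code; the helper name `service_available` is hypothetical, and a Kafka broker on localhost at its default port 9092 is an assumption for illustration.

```python
import socket


def service_available(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Decide at run time whether the live tests can run (assumed Kafka endpoint).
if service_available("localhost", 9092):
    print("Kafka broker reachable: run live tests")
else:
    print("Kafka broker not reachable: skip live tests")
```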

References for emulators:

Community

Additional Information

License

Apache License 2.0
