TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "http://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP file system, which eliminates the need to download and save the datasets to a local directory.

NOTE: Since tensorflow-io detects and decompresses the MNIST dataset automatically when needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
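To make the note above concrete, here is a minimal standard-library sketch of how gzip detection and IDX header parsing can work in principle. It is not tensorflow-io's implementation: the helper names `maybe_decompress` and `read_idx_shape` are hypothetical, and the sample bytes are built in memory rather than downloaded.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(raw: bytes) -> bytes:
    """Return raw unchanged unless it starts with the gzip magic bytes."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_shape(raw: bytes) -> tuple:
    """Parse the IDX header used by the MNIST files and return the shape."""
    data = maybe_decompress(raw)
    # Header: two zero bytes, one type code, one dimension count.
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension is a big-endian unsigned 32-bit integer.
    return struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])


# Hypothetical stand-in for an images file: 2 images of 28x28 unsigned bytes.
header = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

print(read_idx_shape(payload))                 # plain file
print(read_idx_shape(gzip.compress(payload)))  # gzip-compressed file
```

Both calls report the same shape, which is why the caller does not need to know in advance whether a file is compressed.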

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
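After installing, it can be useful to confirm which of the two packages is present in the current environment. The following is a small sketch using only the Python standard library (Python 3.8+); the helper name `installed_version` is an assumption made for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Check the stable and nightly distributions named in this section.
for name in ("tensorflow-io", "tensorflow-io-nightly"):
    print(name, "->", installed_version(name) or "not installed")
```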

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
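The table above can also be consulted programmatically. The sketch below hard-codes the newest listed TensorFlow I/O version for each TensorFlow release series; the names `COMPATIBLE_TFIO` and `recommended_tfio` are hypothetical and not part of the package.

```python
from typing import Optional

# TensorFlow release series -> newest TensorFlow I/O version listed for it.
# Note: the table lists 1.12.0 exactly, not the whole 1.12.x series.
COMPATIBLE_TFIO = {
    "2.4": "0.17.0",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def recommended_tfio(tf_version: str) -> Optional[str]:
    """Return the listed tensorflow-io version for a TensorFlow version."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return COMPATIBLE_TFIO.get(major_minor)


tf_version = "2.3.1"
tfio_version = recommended_tfio(tf_version)
if tfio_version is None:
    print(f"TensorFlow {tf_version} is not covered by the table")
else:
    print(f"pip install tensorflow=={tf_version} tensorflow-io=={tfio_version}")
```

Returning None for an unlisted series, rather than guessing the nearest match, keeps the helper from suggesting a pairing the table does not document.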

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

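#!/usr/bin/env bash

# List the wheels found in dist/.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# The container runs as root, so hand the files back to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*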

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
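Since the generated package depends on which python the shell resolves to, it can help to check the interpreter tag before building. The sketch below is an illustration only: it assumes a CPython interpreter and the standard wheel file naming scheme, and the helper names and the sample file name are hypothetical.

```python
import sys


def cpython_tag(version_info=sys.version_info) -> str:
    """Return the CPython wheel tag for a version, e.g. (3, 8) -> 'cp38'."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_matches_interpreter(wheel_name: str, tag: str) -> bool:
    """Check the python tag embedded in a wheel file name."""
    # Wheel names end with -{python tag}-{abi tag}-{platform tag}.whl
    parts = wheel_name[: -len(".whl")].split("-")
    return parts[-3] == tag


# Illustrative file name following the wheel naming scheme.
wheel = "tensorflow_io_nightly-0.18.0.dev20210402133114-cp38-cp38-macosx_10_14_x86_64.whl"

print("interpreter tag:", cpython_tag())
print("wheel matches:", wheel_matches_interpreter(wheel, cpython_tag()))
```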

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test. Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✔ | ✔ | ✔ | N/A |
| 3.7 | ✔ | ✔ | ✔ | ✔ |
| 3.8 | ✔ | ✔ | ✔ | ✔ |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✔ | | ✔ | |
| Apache Ignite | ✔ | | ✔ | |
| Prometheus | ✔ | | ✔ | |
| Google PubSub | | ✔ | ✔ | |
| Azure Storage | | ✔ | ✔ | |
| AWS Kinesis | | ✔ | ✔ | |
| Alibaba Cloud OSS | | | | ✔ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✔ | | ✔ | |
| MongoDB (experimental) | ✔ | | ✔ | |

References for emulators:

Community

Additional Information

License

Apache License 2.0

