TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. The example below follows the Get Started with TensorFlow tutorial, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
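
To make the note above more concrete, here is a minimal sketch of how gzip detection and MNIST (IDX) header parsing can be done with the Python standard library. It is an illustration only: the source does not say how tensorflow-io implements detection, and the helper names `maybe_decompress` and `read_idx_header` are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def maybe_decompress(raw: bytes) -> bytes:
    """Return decompressed bytes if `raw` looks like gzip, else `raw` unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_header(data: bytes):
    """Parse the IDX header used by the MNIST files.

    The header is two zero bytes, a one-byte type code (0x08 means
    unsigned byte), a one-byte dimension count, and then one big-endian
    uint32 per dimension. Returns (type_code, shape).
    """
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return type_code, shape


# Build a tiny in-memory IDX "image" file: 2 images of 3x3 uint8 pixels.
payload = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 3, 3) + bytes(18)
compressed = gzip.compress(payload)

# The same header is recovered whether or not the input was compressed.
plain_header = read_idx_header(maybe_decompress(payload))
gzip_header = read_idx_header(maybe_decompress(compressed))
```

Checking the magic bytes rather than the `.gz` file extension is one possible design choice; it lets the same code path handle both compressed and uncompressed inputs.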

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow I/O, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
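
After installing, it can be useful to confirm which versions actually ended up in the environment. The following is a small sketch that uses only the standard library's `importlib.metadata`; the helper names `installed_version` and `report` are illustrative and not part of tensorflow-io.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report(packages=("tensorflow", "tensorflow-io", "tensorflow-io-nightly")):
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    for name, ver in report().items():
        print(f"{name}: {ver or 'not installed'}")
```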

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
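
The compatibility table can also be checked programmatically. The sketch below copies the table into a dictionary keyed by the TensorFlow I/O minor series and tests whether a given TensorFlow version matches. The names `COMPAT`, `required_tensorflow`, and `is_compatible` are hypothetical helpers, not part of the package, and the matching rule (a trailing `.x` accepts any patch release, anything else must match exactly) is an assumption about how to read the table.

```python
# TensorFlow I/O minor series -> TensorFlow compatibility, copied from the table.
COMPAT = {
    "0.23": "2.7.x", "0.22": "2.7.x",
    "0.21": "2.6.x", "0.20": "2.6.x",
    "0.19": "2.5.x", "0.18": "2.5.x",
    "0.17": "2.4.x",
    "0.16": "2.3.x", "0.15": "2.3.x",
    "0.14": "2.2.x", "0.13": "2.2.x",
    "0.12": "2.1.x", "0.11": "2.1.x",
    "0.10": "2.0.x", "0.9": "2.0.x",
    "0.8": "1.15.x",
    "0.7": "1.14.x",
    "0.6": "1.13.x", "0.5": "1.13.x", "0.4": "1.13.x",
    "0.3": "1.12.0", "0.2": "1.12.0", "0.1": "1.12.0",
}


def required_tensorflow(tfio_version: str) -> str:
    """Look up the TensorFlow spec for a TensorFlow I/O version such as '0.23.1'."""
    major, minor = tfio_version.split(".")[:2]
    return COMPAT[f"{major}.{minor}"]


def is_compatible(tfio_version: str, tf_version: str) -> bool:
    """Check a full TensorFlow version string against the table entry."""
    spec = required_tensorflow(tfio_version)
    if spec.endswith(".x"):
        # "2.7.x" accepts any patch release that starts with "2.7."
        return tf_version.startswith(spec[:-1])
    return tf_version == spec


if __name__ == "__main__":
    print(required_tensorflow("0.23.1"))
    print(is_compatible("0.21.0", "2.6.2"))
```

A version that is missing from the table raises `KeyError`, which keeps an unknown release from being silently treated as compatible.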

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

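#!/usr/bin/env bash

# List the wheels that were built locally.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container may be owned by root; hand them back to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*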

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
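
Since each whl package targets one Python version, it can help to check a file's tags before installing it. The sketch below parses a wheel filename according to the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag) and compares the Python tag with the running interpreter. The helper names and the example filename are illustrative assumptions, not output from the build script above.

```python
import sys
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # 5 parts normally, 6 when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def matches_running_python(filename: str) -> bool:
    """True if the wheel's Python tag names the running CPython version."""
    expected = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # A Python tag may list several dot-separated interpreters.
    return expected in parse_wheel_filename(filename).python_tag.split(".")


# Hypothetical filename in the style produced for a CPython 3.7 macOS build.
tags = parse_wheel_filename("tensorflow_io-0.23.1-cp37-cp37m-macosx_10_14_x86_64.whl")
```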

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different python3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | :heavy_check_mark: | | :heavy_check_mark: | |
| Apache Ignite | :heavy_check_mark: | | :heavy_check_mark: | |
| Prometheus | :heavy_check_mark: | | :heavy_check_mark: | |
| Google PubSub | | :heavy_check_mark: | :heavy_check_mark: | |
| Azure Storage | | :heavy_check_mark: | :heavy_check_mark: | |
| AWS Kinesis | | :heavy_check_mark: | :heavy_check_mark: | |
| Alibaba Cloud OSS | | | | :heavy_check_mark: |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |
| MongoDB (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |
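
Tests that run against a live system or an emulator usually need the service to be reachable before they start. As a rough sketch of that idea (not the project's actual CI code), the hypothetical helper `wait_for_port` below polls a TCP port until it accepts connections or a timeout expires. The host and port in the usage comment are assumptions for illustration.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until `host:port` accepts a TCP connection or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example use in a test setup, assuming a broker listening on localhost:9092:
#   if not wait_for_port("localhost", 9092, timeout=60):
#       raise RuntimeError("service did not become reachable in time")
```

Returning a boolean instead of raising lets the caller decide whether an unreachable service should fail the run or merely skip the affected tests.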

References for emulators:

Community

Additional Information

License

Apache License 2.0
