
TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. The example below follows the Get Started with TensorFlow tutorial, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
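
As a rough illustration of the NOTE above, the sketch below shows one common way to detect gzip input (checking the two-byte gzip magic number) before reading the header of an IDX file, the container format used by the MNIST files. It uses only the Python standard library. The function names maybe_gunzip and read_idx_header are hypothetical, and this is not necessarily how tensorflow-io implements its detection.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if `data` looks like gzip, else `data` unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse the header of an IDX file and return (dtype_code, shape).

    The header is two zero bytes, one byte for the element type, one byte
    for the number of dimensions, then one big-endian uint32 per dimension.
    """
    raw = maybe_gunzip(data)
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0:
        raise ValueError("not an IDX file: the first two bytes must be zero")
    shape = struct.unpack(">" + "I" * ndim, raw[4 : 4 + 4 * ndim])
    return dtype_code, shape


# Made-up sample: the header of a 2 x 28 x 28 uint8 array (type code 0x08),
# gzip-compressed to mimic a ".gz" download.
sample = gzip.compress(struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28))
print(read_idx_header(sample))
```

The same read_idx_header call accepts both compressed and uncompressed bytes, which is the behavior the NOTE describes from the caller's point of view.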

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
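
After installing, it can be useful to confirm which of the packages are present in the current environment. The following is a minimal sketch using the standard library's importlib.metadata (Python 3.8 or later); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the stable and nightly packages side by side.
for name in ("tensorflow", "tensorflow-io", "tensorflow-io-nightly"):
    print(name, installed_version(name))
```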

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
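
To pick a matching release programmatically, the table above can be reduced to a small lookup. The sketch below keeps only the newest TensorFlow I/O version listed for each TensorFlow series. The names LATEST_TFIO_FOR_TF and recommended_tfio are hypothetical, and the mapping only reflects the rows shown in this document.

```python
# Newest TensorFlow I/O version listed in the table for each TensorFlow series.
LATEST_TFIO_FOR_TF = {
    "2.5": "0.18.0",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def recommended_tfio(tf_version: str):
    """Return the TensorFlow I/O version for a TensorFlow version, or None if unlisted."""
    series = ".".join(tf_version.split(".")[:2])  # "2.5.1" -> "2.5"
    return LATEST_TFIO_FOR_TF.get(series)


print(recommended_tfio("2.4.1"))
```

The result could then be used to pin the install, for example as tensorflow-io==0.17.1 for TensorFlow 2.4.x.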

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

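#!/usr/bin/env bash

# List the wheels that are about to be repaired.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the files created by the container back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*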

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
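
To check which Python versions and platforms a given whl file targets, its filename can be split into the standard wheel tags. The sketch below assumes the usual wheel naming convention ({distribution}-{version}[-{build}]-{python tag}-{abi tag}-{platform tag}.whl). The function name wheel_tags is hypothetical, and the sample filename is taken from the file list later in this document.

```python
from pathlib import Path


def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its distribution, version, and compatibility tags."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = name[: -len(".whl")].split("-")
    # Five parts normally, six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


example = (
    "tensorflow_io_nightly-0.18.0.dev20210517130651-cp39-cp39-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
print(wheel_tags(example))
```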

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
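
Before building on macOS, it may help to confirm which interpreter the shell's python resolves to. The sketch below prints a cpXY-style tag for the running interpreter, which is the Python tag a wheel built with it would be expected to carry. The helper name interpreter_tag is hypothetical, and only the CPython and PyPy prefixes are mapped.

```python
import sys


def interpreter_tag() -> str:
    """Return a tag such as 'cp38' for the interpreter running this script."""
    prefixes = {"cpython": "cp", "pypy": "pp"}
    name = sys.implementation.name
    # Fall back to the raw implementation name for interpreters not mapped above.
    prefix = prefixes.get(name, name)
    return f"{prefix}{sys.version_info.major}{sys.version_info.minor}"


print(interpreter_tag())
```

Running it with python from the same shell as the build script shows the version the resulting whl package will target.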

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python 3 versions to ensure good coverage:

Python Ubuntu 18.04 Ubuntu 20.04 macOS + osx9 Windows-2019
2.7 :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: N/A
3.7 :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark:
3.8 :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark:

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

tensorflow_io_nightly-0.18.0.dev20210517130651-cp39-cp39-win_amd64.whl (21.0 MB view details)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210517130651-cp39-cp39-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_nightly-0.18.0.dev20210517130651-cp38-cp38-win_amd64.whl (21.0 MB view details)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210517130651-cp38-cp38-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_nightly-0.18.0.dev20210517130651-cp37-cp37m-win_amd64.whl (21.0 MB view details)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210517130651-cp37-cp37m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.7m macOS 10.14+ x86-64

tensorflow_io_nightly-0.18.0.dev20210517130651-cp36-cp36m-win_amd64.whl (21.0 MB view details)

Uploaded CPython 3.6m Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210517130651-cp36-cp36m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.6m macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 31e7768d5cffbc81d5cb44cb83001a146d939e8eb5fd2a469107c73e96b94449
MD5 f6f2d8897cb6eb9ebeed4e2131dc70b3
BLAKE2b-256 587ccb590cc6aca98f555b2291d18a3e726684fabac1ca5f3058800a081b6ce7

See more details on using hashes here.
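
A downloaded file can be checked against the SHA256 digest listed for it. The following is a minimal sketch using Python's hashlib. The helper names sha256_of and sha256_matches are hypothetical, and the demonstration uses a throwaway temporary file rather than a real whl package.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sha256_matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a temporary file with known contents.
with tempfile.TemporaryDirectory() as workdir:
    demo_path = os.path.join(workdir, "demo.bin")
    with open(demo_path, "wb") as handle:
        handle.write(b"hello")
    demo_ok = sha256_matches(demo_path, hashlib.sha256(b"hello").hexdigest())
    demo_bad = sha256_matches(demo_path, "0" * 64)

print(demo_ok, demo_bad)
```

For a real download, the expected_hex argument would be the SHA256 value copied from the corresponding "Hashes for" entry.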

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 27d941ead9c11215cbd679b76527c37875abecc29dc8dbfa55da0a9707ea680e
MD5 54037e3ee8c9140b0e2c5da7c269887d
BLAKE2b-256 539cd642da576abdb7ae603c59a4a4c76280721da0c870fc4ce8d5dc45916372


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 66cde9da0d92adbab644c13b11b566d6179b1d258aa209f0a7fcdc937678a98f
MD5 808084378cbbae69f8954b09c966d4e0
BLAKE2b-256 284eb94bab50f31b8d9961a4035846a22a4313f202a1db89f93f81012725ed07


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 dac6cb04629f897983071b9dd9fd39ae72eb624d8391a9f590ef8e8aa3f03d94
MD5 c22a8ff9723580133ee7aea9eeba7130
BLAKE2b-256 19759431b20ce4f9ab0efc4346cacdaf23ba171c964331e4378b0b4e724aeeda


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 c8d15457d60a4751cb20da35a7f67140feed0f79c05c117e76b775177d910756
MD5 a5ec14cc45466018c86dea7a7aa8c252
BLAKE2b-256 420069804c0d4bfc52feef7a01b89a21c137b452e011105f77869b60510f1754


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 98faf1274929b6b2e0bc5d3ab011b77257cd131adfff8da16aadfcc8b4ac37e9
MD5 4e6971bb952f9426b10dc56bded7e67c
BLAKE2b-256 ccc21958dc9a3252b981b88eb2ae567efc028771dcf24d4ec5e72cc6be555848


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp37-cp37m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 2e63db18f9759143b3791dd1ed28373ba78d937eb423cf0e0de7e485e268fc5e
MD5 6c141594e991c385c1eb3e08e0f85df4
BLAKE2b-256 f427c156b06f47008cab842ac14ff964616c2e7f76264ad066f97b245199e27e


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 06e0716fb0d80859b16bf1fd0855837bb3040d52ea916fa59cc54785df8443f9
MD5 a9763e05c6116f55f4cc0d1f6ab5c95f
BLAKE2b-256 55b1904b0540c0737232ec54db8cce0be37aac306f40f3582e4f9041f7347744


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 2969605882078416d6e7774706d057703f7a7c06fd0ee264ad25a3ec42fc930f
MD5 4c6da2c68216150326e110b1c6fae084
BLAKE2b-256 0666d6e7ecb7ed05ca495eec3845881e9ec37c8e1146f741254c4596ff8ec82b


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp36-cp36m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 bc743cd553aed4d35a4753e1ef59474b209904226ef40badbc97036749d09022
MD5 4976a36dc532fcc9e08532afb210a4be
BLAKE2b-256 ac16f7848aa1df6970cff5e052f348aab1d9da2576c941f9b3a9f6f3f29e9d84


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 9791b3efd482bd5315b45a88cd62299075d34acd67e739a5fabe4f9bd994ca47
MD5 3e76ab946a985561143e858a03e395c5
BLAKE2b-256 b67968019a42196319c06317c699bb08d417aadce0472d72db4673e1fdffa72c


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210517130651-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210517130651-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 6f14415c364eb540451319c018fcf05a8975afb3a1469e1d95d7c6ef7ce7fabd
MD5 4282af7acf814b724ad06b203db4dcde
BLAKE2b-256 01264cc5bb5c457c121ca34a1c804ba3d2f0cf647116cab423bdbbeff1ebf3a9

