TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
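
To make the note above more concrete, here is a minimal standard-library sketch of how gzip input can be detected from its leading magic bytes and how an IDX header (the container format of the MNIST files) can be read afterwards. It is only an illustration and not how tensorflow-io implements this internally; the helper names open_maybe_gzip and read_idx_header are hypothetical, and the payload is a synthetic stand-in rather than real MNIST data.

```python
import gzip
import io
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def open_maybe_gzip(raw: bytes) -> io.BufferedIOBase:
    """Return a binary stream, decompressing transparently if raw is gzip data."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.GzipFile(fileobj=io.BytesIO(raw))
    return io.BytesIO(raw)


def read_idx_header(stream):
    """Parse an IDX header and return (dtype_code, shape).

    Layout: two zero bytes, one dtype code byte, one byte holding the number
    of dimensions, then one big-endian uint32 per dimension.
    """
    zero, dtype_code, ndim = struct.unpack(">HBB", stream.read(4))
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, stream.read(4 * ndim))
    return dtype_code, shape


# Synthetic stand-in for an images file: 2 blank 28x28 uint8 images (dtype 0x08).
payload = (
    struct.pack(">HBB", 0, 0x08, 3)
    + struct.pack(">III", 2, 28, 28)
    + bytes(2 * 28 * 28)
)

plain_header = read_idx_header(open_maybe_gzip(payload))
gzip_header = read_idx_header(open_maybe_gzip(gzip.compress(payload)))
```

Both calls yield the same header, which is the behavior the note describes: the caller does not need to know whether the file was compressed.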

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
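
As a rough illustration of how the table above can be used, the following sketch maps an installed TensorFlow version to a pinned tensorflow-io requirement. The dictionary lists the newest TensorFlow I/O release in the table for each TensorFlow series; the names NEWEST_TFIO_FOR_TF and pip_requirement are hypothetical helpers and not part of the package.

```python
# Newest TensorFlow I/O release listed in the table for each TensorFlow series.
NEWEST_TFIO_FOR_TF = {
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def pip_requirement(tf_version: str) -> str:
    """Return a pinned pip requirement for the given TensorFlow version string."""
    # Only the major.minor series matters for the lookup, e.g. "2.4.1" -> "2.4".
    series = ".".join(tf_version.split(".")[:2])
    try:
        tfio_version = NEWEST_TFIO_FOR_TF[series]
    except KeyError:
        raise ValueError(
            f"no TensorFlow I/O release listed for TensorFlow {tf_version}"
        ) from None
    return f"tensorflow-io=={tfio_version}"


requirement = pip_requirement("2.4.1")
```

The resulting string can be passed to pip install. Versions newer than those in the table raise a ValueError instead of guessing.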

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and makes it possible to track performance with respect to individual commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

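#!/usr/bin/env bash

# List the wheels that were built.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files created by the container are owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*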
It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the command we use when releasing packages for Linux and macOS.
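
Since a build only produces a wheel for the Python found in the shell, it can help to check which interpreter a given wheel targets. The sketch below reads the tags from a wheel filename using the standard wheel naming scheme (name-version[-build]-python-abi-platform) and compares the Python tag with an interpreter version. It assumes CPython wheels with a single Python tag; parse_wheel_tags, interpreter_tag, and matches_interpreter are hypothetical helper names.

```python
import sys


def parse_wheel_tags(filename: str):
    """Split a wheel filename into (python tag, abi tag, platform tag)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five fields, or six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The last three fields are always the python, abi, and platform tags.
    return tuple(parts[-3:])


def interpreter_tag(version_info=sys.version_info) -> str:
    """Return the CPython tag, e.g. 'cp38' for Python 3.8."""
    return f"cp{version_info[0]}{version_info[1]}"


def matches_interpreter(filename: str, version_info=sys.version_info) -> bool:
    """Check whether the wheel's Python tag matches the given interpreter version."""
    python_tag, _, _ = parse_wheel_tags(filename)
    return python_tag == interpreter_tag(version_info)


wheel = "tensorflow_io_nightly-0.18.0.dev20210511113714-cp38-cp38-macosx_10_14_x86_64.whl"
tags = parse_wheel_tags(wheel)
```

Calling matches_interpreter(wheel) without a second argument compares the wheel against the interpreter running the script, which is the same Python the build script would pick up from the shell.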

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python 3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✔ | ✔ | ✔ | N/A |
| 3.7 | ✔ | ✔ | ✔ | ✔ |
| 3.8 | ✔ | ✔ | ✔ | ✔ |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We do our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

| System | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✔ | | ✔ | |
| Apache Ignite | ✔ | | ✔ | |
| Prometheus | ✔ | | ✔ | |
| Google PubSub | | ✔ | ✔ | |
| Azure Storage | | ✔ | ✔ | |
| AWS Kinesis | | ✔ | ✔ | |
| Alibaba Cloud OSS | | | | ✔ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✔ | | ✔ | |
| MongoDB (experimental) | ✔ | | ✔ | |

References for emulators:

Community

Additional Information

License

Apache License 2.0


