TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which removes the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
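To make the idea of automatic detection concrete, here is a minimal standard-library sketch. It is not how tensorflow-io is implemented internally; it only illustrates one common way to tell gzip data from plain data (the two-byte gzip magic number) and to read the IDX header that the MNIST files use. The helper names maybe_gunzip and parse_idx_header are hypothetical and exist only for this illustration.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def maybe_gunzip(raw: bytes) -> bytes:
    """Return decompressed bytes if `raw` looks like gzip data, else `raw` unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def parse_idx_header(data: bytes):
    """Parse an IDX header and return (dtype_code, shape)."""
    # Bytes 0-1 are zero, byte 2 is the element type code, byte 3 the number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension size is a big-endian unsigned 32-bit integer.
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, shape


# Build a tiny in-memory IDX payload: 2 images of 3x3 uint8 pixels (type code 0x08).
payload = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 3, 3) + bytes(18)
compressed = gzip.compress(payload)

# The compressed and the plain bytes resolve to the same header.
header_from_gz = parse_idx_header(maybe_gunzip(compressed))
header_from_plain = parse_idx_header(maybe_gunzip(payload))
print(header_from_gz, header_from_plain)
```

Checking the content rather than the file extension means the same call path can accept either form of the file, which is the behavior the note describes from the user's point of view.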

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version   TensorFlow Compatibility   Release Date
0.18.0                   2.5.x                      May 13, 2021
0.17.1                   2.4.x                      Apr 16, 2021
0.17.0                   2.4.x                      Dec 14, 2020
0.16.0                   2.3.x                      Oct 23, 2020
0.15.0                   2.3.x                      Aug 03, 2020
0.14.0                   2.2.x                      Jul 08, 2020
0.13.0                   2.2.x                      May 10, 2020
0.12.0                   2.1.x                      Feb 28, 2020
0.11.0                   2.1.x                      Jan 10, 2020
0.10.0                   2.0.x                      Dec 05, 2019
0.9.1                    2.0.x                      Nov 15, 2019
0.9.0                    2.0.x                      Oct 18, 2019
0.8.1                    1.15.x                     Nov 15, 2019
0.8.0                    1.15.x                     Oct 17, 2019
0.7.2                    1.14.x                     Nov 15, 2019
0.7.1                    1.14.x                     Oct 18, 2019
0.7.0                    1.14.x                     Jul 14, 2019
0.6.0                    1.13.x                     May 29, 2019
0.5.0                    1.13.x                     Apr 12, 2019
0.4.0                    1.13.x                     Mar 01, 2019
0.3.0                    1.12.0                     Feb 15, 2019
0.2.0                    1.12.0                     Jan 29, 2019
0.1.0                    1.12.0                     Dec 16, 2018
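As a rough illustration of how the table above might be used in a setup script, the sketch below maps a TensorFlow version string to the highest-numbered tensorflow-io release listed for that series. The dictionary is copied from the table; the function names matching_tfio and pip_requirement are hypothetical helpers, not part of tensorflow-io.

```python
# TensorFlow series -> highest-numbered tensorflow-io release listed in the table.
TFIO_FOR_TF = {
    "2.5": "0.18.0",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def matching_tfio(tf_version: str):
    """Return the tensorflow-io version for a TensorFlow version, or None if unlisted."""
    # Keep only "major.minor", e.g. "2.4.1" -> "2.4".
    series = ".".join(tf_version.split(".")[:2])
    return TFIO_FOR_TF.get(series)


def pip_requirement(tf_version: str):
    """Build a pinned requirement string such as 'tensorflow-io==0.17.1'."""
    tfio_version = matching_tfio(tf_version)
    if tfio_version is None:
        return None
    return f"tensorflow-io=={tfio_version}"


print(pip_requirement("2.4.1"))
```

In practice the argument would typically come from tf.__version__, and the resulting string can be passed to pip install. Versions released after this table was written are not covered and return None.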

Performance Benchmarking

We use GitHub Pages to publish the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

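#!/usr/bin/env bash

# List the wheels found in dist/, then run auditwheel repair on each one inside the manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the files written by the container back to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*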

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
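To see which Python version a build would target, it can help to check the tags that the interpreter in the shell would stamp into a wheel filename. The sketch below is an assumption-laden illustration for CPython 3 only: the helper cpython_tags is hypothetical, and it hard-codes the convention that CPython 3.7 and earlier carry an "m" suffix in the ABI tag while 3.8 and later do not.

```python
import sys


def cpython_tags(major: int, minor: int):
    """Return (python_tag, abi_tag) as they appear in CPython 3 wheel filenames."""
    python_tag = f"cp{major}{minor}"
    # CPython 3.7 and earlier append an "m" ABI flag; CPython 3.8 dropped it.
    abi_tag = python_tag + ("m" if (major, minor) < (3, 8) else "")
    return python_tag, abi_tag


# Tags for whichever interpreter runs this script, i.e. the one named python in the shell.
current = cpython_tags(sys.version_info.major, sys.version_info.minor)
print("wheel tags for this interpreter:", "-".join(current))
```

Running this with the interpreter that python resolves to shows which wheel the script would produce, for example cp37-cp37m for Python 3.7 or cp39-cp39 for Python 3.9.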

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python   Ubuntu 18.04   Ubuntu 20.04   macOS + osx9   Windows-2019
2.7      ✓              ✓              ✓              N/A
3.7      ✓              ✓              ✓              ✓
3.8      ✓              ✓              ✓              ✓

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

System                         Live System   Emulator   CI Integration   Offline
Apache Kafka                   ✓             -          ✓                -
Apache Ignite                  ✓             -          ✓                -
Prometheus                     ✓             -          ✓                -
Google PubSub                  -             ✓          ✓                -
Azure Storage                  -             ✓          ✓                -
AWS Kinesis                    -             ✓          ✓                -
Alibaba Cloud OSS              -             -          -                ✓
Google BigTable/BigQuery       to be added
Elasticsearch (experimental)   ✓             -          ✓                -
MongoDB (experimental)         ✓             -          ✓                -

References for emulators:

Community

Additional Information

License

Apache License 2.0
