TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
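To give a feel for what content-based detection can look like, here is a minimal standard-library sketch. It is not tensorflow-io's implementation: the helper name read_idx_header is hypothetical, and the sketch only assumes the usual gzip magic bytes (0x1f 0x8b) and the IDX header layout used by the MNIST files (a big-endian 32-bit magic number followed by a 32-bit item count).

```python
import gzip
import struct


def read_idx_header(raw):
    """Return (magic, item_count) from IDX bytes, gunzipping first if needed.

    Hypothetical helper for illustration; not part of tensorflow-io.
    """
    # gzip streams always start with the two magic bytes 0x1f 0x8b.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    # An IDX file starts with two big-endian 32-bit integers:
    # the magic number and the number of items.
    magic, count = struct.unpack(">II", raw[:8])
    return magic, count


# Synthetic label file: magic 2049 (0x00000801), three labels.
plain = struct.pack(">II", 2049, 3) + bytes([5, 0, 4])

print(read_idx_header(plain))
print(read_idx_header(gzip.compress(plain)))
```

Both calls report the same header, which is why a caller never has to say whether the input is compressed.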

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version   TensorFlow Compatibility   Release Date
0.17.1                   2.4.x                      Apr 16, 2021
0.17.0                   2.4.x                      Dec 14, 2020
0.16.0                   2.3.x                      Oct 23, 2020
0.15.0                   2.3.x                      Aug 03, 2020
0.14.0                   2.2.x                      Jul 08, 2020
0.13.0                   2.2.x                      May 10, 2020
0.12.0                   2.1.x                      Feb 28, 2020
0.11.0                   2.1.x                      Jan 10, 2020
0.10.0                   2.0.x                      Dec 05, 2019
0.9.1                    2.0.x                      Nov 15, 2019
0.9.0                    2.0.x                      Oct 18, 2019
0.8.1                    1.15.x                     Nov 15, 2019
0.8.0                    1.15.x                     Oct 17, 2019
0.7.2                    1.14.x                     Nov 15, 2019
0.7.1                    1.14.x                     Oct 18, 2019
0.7.0                    1.14.x                     Jul 14, 2019
0.6.0                    1.13.x                     May 29, 2019
0.5.0                    1.13.x                     Apr 12, 2019
0.4.0                    1.13.x                     Mar 01, 2019
0.3.0                    1.12.0                     Feb 15, 2019
0.2.0                    1.12.0                     Jan 29, 2019
0.1.0                    1.12.0                     Dec 16, 2018
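As a rough illustration of how the table can be used, the sketch below maps a TensorFlow version to a pinned pip requirement. The dictionary copies only the highest listed tensorflow-io release for each TensorFlow series, and both COMPATIBLE_TFIO and tfio_requirement are hypothetical names introduced for this example, not part of the package.

```python
# Highest tensorflow-io release listed in the table for each TensorFlow series.
# Note: the table lists 1.12.0 exactly rather than 1.12.x; this sketch keys on
# the major.minor series only.
COMPATIBLE_TFIO = {
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def tfio_requirement(tf_version):
    """Return a pip requirement string for the given TensorFlow version."""
    major, minor = tf_version.split(".")[:2]
    series = f"{major}.{minor}"
    if series not in COMPATIBLE_TFIO:
        raise ValueError(f"no tensorflow-io release listed for TensorFlow {series}.x")
    return f"tensorflow-io=={COMPATIBLE_TFIO[series]}"


print(tfio_requirement("2.4.1"))  # tensorflow-io==0.17.1
```

The resulting string can be passed to pip install as is. Versions missing from the table raise an error instead of guessing a match.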

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI


Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

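#!/usr/bin/env bash

# Show the wheels that are about to be repaired.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the files written by the container back to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*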

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
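To check which Python versions and platforms the resulting wheels target, the tags can be read straight from the file names. The sketch below assumes the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag, separated by hyphens). parse_wheel_filename is a hypothetical helper, and the file name is the one of a published tensorflow-io-nightly wheel, used here only as sample input.

```python
def parse_wheel_filename(filename):
    """Split a wheel file name into its name, version, and compatibility tags.

    Hypothetical helper for illustration; not part of the build tooling.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts is the common case; an optional build tag makes six.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        # Several platform tags can be joined with dots.
        "platforms": platform_tag.split("."),
    }


info = parse_wheel_filename(
    "tensorflow_io_nightly-0.18.0.dev20210506015400-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
print(info["python"], info["platforms"])
```

Running the helper over every file in the wheelhouse directory gives a quick check that each expected Python tag is present and that every wheel carries a manylinux2010 platform tag.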

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of that python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
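Before building, it can help to confirm which interpreter python resolves to in the shell. A minimal sketch, to be run with that same python command: interpreter_tag is a hypothetical helper, and the "cp" prefix is only applied when the interpreter is CPython.

```python
import sys


def interpreter_tag():
    """Return a wheel-style tag such as 'cp38' for the running interpreter."""
    impl = sys.implementation.name
    prefix = "cp" if impl == "cpython" else impl
    return f"{prefix}{sys.version_info.major}{sys.version_info.minor}"


# Path of the interpreter the shell picked up, and the tag its wheels would carry.
print(sys.executable)
print(interpreter_tag())
```

If the printed tag is not the one you want to build for, adjust the alias and run the check again.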

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test. Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python   Ubuntu 18.04   Ubuntu 20.04   macOS + osx9   Windows-2019
2.7      ✓              ✓              ✓              N/A
3.7      ✓              ✓              ✓              ✓
3.8      ✓              ✓              ✓              ✓

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

System                         Live System   Emulator      CI Integration   Offline
Apache Kafka                   ✓             -             ✓                -
Apache Ignite                  ✓             -             ✓                -
Prometheus                     ✓             -             ✓                -
Google PubSub                  -             ✓             ✓                -
Azure Storage                  -             ✓             ✓                -
AWS Kinesis                    -             ✓             ✓                -
Alibaba Cloud OSS              -             -             -                ✓
Google BigTable/BigQuery       -             to be added   -                -
Elasticsearch (experimental)   ✓             -             ✓                -
MongoDB (experimental)         ✓             -             ✓                -

License

Apache License 2.0
