TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
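
As a rough illustration of what automatic detection of compressed input can look like, the sketch below checks for the two-byte gzip magic number and decompresses only when it is present. It uses only the Python standard library. The helper name `maybe_gunzip` is hypothetical, and this is not a description of how tensorflow-io performs detection internally.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(payload: bytes) -> bytes:
    """Return decompressed bytes if payload looks like gzip, else payload unchanged.

    Hypothetical helper for illustration only.
    """
    if payload[:2] == GZIP_MAGIC:
        return gzip.decompress(payload)
    return payload


raw = b"idx-file-bytes"
assert maybe_gunzip(gzip.compress(raw)) == raw  # compressed input is inflated
assert maybe_gunzip(raw) == raw                 # plain input passes through
```

Checking the magic number rather than the `.gz` file extension means the same call works whether or not the URL carries a compression suffix.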

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow I/O, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
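
The table lookup can also be done programmatically. The following is a minimal sketch, assuming you copy the rows you care about into a dictionary; `COMPATIBILITY`, `compatible_tensorflow`, and `is_compatible` are hypothetical names for illustration and are not part of the tensorflow-io API. Only a subset of the rows above is included.

```python
# Subset of the compatibility table: tensorflow-io release -> TensorFlow series.
# Only rows whose series is written as "N.M.x" are listed here.
COMPATIBILITY = {
    "0.23.0": "2.7.x",
    "0.22.0": "2.7.x",
    "0.21.0": "2.6.x",
    "0.20.0": "2.6.x",
    "0.19.1": "2.5.x",
    "0.17.1": "2.4.x",
    "0.16.0": "2.3.x",
}


def compatible_tensorflow(tfio_version: str) -> str:
    """Look up the TensorFlow series listed for a tensorflow-io release."""
    try:
        return COMPATIBILITY[tfio_version]
    except KeyError:
        raise ValueError(f"{tfio_version} is not in the (partial) table") from None


def is_compatible(tfio_version: str, tf_version: str) -> bool:
    """Return True if tf_version (e.g. "2.7.0") falls in the listed series."""
    series = compatible_tensorflow(tfio_version)  # e.g. "2.7.x"
    prefix = series[:-1]                          # drop the "x", keep "2.7."
    return tf_version.startswith(prefix)


print(compatible_tensorflow("0.23.0"))
print(is_compatible("0.23.0", "2.6.2"))
```

In a real environment, the installed versions could be read with `importlib.metadata.version("tensorflow-io")` and `importlib.metadata.version("tensorflow")` (Python 3.8+) and passed to `is_compatible`.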

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If Docker is installed on the system, the following command will automatically build a manylinux2010-compatible whl package:

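#!/usr/bin/env bash

# List the wheels that were built.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container are owned by root; hand them back to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*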

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
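
To check which Python versions and platforms the generated packages target, the tags can be read from the wheel file names. The sketch below follows the wheel file name convention `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`; `WheelTags` and `parse_wheel_filename` are hypothetical names, and this is an illustration rather than part of the project's build tooling.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # 5 parts without a build tag, 6 parts with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    distribution, version = parts[0], parts[1]
    # The last three fields are always python, abi, and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(distribution, version, python_tag, abi_tag, platform_tag)


name = (
    "tensorflow_io_nightly-0.23.0.dev20211213191445-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
tags = parse_wheel_filename(name)
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

To inspect a real build, the same function could be applied to each path returned by `pathlib.Path("wheelhouse").glob("*.whl")`, passing `path.name`.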

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
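
Before building, it can help to confirm which version the `python` in the shell resolves to. The following is a small sketch, assuming a CPython interpreter; `cpython_tag` is a hypothetical helper that prints the Python tag a wheel built with this interpreter would be expected to carry.

```python
import sys


def cpython_tag(version_info=None) -> str:
    """Return the CPython wheel tag for a version, e.g. (3, 7) -> "cp37".

    Defaults to the running interpreter. Assumes CPython.
    """
    major, minor = (version_info or sys.version_info)[:2]
    return f"cp{major}{minor}"


# Tag for the interpreter that runs this script.
print(cpython_tag())
```

Running this with `python` in the same shell as the build script shows which wheel version the script will produce.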

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
