
TensorFlow IO

Project description




TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Prepare batches of data, just like with any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io provides inherent support for the HTTP/HTTPS file system, eliminating the need to download and save the datasets to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, the URLs of the compressed (gzip) files can be passed to the API call as-is.
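The HTTP/HTTPS file system is registered when tensorflow_io is imported, so remote files can also be read directly through the regular TensorFlow file APIs. Below is a minimal sketch (the specific file and byte count are only for illustration), assuming that importing tensorflow_io makes https:// paths readable through tf.io.gfile:

import tensorflow as tf
import tensorflow_io as tfio  # importing registers the HTTP/HTTPS file system plugin

# Read the first few bytes of a remote MNIST file without downloading it first.
url = "https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz"
with tf.io.gfile.GFile(url, "rb") as remote_file:
    header = remote_file.read(16)
print(len(header))  # 16 bytes read straight from the remote file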

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow I/O, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
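To verify that the installed tensorflow and tensorflow-io versions work together, one quick check is to print both version strings (both packages expose a __version__ attribute):

$ python -c "import tensorflow as tf, tensorflow_io as tfio; print(tf.__version__, tfio.__version__)"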

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly
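To confirm the package inside the container, you can run a one-off import check; this sketch assumes a python3 interpreter is available on the image's PATH:

$ docker run -it --rm tfsigio/tfio:latest python3 -c "import tensorflow_io as tfio; print(tfio.__version__)"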

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.24.0 2.8.x Feb 04, 2022
0.23.1 2.7.x Dec 15, 2021
0.23.0 2.7.x Dec 14, 2021
0.22.0 2.7.x Nov 10, 2021
0.21.0 2.6.x Sep 12, 2021
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
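
For example, to install a matching pair explicitly, pin both packages according to the table above (the versions below are only an illustration; substitute the row you need):

$ pip install "tensorflow-io==0.24.0" "tensorflow==2.8.*"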

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and facilitates tracking performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see the project repository for contribution guidelines.

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, then the following commands will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v $PWD:/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 $f
done
sudo chown -R $(id -nu):$(id -ng) .
ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same commands can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches the version of that python. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
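
One possible way to do the aliasing (a sketch only; the authoritative steps are in the Auditwheel step of .github/workflows/build.yml, and python3.8 below is just an example version) is to put a symlink to the desired interpreter first on PATH:

# Sketch: make "python" resolve to a specific interpreter for the build script.
mkdir -p "$HOME/bin"
ln -sf "$(command -v python3.8)" "$HOME/bin/python"
export PATH="$HOME/bin:$PATH"
python --version   # should now report Python 3.8.x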

Note that the above is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS builds and tests. Kokoro is used for Linux builds and tests. Again, because of the manylinux2010 requirement, on Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are done on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| 2.7 | ✓ | ✓ | ✓ | N/A |
| 3.7 | ✓ | ✓ | ✓ | ✓ |
| 3.8 | ✓ | ✓ | ✓ | ✓ |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are done through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered by offline tests may not have the same level of coverage as live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| Apache Kafka | ✓ | | ✓ | |
| Apache Ignite | ✓ | | ✓ | |
| Prometheus | ✓ | | ✓ | |
| Google PubSub | | ✓ | ✓ | |
| Azure Storage | | ✓ | ✓ | |
| AWS Kinesis | | ✓ | ✓ | |
| Alibaba Cloud OSS | | | | ✓ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✓ | | ✓ | |
| MongoDB (experimental) | ✓ | | ✓ | |

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

tensorflow_io_nightly-0.24.0.dev20220406151216-cp310-cp310-win_amd64.whl (21.8 MB)

Uploaded: CPython 3.10, Windows x86-64

tensorflow_io_nightly-0.24.0.dev20220406151216-cp310-cp310-macosx_10_14_x86_64.whl (24.1 MB)

Uploaded: CPython 3.10, macOS 10.14+ x86-64

tensorflow_io_nightly-0.24.0.dev20220406151216-cp39-cp39-win_amd64.whl (21.8 MB)

Uploaded: CPython 3.9, Windows x86-64

tensorflow_io_nightly-0.24.0.dev20220406151216-cp39-cp39-macosx_10_14_x86_64.whl (24.1 MB)

Uploaded: CPython 3.9, macOS 10.14+ x86-64

tensorflow_io_nightly-0.24.0.dev20220406151216-cp38-cp38-win_amd64.whl (21.8 MB)

Uploaded: CPython 3.8, Windows x86-64

tensorflow_io_nightly-0.24.0.dev20220406151216-cp38-cp38-macosx_10_14_x86_64.whl (24.1 MB)

Uploaded: CPython 3.8, macOS 10.14+ x86-64

tensorflow_io_nightly-0.24.0.dev20220406151216-cp37-cp37m-win_amd64.whl (21.8 MB)

Uploaded: CPython 3.7m, Windows x86-64

tensorflow_io_nightly-0.24.0.dev20220406151216-cp37-cp37m-macosx_10_14_x86_64.whl (24.1 MB)

Uploaded: CPython 3.7m, macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp310-cp310-win_amd64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 6e3cc634cee712c2e2605e4891f89753042b7cfcd10ef652ca07707bf09f130c
MD5 ff861f145e62e064026fdbb3dfc57abe
BLAKE2b-256 64bbfc1c9783e9ccdcb844149f26f29cef05ed820421f20f26ea2c38bfbdd87f


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 13bb658625da56d4a31bfff4860636527073dfa46709e70f0f137e66a11908d8
MD5 40206973e0b9270fc533b4b1b42bb685
BLAKE2b-256 22edb05bd0ece12bae3bc8166f0b3269dd777d8f5670c6198c6dc70dccff8bff


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp310-cp310-macosx_10_14_x86_64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp310-cp310-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 b7165203bb78da5d9411528f4bf02f8c03d6a9a18a2d4db9ffd65c092bd824ab
MD5 a44132dc4f8dd579ce348991baa5b6df
BLAKE2b-256 25869338b8df38e6c42cccb6993b7c919b04e0a19564c954beaa3b2ce702eb54


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp39-cp39-win_amd64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 8587259fb9b70a57ca5bc9c5bfeae952e385bc6cfe20bf94ca24c8950991a898
MD5 6a9d6e56c99c233a9b0b29c00fed7741
BLAKE2b-256 e49618db7cc10b8d956925bb8dcc7e7e326b459781d69a94c13c0250dd0f07b7


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 a3db3694653dcb0a750fa0de4290ae371c6d180b7c41417d1d818a01a0ac7e6c
MD5 384d8c54342d045d8adbe63f9db55ce3
BLAKE2b-256 e055fe8aa9b73a8448337cccc844ee67f24d18949d724e123888f53aac6c01f3


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp39-cp39-macosx_10_14_x86_64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 8a1f24dd9a22fbfc7e7e60d5b5b9425be408378dc5708fd72866b116a07a670b
MD5 72992a0730f3be1314e6012f441af138
BLAKE2b-256 f194cbf5251abad538abb55d270f019011a0df8e139fc51138f72824b675d125


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp38-cp38-win_amd64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 488df7fa417a89698af7b238e28f450a45eed6925ea3f89f3622b67aef814e83
MD5 e8d409cd8ec65a65e78977bc89c07279
BLAKE2b-256 34f323c4fce0de228bcc9f50e6152f94b8e4c1eb1bd4d102988dc2f2c68b1a5b


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 13f4898bd58a0cc8c26a04473b39fd87913bf7e9cf8bc1aa883f5caeb37b7e57
MD5 77055266172be0d4ee1fac0e3e1671a9
BLAKE2b-256 0245b72e610dc0cf0e376918d115b48e1ac15eaacfbb3e4bc7fd1d95e112ed48


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp38-cp38-macosx_10_14_x86_64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 c0733293571cb01ebe3218d7b7060d905532b9da413caaf5d99c0545b731c205
MD5 2525075dcf9bd40e4b7829f89bccdf3f
BLAKE2b-256 de8df12963e2741b851b6a563c191ef09e60ef6c2b673e31d5e2a8a7448aad1a


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp37-cp37m-win_amd64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 15a3deaef1446061acd1e1049accdf04dd9bc4ce1fc01c60b75718a8d7f5f10b
MD5 dca170270aace7f4e990e2f90b45a77e
BLAKE2b-256 b483d881ff9c1b62365776ba282ef7fd1eb65efc8080003ac953aaf6140935cc


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 f041f7e69ad7b605463308e0699ba78f713b45221876001abb0e88d2359911a5
MD5 0afc603bf241c4f7a1c75dc465b0100e
BLAKE2b-256 dc60e7045e1fda1e4d02ddd0effaa5bfc9140307d7196907a6d46fd99434721a


File details

Details for the file tensorflow_io_nightly-0.24.0.dev20220406151216-cp37-cp37m-macosx_10_14_x86_64.whl.

Hashes for tensorflow_io_nightly-0.24.0.dev20220406151216-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 b5ad172d0f65fd67337d9a7afc6550714d01eeca7a3c47f663358431746ecee2
MD5 056ab7703b91bef49a540ec795686913
BLAKE2b-256 86f5608a8fcb548a6ec92fce4cd491d1cb1014d5f98369b68d2eb4099791f920

