TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing portion replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)
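
As a possible follow-up (a sketch, not part of the original example), the same tfio.IODataset.from_mnist call can be pointed at the MNIST test-split files to evaluate the trained model; the t10k-* file names below are the standard MNIST ones and are assumed to be present on the same mirror:

# Hypothetical evaluation step: load the MNIST test split the same way,
# apply the same preprocessing, and evaluate the trained model.
d_test = tfio.IODataset.from_mnist(
    dataset_url + "t10k-images-idx3-ubyte.gz",
    dataset_url + "t10k-labels-idx1-ubyte.gz",
)
d_test = d_test.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))
d_test = d_test.batch(32)

loss, accuracy = model.evaluate(d_test)
print("test accuracy:", accuracy)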

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This is due to the inherent support that tensorflow-io provides for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io is able to detect and decompress the MNIST dataset automatically if needed, the URLs of the compressed (gzip) files can be passed to the API call as-is.
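
To illustrate the HTTP/HTTPS file system support outside of the MNIST-specific API, here is a minimal sketch; it assumes that importing tensorflow_io registers the HTTP/HTTPS file system with TensorFlow so that https:// paths can be read through tf.io.gfile (the file read below is one of the MNIST files already used above):

import tensorflow as tf
import tensorflow_io as tfio  # importing registers the extra file systems, including HTTP/HTTPS

# Read the first kilobyte of a remote file directly over HTTPS,
# without downloading it to local storage first.
remote_path = "https://storage.googleapis.com/cvdf-datasets/mnist/train-labels-idx1-ubyte.gz"
with tf.io.gfile.GFile(remote_path, "rb") as f:
    head = f.read(1024)
print("read", len(head), "bytes over HTTPS")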

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, Docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below (a quick runtime version check is sketched after the table). You can find the list of releases here.

TensorFlow I/O Version   TensorFlow Compatibility   Release Date
0.23.1                   2.7.x                      Dec 15, 2021
0.23.0                   2.7.x                      Dec 14, 2021
0.22.0                   2.7.x                      Nov 10, 2021
0.21.0                   2.6.x                      Sep 12, 2021
0.20.0                   2.6.x                      Aug 11, 2021
0.19.1                   2.5.x                      Jul 25, 2021
0.19.0                   2.5.x                      Jun 25, 2021
0.18.0                   2.5.x                      May 13, 2021
0.17.1                   2.4.x                      Apr 16, 2021
0.17.0                   2.4.x                      Dec 14, 2020
0.16.0                   2.3.x                      Oct 23, 2020
0.15.0                   2.3.x                      Aug 03, 2020
0.14.0                   2.2.x                      Jul 08, 2020
0.13.0                   2.2.x                      May 10, 2020
0.12.0                   2.1.x                      Feb 28, 2020
0.11.0                   2.1.x                      Jan 10, 2020
0.10.0                   2.0.x                      Dec 05, 2019
0.9.1                    2.0.x                      Nov 15, 2019
0.9.0                    2.0.x                      Oct 18, 2019
0.8.1                    1.15.x                     Nov 15, 2019
0.8.0                    1.15.x                     Oct 17, 2019
0.7.2                    1.14.x                     Nov 15, 2019
0.7.1                    1.14.x                     Oct 18, 2019
0.7.0                    1.14.x                     Jul 14, 2019
0.6.0                    1.13.x                     May 29, 2019
0.5.0                    1.13.x                     Apr 12, 2019
0.4.0                    1.13.x                     Mar 01, 2019
0.3.0                    1.12.0                     Feb 15, 2019
0.2.0                    1.12.0                     Jan 29, 2019
0.1.0                    1.12.0                     Dec 16, 2018
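
As a quick runtime sanity check of an installed pairing (a small sketch; it assumes both packages expose a __version__ attribute, which released builds do):

import tensorflow as tf
import tensorflow_io as tfio

# Print the installed versions so they can be compared against the table above.
print("tensorflow:", tf.__version__)
print("tensorflow-io:", tfio.__version__)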

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and facilitates tracking performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see the project repository for ways to get involved.

Build Status and CI


Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu:16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

# List the wheels produced by the build step.
ls dist/*
# Repair each wheel inside a manylinux2010 container to make it manylinux2010 compatible.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 \
    bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# The container runs as root, so restore ownership of the generated files.
sudo chown -R "$(id -nu)":"$(id -ng)" .
# Repaired wheels are placed in the wheelhouse directory.
ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific python version, you have to alias that version of python to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test; Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, on Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different python3 versions to ensure good coverage:

Python    Ubuntu 18.04    Ubuntu 20.04    macOS    Windows-2019
2.7       ✓               ✓               ✓        N/A
3.7       ✓               ✓               ✓        ✓
3.8       ✓               ✓               ✓        ✓

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as live systems or emulators.

                              Live System   Emulator      CI Integration   Offline
Apache Kafka                  ✓                           ✓
Apache Ignite                 ✓                           ✓
Prometheus                    ✓                           ✓
Google PubSub                               ✓             ✓
Azure Storage                               ✓             ✓
AWS Kinesis                                 ✓             ✓
Alibaba Cloud OSS                                                          ✓
Google BigTable/BigQuery                    to be added
Elasticsearch (experimental)  ✓                           ✓
MongoDB (experimental)        ✓                           ✓

References for emulators:

Community

Additional Information

License

Apache License 2.0
