TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed files (gzip) to the API call as is.
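
To illustrate what "detect and uncompress automatically" can mean, here is a minimal standard-library sketch. It is not tensorflow-io's implementation; it assumes detection by the gzip magic bytes and parses only the header of the IDX format that the MNIST files use. The helper names maybe_gunzip and read_idx_header are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if `data` looks like gzip, else `data` unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header and return (data type code, dimensions)."""
    raw = maybe_gunzip(data)
    # Bytes 0-1 are zero, byte 2 is the data type code, byte 3 is the number of dimensions.
    _, _, dtype_code, ndim = struct.unpack(">BBBB", raw[:4])
    # Each dimension is a 4-byte big-endian unsigned integer.
    dims = struct.unpack(">" + "I" * ndim, raw[4 : 4 + 4 * ndim])
    return dtype_code, dims


# Build a tiny stand-in for an MNIST image file: 2 images of 28x28 uint8 (type code 0x08).
header = struct.pack(">BBBB", 0, 0, 0x08, 3) + struct.pack(">III", 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

plain = read_idx_header(payload)
zipped = read_idx_header(gzip.compress(payload))
print(plain, zipped)
```

Both calls return the same header, so a caller does not need to know whether the input was compressed.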

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version  TensorFlow Compatibility  Release Date
0.23.0                  2.7.x                     Dec 14, 2021
0.22.0                  2.7.x                     Nov 10, 2021
0.21.0                  2.6.x                     Sep 12, 2021
0.20.0                  2.6.x                     Aug 11, 2021
0.19.1                  2.5.x                     Jul 25, 2021
0.19.0                  2.5.x                     Jun 25, 2021
0.18.0                  2.5.x                     May 13, 2021
0.17.1                  2.4.x                     Apr 16, 2021
0.17.0                  2.4.x                     Dec 14, 2020
0.16.0                  2.3.x                     Oct 23, 2020
0.15.0                  2.3.x                     Aug 03, 2020
0.14.0                  2.2.x                     Jul 08, 2020
0.13.0                  2.2.x                     May 10, 2020
0.12.0                  2.1.x                     Feb 28, 2020
0.11.0                  2.1.x                     Jan 10, 2020
0.10.0                  2.0.x                     Dec 05, 2019
0.9.1                   2.0.x                     Nov 15, 2019
0.9.0                   2.0.x                     Oct 18, 2019
0.8.1                   1.15.x                    Nov 15, 2019
0.8.0                   1.15.x                    Oct 17, 2019
0.7.2                   1.14.x                    Nov 15, 2019
0.7.1                   1.14.x                    Oct 18, 2019
0.7.0                   1.14.x                    Jul 14, 2019
0.6.0                   1.13.x                    May 29, 2019
0.5.0                   1.13.x                    Apr 12, 2019
0.4.0                   1.13.x                    Mar 01, 2019
0.3.0                   1.12.0                    Feb 15, 2019
0.2.0                   1.12.0                    Jan 29, 2019
0.1.0                   1.12.0                    Dec 16, 2018
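
The table can also be consulted programmatically. The following is a small sketch, not part of tensorflow-io: it hard-codes the highest TensorFlow I/O version listed above for each TensorFlow series, and the names COMPATIBLE_TFIO, tfio_for_tensorflow, and pip_requirement are hypothetical.

```python
# Highest TensorFlow I/O version listed in the table above for each TensorFlow series.
COMPATIBLE_TFIO = {
    "2.7": "0.23.0",
    "2.6": "0.21.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",  # the table lists 1.12.0 exactly, not 1.12.x
}


def tfio_for_tensorflow(tf_version: str):
    """Return the matching TensorFlow I/O version, or None if the series is not in the table."""
    parts = tf_version.split(".")
    if len(parts) < 2:
        return None
    return COMPATIBLE_TFIO.get(f"{parts[0]}.{parts[1]}")


def pip_requirement(tf_version: str):
    """Return a pinned pip requirement string, or None if no match is known."""
    tfio_version = tfio_for_tensorflow(tf_version)
    return f"tensorflow-io=={tfio_version}" if tfio_version else None


print(pip_requirement("2.5.3"))
```

With TensorFlow installed, tf.__version__ can be passed as the argument.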

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

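#!/usr/bin/env bash

# List the built wheels, then repair each one inside the manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written from inside the container may be owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*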

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
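
To check which Python versions the resulting wheels target, the tags can be read from the file names. The sketch below relies only on the standard wheel file name layout, in which the last three dash-separated fields are the Python, ABI, and platform tags. The helper names wheel_tags and wheels_by_python are hypothetical, and the example file name is one of the nightly wheels published for this project.

```python
from pathlib import Path


def wheel_tags(filename: str):
    """Split a wheel file name into (python tag, ABI tag, platform tag)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename}")
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the python, ABI and platform tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def wheels_by_python(directory="wheelhouse"):
    """Group the wheel file names found in `directory` by their python tag."""
    result = {}
    for path in sorted(Path(directory).glob("*.whl")):
        python_tag, _, _ = wheel_tags(path.name)
        result.setdefault(python_tag, []).append(path.name)
    return result


example = (
    "tensorflow_io_nightly-0.23.0.dev20211214214458-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
tags = wheel_tags(example)
print(tags)
```

Calling wheels_by_python() after the build gives a quick overview of which interpreters are covered.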

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python  Ubuntu 18.04  Ubuntu 20.04  macOS + osx9  Windows-2019
2.7     ✔             ✔             ✔             N/A
3.7     ✔             ✔             ✔             ✔
3.8     ✔             ✔             ✔             ✔

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as Kinesis, PubSub, and Azure Storage, are done through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
