TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. The example below follows the Get Started with TensorFlow tutorial, with the data-processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which removes the need to download the datasets and save them to a local directory first.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
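To illustrate what automatic detection can look like, here is a minimal sketch that uses only the Python standard library. It is not how tensorflow-io implements this internally. It assumes that gzip data is recognised by its two-byte magic number (0x1f 0x8b) and that the payload is an MNIST image file in IDX format, whose big-endian header holds the magic number 2051, the image count, the rows, and the columns. The helper names maybe_gunzip and read_idx_images_header are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of every gzip stream
IDX_IMAGE_MAGIC = 2051     # 0x00000803: unsigned bytes, 3 dimensions


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_images_header(data: bytes) -> tuple:
    """Parse the 16-byte IDX image header and return (count, rows, cols)."""
    magic, count, rows, cols = struct.unpack(">IIII", maybe_gunzip(data)[:16])
    if magic != IDX_IMAGE_MAGIC:
        raise ValueError(f"not an IDX image file (magic={magic})")
    return count, rows, cols


# Build a tiny fake IDX file (2 blank 28x28 images) to exercise both paths.
raw = struct.pack(">IIII", IDX_IMAGE_MAGIC, 2, 28, 28) + bytes(2 * 28 * 28)
print(read_idx_images_header(raw))                 # plain input
print(read_idx_images_header(gzip.compress(raw)))  # gzip-compressed input
```

Both calls report the same header, so a caller does not need to know in advance whether the file was compressed.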

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version   TensorFlow Compatibility   Release Date
0.17.1                   2.4.x                      Apr 16, 2021
0.17.0                   2.4.x                      Dec 14, 2020
0.16.0                   2.3.x                      Oct 23, 2020
0.15.0                   2.3.x                      Aug 03, 2020
0.14.0                   2.2.x                      Jul 08, 2020
0.13.0                   2.2.x                      May 10, 2020
0.12.0                   2.1.x                      Feb 28, 2020
0.11.0                   2.1.x                      Jan 10, 2020
0.10.0                   2.0.x                      Dec 05, 2019
0.9.1                    2.0.x                      Nov 15, 2019
0.9.0                    2.0.x                      Oct 18, 2019
0.8.1                    1.15.x                     Nov 15, 2019
0.8.0                    1.15.x                     Oct 17, 2019
0.7.2                    1.14.x                     Nov 15, 2019
0.7.1                    1.14.x                     Oct 18, 2019
0.7.0                    1.14.x                     Jul 14, 2019
0.6.0                    1.13.x                     May 29, 2019
0.5.0                    1.13.x                     Apr 12, 2019
0.4.0                    1.13.x                     Mar 01, 2019
0.3.0                    1.12.0                     Feb 15, 2019
0.2.0                    1.12.0                     Jan 29, 2019
0.1.0                    1.12.0                     Dec 16, 2018
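
The table can also be consulted programmatically. The sketch below is one possible way to do that and is not part of tensorflow-io: the COMPATIBILITY list and the matching_tfio_versions helper are hypothetical names, the rows are copied from the table above, and matching is done on the major.minor series only (so the 1.12.0 rows are treated as the 1.12 series, which is an assumption).

```python
# Rows copied from the compatibility table: (tensorflow-io version, TensorFlow series).
# Assumption: the "1.12.0" rows are treated as the whole 1.12 series.
COMPATIBILITY = [
    ("0.17.1", "2.4"), ("0.17.0", "2.4"),
    ("0.16.0", "2.3"), ("0.15.0", "2.3"),
    ("0.14.0", "2.2"), ("0.13.0", "2.2"),
    ("0.12.0", "2.1"), ("0.11.0", "2.1"),
    ("0.10.0", "2.0"), ("0.9.1", "2.0"), ("0.9.0", "2.0"),
    ("0.8.1", "1.15"), ("0.8.0", "1.15"),
    ("0.7.2", "1.14"), ("0.7.1", "1.14"), ("0.7.0", "1.14"),
    ("0.6.0", "1.13"), ("0.5.0", "1.13"), ("0.4.0", "1.13"),
    ("0.3.0", "1.12"), ("0.2.0", "1.12"), ("0.1.0", "1.12"),
]


def matching_tfio_versions(tf_version: str) -> list:
    """Return the tensorflow-io versions listed for a TensorFlow version, newest first."""
    series = ".".join(tf_version.split(".")[:2])  # "2.4.1" -> "2.4"
    return [tfio for tfio, tf in COMPATIBILITY if tf == series]


print(matching_tfio_versions("2.4.1"))   # ['0.17.1', '0.17.0']
print(matching_tfio_versions("1.15.0"))  # ['0.8.1', '0.8.0']
print(matching_tfio_versions("2.5.0"))   # [] (not listed in the table)
```

The first entry of the result could then be pinned at install time, for example pip install tensorflow-io==0.17.1 for TensorFlow 2.4.x.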

Performance Benchmarking

We use GitHub Pages to publish the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

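#!/usr/bin/env bash

# List the wheels that were built, then repair each one inside the manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Give ownership of the files written by the container back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*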

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches the version of that python. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
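
To check which Python version a generated whl package targets, its filename can be inspected. The following is a small sketch rather than project tooling: it assumes the standard wheel naming scheme, in which the last three dash-separated fields are the Python tag, the ABI tag, and the platform tag, and it assumes a CPython interpreter when building the tag of the running Python. The helper names wheel_tags and current_python_tag are hypothetical.

```python
import sys


def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its trailing Python, ABI and platform tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    # The last three dash-separated fields are the compatibility tags.
    head, python_tag, abi_tag, platform_tag = filename[: -len(".whl")].rsplit("-", 3)
    return {
        "name_and_version": head,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def current_python_tag() -> str:
    """Tag of the running interpreter, assuming CPython (for example 'cp38')."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


name = "tensorflow_io_nightly-0.18.0.dev20210514080232-cp38-cp38-macosx_10_14_x86_64.whl"
tags = wheel_tags(name)
print(tags["python"], tags["abi"], tags["platform"])  # cp38 cp38 macosx_10_14_x86_64
print("matches this interpreter:", tags["python"] == current_python_tag())
```

If the Python tag does not match the interpreter you intended to build for, the python found in the shell was probably not the version you expected.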

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different python3 versions to ensure good coverage:

Python   Ubuntu 18.04   Ubuntu 20.04   macOS + osx9   Windows-2019
2.7      ✓              ✓              ✓              N/A
3.7      ✓              ✓              ✓              ✓
3.8      ✓              ✓              ✓              ✓

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or non-official emulators. Offline tests are also performed whenever possible, though systems covered only through offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Source Distributions

No source distribution files are available for this release.

Built Distributions

tensorflow_io_nightly-0.18.0.dev20210514080232-cp39-cp39-win_amd64.whl (20.6 MB), CPython 3.9, Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210514080232-cp39-cp39-macosx_10_14_x86_64.whl (21.1 MB), CPython 3.9, macOS 10.14+ x86-64

tensorflow_io_nightly-0.18.0.dev20210514080232-cp38-cp38-win_amd64.whl (20.6 MB), CPython 3.8, Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210514080232-cp38-cp38-macosx_10_14_x86_64.whl (21.1 MB), CPython 3.8, macOS 10.14+ x86-64

tensorflow_io_nightly-0.18.0.dev20210514080232-cp37-cp37m-win_amd64.whl (20.6 MB), CPython 3.7m, Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210514080232-cp37-cp37m-macosx_10_14_x86_64.whl (21.1 MB), CPython 3.7m, macOS 10.14+ x86-64

tensorflow_io_nightly-0.18.0.dev20210514080232-cp36-cp36m-win_amd64.whl (20.6 MB), CPython 3.6m, Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210514080232-cp36-cp36m-macosx_10_14_x86_64.whl (21.1 MB), CPython 3.6m, macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp39-cp39-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 909551943333f3261cb17c9e282938fd07c6ad299bd27818d6899c49591f8055
MD5 38d8e5844ee00f8a139fb6fa35e90202
BLAKE2b-256 797233c2967a89c6c5bc23e2422a6852fcdebcb3fe3b678e712387433382a343

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 9ddfc294d733fc69903fddbb07eac2d6d11321828b0ad985a5382eed52b7f3b2
MD5 f21fa0deaf0bc9791e7f6778e7d9dca0
BLAKE2b-256 1de00004e9588f64cab0a40a74266edf018085be7bd9094a9fea3bbbbc6afb6d

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp39-cp39-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 53d9a73083a72e53f6a2aa4c0d903af74a1b2003c6314f4796dcb145731c4a6b
MD5 b459e2f8a121f7657dcd4836c3e1f0f8
BLAKE2b-256 9e402c7b81729e48894eb3caa9e2da9e5a46feb49b04932bcb524be78d909f24

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp38-cp38-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 2b000a482c30abaec0e0489047a6e8df1ea078ae4e2d9ada8347162fbab041a5
MD5 bc7cee28a6a3b4707fc039a2af6ac36c
BLAKE2b-256 d9a373e7e782ec93405b782c4f7ded1620ebbc7e4b45daadfe66072db0a13ba2

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 7f82303d342232d33a159e44af59d28cd695fbc1c547dbb914b5c7a336372e18
MD5 c4ffa6e900a3effb84c6d41ffa10412c
BLAKE2b-256 7f3769c1c5c20063e4265cbe2940b6508e1fe185c9f939fff59dfea71a0dace7

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp38-cp38-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 de9c505432ddba181944f0dc869372b75caabc4c5c1cfa70348807cd3083506d
MD5 9e1a5ca32316cc8a34f1afe3737bfb2c
BLAKE2b-256 129184227a3ac0b0844e06e349f876f83dadf606a28fe8e82b65c7fda07a4e78

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp37-cp37m-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 fbeda8be1de5bea95106f02843736151ebbcb54dad88b5d12f39425984613de8
MD5 76a7e2cb494e69eb7e4d144c3d018ff7
BLAKE2b-256 2259f4b06d96d7d92ba2c3201cbebf82e696b05d258efb4f8d8886930b5eb1fd

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 81f829576f8d7fc491472dbc5ac9bc11cec8455c5ab822cc4e1747132d3223e3
MD5 7935156a06976fe531920df080788ab6
BLAKE2b-256 686b96e969f917f98328854c8b7c2cc9879b6656238317e5395e4e5064fffd5b

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp37-cp37m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 4de444f04a20a5f53311a0461db60fb4d6531e6e1ac6eebc4fbb359b1bff4759
MD5 ec548cb3ebc7a3541a844bb8be3f7a16
BLAKE2b-256 c8a01d954ba0bb1e0560607ba0eb6b507343867124d2a35b63f2db01f1c9bca1

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp36-cp36m-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 84e59cd4a80e10fb55af75f88483dfff4c8c970d473e5853c83803e16a2602f4
MD5 fe84fe7e888f100ff49b90daf9629cae
BLAKE2b-256 bb8c4decd3de8b03bc5b12154a4eb5aaee59b623dfd7b05a5af87297d2e1a50a

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 37949397af1415c16c58aa95ef3b6ebd5a5584d39c2c5c7b39ac95139268a3e4
MD5 2cb6547748532f04216e97babdff42b7
BLAKE2b-256 1dadd8038e8097fa35a845a4279163d9923ba0ed18f5921cdd2a5c4a292af829

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210514080232-cp36-cp36m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210514080232-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 35b8776109a0fb20c945e7eaf1d3c188d1ebdbf4721d6a1ea6c6dbe86a8d1346
MD5 2c7cde8760d9ddaf63b35386aaf86058
BLAKE2b-256 6624062c91f1d4430284974f527ef15fa02e43a77f3b7b2e6ec4840546d1475c
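
The SHA256 digests listed above can be used to check a downloaded file before installing it. The sketch below shows one way to do that with the Python standard library; the helper names sha256_of_file and matches_published_hash are hypothetical, and the demo hashes a small temporary file instead of a real wheel so that it runs without a download.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so that large wheels are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex: str) -> bool:
    """Compare the file's SHA256 digest with a published hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())


# Demo with a stand-in file; a real check would use the downloaded .whl and
# the SHA256 value listed for that exact filename.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.whl"
    sample.write_bytes(b"not a real wheel")
    expected = hashlib.sha256(b"not a real wheel").hexdigest()
    print(matches_published_hash(sample, expected))   # True
    print(matches_published_hash(sample, "0" * 64))   # False
```

A mismatch means the file differs from the one that was published and should not be installed.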
