TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
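To make the note above more concrete, here is a minimal standard-library sketch of one way a reader could decide whether a file is gzip-compressed and then parse MNIST image data. It is an illustration only and not tensorflow-io's actual implementation: the helper names maybe_gunzip and parse_idx3_images are hypothetical, and the input is a tiny synthetic file rather than the real dataset. It relies on two facts: a gzip stream starts with the bytes 1f 8b, and an IDX3 image file starts with the big-endian 32-bit magic number 2051 followed by the image count, rows, and columns. The division by 255 mirrors the uint8 to float32 scaling that tf.image.convert_image_dtype performs in the example.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
IDX3_IMAGE_MAGIC = 2051   # 0x00000803: unsigned-byte data with 3 dimensions


def maybe_gunzip(raw: bytes) -> bytes:
    """Return decompressed bytes if `raw` looks like gzip, else `raw` unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def parse_idx3_images(raw: bytes):
    """Parse IDX3 image data into nested lists of float pixels in [0, 1]."""
    data = maybe_gunzip(raw)
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != IDX3_IMAGE_MAGIC:
        raise ValueError(f"unexpected magic number: {magic}")
    pixels = data[16:]
    if len(pixels) != count * rows * cols:
        raise ValueError("pixel payload does not match the header")
    images = []
    for i in range(count):
        start = i * rows * cols
        # Scale each uint8 pixel to a float in [0, 1].
        image = [
            [pixels[start + r * cols + c] / 255.0 for c in range(cols)]
            for r in range(rows)
        ]
        images.append(image)
    return images


# Build a tiny synthetic file (two 2x2 images) instead of downloading MNIST.
header = struct.pack(">IIII", IDX3_IMAGE_MAGIC, 2, 2, 2)
payload = bytes([0, 255, 51, 102, 255, 255, 0, 0])
plain = header + payload
compressed = gzip.compress(plain)

images_from_plain = parse_idx3_images(plain)
images_from_gzip = parse_idx3_images(compressed)
```

Both calls return the same images, which is the behavior the note describes: the caller does not need to know whether the file was compressed.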

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version  TensorFlow Compatibility  Release Date
0.17.1                  2.4.x                     Apr 16, 2021
0.17.0                  2.4.x                     Dec 14, 2020
0.16.0                  2.3.x                     Oct 23, 2020
0.15.0                  2.3.x                     Aug 03, 2020
0.14.0                  2.2.x                     Jul 08, 2020
0.13.0                  2.2.x                     May 10, 2020
0.12.0                  2.1.x                     Feb 28, 2020
0.11.0                  2.1.x                     Jan 10, 2020
0.10.0                  2.0.x                     Dec 05, 2019
0.9.1                   2.0.x                     Nov 15, 2019
0.9.0                   2.0.x                     Oct 18, 2019
0.8.1                   1.15.x                    Nov 15, 2019
0.8.0                   1.15.x                    Oct 17, 2019
0.7.2                   1.14.x                    Nov 15, 2019
0.7.1                   1.14.x                    Oct 18, 2019
0.7.0                   1.14.x                    Jul 14, 2019
0.6.0                   1.13.x                    May 29, 2019
0.5.0                   1.13.x                    Apr 12, 2019
0.4.0                   1.13.x                    Mar 01, 2019
0.3.0                   1.12.0                    Feb 15, 2019
0.2.0                   1.12.0                    Jan 29, 2019
0.1.0                   1.12.0                    Dec 16, 2018
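
The table can also be turned into a small lookup. The sketch below is not part of tensorflow-io; the names COMPATIBILITY, newest_tfio_for, and pip_requirement are hypothetical. It maps an installed TensorFlow version to the newest TensorFlow I/O release that the table lists for the same series. As a simplification it matches on the major.minor series only, so the 1.12.0 rows are treated as covering 1.12.

```python
# (TensorFlow I/O version, compatible TensorFlow series), newest first,
# copied from the compatibility table. Rows listed as 1.12.0 are
# simplified to the 1.12 series here.
COMPATIBILITY = [
    ("0.17.1", "2.4"), ("0.17.0", "2.4"),
    ("0.16.0", "2.3"), ("0.15.0", "2.3"),
    ("0.14.0", "2.2"), ("0.13.0", "2.2"),
    ("0.12.0", "2.1"), ("0.11.0", "2.1"),
    ("0.10.0", "2.0"), ("0.9.1", "2.0"), ("0.9.0", "2.0"),
    ("0.8.1", "1.15"), ("0.8.0", "1.15"),
    ("0.7.2", "1.14"), ("0.7.1", "1.14"), ("0.7.0", "1.14"),
    ("0.6.0", "1.13"), ("0.5.0", "1.13"), ("0.4.0", "1.13"),
    ("0.3.0", "1.12"), ("0.2.0", "1.12"), ("0.1.0", "1.12"),
]


def newest_tfio_for(tf_version: str):
    """Return the newest listed TensorFlow I/O version for `tf_version`, or None."""
    series = ".".join(tf_version.split(".")[:2])
    for tfio_version, tf_series in COMPATIBILITY:
        if tf_series == series:
            return tfio_version
    return None


def pip_requirement(tf_version: str) -> str:
    """Build a pinned pip requirement string for the matching release."""
    tfio_version = newest_tfio_for(tf_version)
    if tfio_version is None:
        raise LookupError(f"no listed TensorFlow I/O release for TensorFlow {tf_version}")
    return f"tensorflow-io=={tfio_version}"
```

For example, pip_requirement("2.3.0") produces a string that can be passed to pip install, while a TensorFlow version that is not in the table raises LookupError instead of guessing.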

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

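#!/usr/bin/env bash

# List the wheels that were built, then repair each one inside the
# manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Give ownership of the files written by the container back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*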
It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
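
One way to confirm what ended up in the wheelhouse directory is to read the tags out of the wheel filenames. The sketch below assumes the standard wheel naming scheme, name-version(-build)-python-abi-platform.whl. The helper names and the example filenames (including the placeholder version 0.0.0) are hypothetical and are not output from the build above.

```python
from collections import defaultdict


def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its name, version, and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build_tag = None
    elif len(parts) == 6:
        name, version, build_tag, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename}")
    return {
        "name": name,
        "version": version,
        "build_tag": build_tag,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def group_by_python_tag(filenames):
    """Map each Python tag (e.g. cp37) to the platform tags built for it."""
    groups = defaultdict(list)
    for filename in filenames:
        info = parse_wheel_filename(filename)
        groups[info["python_tag"]].append(info["platform_tag"])
    return dict(groups)


# Hypothetical contents of wheelhouse/ after the repair step.
example_wheels = [
    "tensorflow_io-0.0.0-cp35-cp35m-manylinux2010_x86_64.whl",
    "tensorflow_io-0.0.0-cp36-cp36m-manylinux2010_x86_64.whl",
    "tensorflow_io-0.0.0-cp37-cp37m-manylinux2010_x86_64.whl",
]
tags = group_by_python_tag(example_wheels)
all_manylinux2010 = all(
    platform.startswith("manylinux2010")
    for platforms in tags.values()
    for platform in platforms
)
```

To check a real build, the list could be replaced with the names of the .whl files found in wheelhouse, for instance via pathlib.Path("wheelhouse").glob("*.whl").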

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python  Ubuntu 18.04  Ubuntu 20.04  macOS + osx9  Windows-2019
2.7     ✔             ✔             ✔             N/A
3.7     ✔             ✔             ✔             ✔
3.8     ✔             ✔             ✔             ✔

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka ✔ ✔
Apache Ignite ✔ ✔
Prometheus ✔ ✔
Google PubSub ✔ ✔
Azure Storage ✔ ✔
AWS Kinesis ✔ ✔
Alibaba Cloud OSS ✔
Google BigTable/BigQuery to be added
Elasticsearch (experimental) ✔ ✔
MongoDB (experimental) ✔ ✔

Community

Additional Information

License

Apache License 2.0
