
TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)
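
The shuffle and batch steps in the example follow general tf.data semantics. As a rough illustration only, written in plain Python rather than TensorFlow, a buffer-based shuffle followed by batching can be pictured as below. The helper names `buffered_shuffle` and `batched` are made up for this sketch and are not part of tensorflow-io or tf.data.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def buffered_shuffle(items: Iterable[T], buffer_size: int, seed: int = 0) -> Iterator[T]:
    """Yield items in a randomized order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its slot.
        index = rng.randrange(len(buffer))
        yield buffer[index]
        buffer[index] = item
    # Drain what is left once the input is exhausted.
    rng.shuffle(buffer)
    yield from buffer


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group consecutive items into lists of at most batch_size elements."""
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # The final batch may be smaller than batch_size.
        yield batch


batches = list(batched(buffered_shuffle(range(100), buffer_size=16), batch_size=32))
print([len(b) for b in batches])  # [32, 32, 32, 4]
```

A small buffer only mixes elements locally, which is why the example uses a buffer of 1024 elements rather than a handful.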

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
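
How tensorflow-io detects compression internally is not described here. One common approach, shown as a sketch under that assumption, is to check for the two-byte gzip magic number before parsing the IDX header that MNIST files use. The function names below are hypothetical and the sample data is synthetic.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # First two bytes of every gzip stream.


def maybe_decompress(data: bytes) -> bytes:
    """Return decompressed bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_shape(data: bytes) -> tuple:
    """Parse the IDX header used by MNIST files and return the array shape."""
    raw = maybe_decompress(data)
    # Header: two zero bytes, a dtype code, then the number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension is a big-endian unsigned 32-bit integer.
    return struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])


# Build a tiny fake "images" file: 2 images of 28x28 unsigned bytes.
header = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

print(read_idx_shape(payload))                 # plain file
print(read_idx_shape(gzip.compress(payload)))  # gzip-compressed file
```

Both calls report the same shape, which mirrors why the example can pass the .gz URLs unchanged.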

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
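
After installing, it can be useful to confirm which of the packages is actually present in the environment. The following is a minimal sketch using only the Python standard library (Python 3.8 or later); the helper name `installed_version` is an assumption made for this example.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


for name in ("tensorflow", "tensorflow-io", "tensorflow-io-nightly"):
    print(name, installed_version(name) or "not installed")
```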

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version    TensorFlow Compatibility    Release Date
0.20.0                    2.6.x                       Aug 11, 2021
0.19.1                    2.5.x                       Jul 25, 2021
0.19.0                    2.5.x                       Jun 25, 2021
0.18.0                    2.5.x                       May 13, 2021
0.17.1                    2.4.x                       Apr 16, 2021
0.17.0                    2.4.x                       Dec 14, 2020
0.16.0                    2.3.x                       Oct 23, 2020
0.15.0                    2.3.x                       Aug 03, 2020
0.14.0                    2.2.x                       Jul 08, 2020
0.13.0                    2.2.x                       May 10, 2020
0.12.0                    2.1.x                       Feb 28, 2020
0.11.0                    2.1.x                       Jan 10, 2020
0.10.0                    2.0.x                       Dec 05, 2019
0.9.1                     2.0.x                       Nov 15, 2019
0.9.0                     2.0.x                       Oct 18, 2019
0.8.1                     1.15.x                      Nov 15, 2019
0.8.0                     1.15.x                      Oct 17, 2019
0.7.2                     1.14.x                      Nov 15, 2019
0.7.1                     1.14.x                      Oct 18, 2019
0.7.0                     1.14.x                      Jul 14, 2019
0.6.0                     1.13.x                      May 29, 2019
0.5.0                     1.13.x                      Apr 12, 2019
0.4.0                     1.13.x                      Mar 01, 2019
0.3.0                     1.12.0                      Feb 15, 2019
0.2.0                     1.12.0                      Jan 29, 2019
0.1.0                     1.12.0                      Dec 16, 2018
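
The table can also be consulted programmatically. The sketch below copies the highest-numbered TensorFlow I/O release listed for each TensorFlow series into a dictionary; the function name `recommended_tfio` is hypothetical, and the mapping only covers the releases listed above.

```python
from typing import Optional

# Highest-numbered TensorFlow I/O release listed per TensorFlow series.
# Note: the table lists 1.12.0 exactly rather than 1.12.x.
NEWEST_TFIO_FOR_TF = {
    "2.6": "0.20.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def recommended_tfio(tf_version: str) -> Optional[str]:
    """Map a TensorFlow version such as '2.5.1' to a listed TensorFlow I/O release."""
    major, minor = tf_version.split(".")[:2]
    return NEWEST_TFIO_FOR_TF.get(f"{major}.{minor}")


version = recommended_tfio("2.5.1")
print(f"pip install tensorflow-io=={version}")  # pip install tensorflow-io==0.19.1
```

A result of None means the TensorFlow version is not covered by the table, in which case the list of releases is the place to check.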

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010 compatible whl package:

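#!/usr/bin/env bash

# List the wheels that were built, then repair each one inside the
# manylinux2010 container so that it is tagged manylinux2010_x86_64.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container may be owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*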

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.
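
To see which Python version and platform a given whl package targets, the tags can be read from its file name, which follows the standard wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag). The following is a minimal sketch; `WheelTags` and `parse_wheel_filename` are names invented for this example.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally, six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


tags = parse_wheel_filename(
    "tensorflow_io_nightly-0.20.0.dev20210826002402-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```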

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific Python version, you have to alias that version of Python to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python    Ubuntu 18.04    Ubuntu 20.04    macOS + osx9    Windows-2019
2.7       ✔               ✔               ✔               N/A
3.7       ✔               ✔               ✔               ✔
3.8       ✔               ✔               ✔               ✔

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.
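
Tests that depend on a live system or emulator are usually guarded so that they are skipped when the service is not running. The sketch below shows one generic way to do that with the standard library; it is not taken from the project's test suite, and the helper name, the test class, and the use of localhost with Kafka's default port 9092 are assumptions for illustration.

```python
import socket
import unittest


def service_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical test case: skipped unless a broker is listening locally.
@unittest.skipUnless(service_reachable("localhost", 9092), "Kafka broker not running")
class KafkaSmokeTest(unittest.TestCase):
    def test_broker_is_up(self):
        self.assertTrue(service_reachable("localhost", 9092))
```

A port check only confirms that something is listening; it does not prove that the service behind it is healthy.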

System                          Live System    Emulator       CI Integration    Offline
Apache Kafka                    ✔              -              ✔                 -
Apache Ignite                   ✔              -              ✔                 -
Prometheus                      ✔              -              ✔                 -
Google PubSub                   -              ✔              ✔                 -
Azure Storage                   -              ✔              ✔                 -
AWS Kinesis                     -              ✔              ✔                 -
Alibaba Cloud OSS               -              -              -                 ✔
Google BigTable/BigQuery        -              to be added    -                 -
Elasticsearch (experimental)    ✔              -              ✔                 -
MongoDB (experimental)          ✔              -              ✔                 -

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

tensorflow_io_nightly-0.20.0.dev20210826002402-cp39-cp39-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210826002402-cp39-cp39-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210826002402-cp38-cp38-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210826002402-cp38-cp38-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210826002402-cp37-cp37m-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210826002402-cp37-cp37m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.7m macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210826002402-cp36-cp36m-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.6m Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210826002402-cp36-cp36m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.6m macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp39-cp39-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 338367c57f35d9b617eac618b4d3bae225a0e4eac9d3d049c1b120af5d41bd48
MD5 bf1dd72258ad6d3ed6d993775d86ff40
BLAKE2b-256 a88261a4a63d920d22e5ab6194f7a13409a02d0e27724fd73e58bcc63aef8b32

See more details on using hashes here.
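
A downloaded file can be checked against the published SHA256 digest with Python's standard hashlib module. The following is a minimal sketch; the function names are made up for this example, and the final check only runs if the wheel is present in the current directory.

```python
import hashlib
import os


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file without loading it all at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# File name and digest as listed on this page for the CPython 3.9 Windows wheel.
wheel = "tensorflow_io_nightly-0.20.0.dev20210826002402-cp39-cp39-win_amd64.whl"
published = "338367c57f35d9b617eac618b4d3bae225a0e4eac9d3d049c1b120af5d41bd48"
if os.path.exists(wheel):
    print("hash OK" if matches_published_hash(wheel, published) else "hash MISMATCH")
```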

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 76494b243a909ee3fba051684c8b6102a6842e90b43b07e2ebaa9a8c420c565f
MD5 66d6b6a76c5d0dc45ecca47d1f7baa2b
BLAKE2b-256 4fb0c44b7d6ca7681a3487610b990ce8361f13aec265878a9d74140bde822a3c

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp39-cp39-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 f024f62459718f9c543fbfe03fd743cea60d115189c26e891f9b7fa9f176b745
MD5 5327890237223f0c1d8aa47df22241f3
BLAKE2b-256 f2221b3a89f9ffc409dbb5edff1005e77c2d605e51775553b026dac07c1978ad

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp38-cp38-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 e000d3be906975139d8303c76d6b2e4254f282709d93480869bc153e75614683
MD5 1641888c2891d09f23c0909f12d3fcda
BLAKE2b-256 e34752269443988ffc8c57d51668f3086b178fb993a5c03332fee856b5b2b817

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 8d15888ff27989851d0cf2be10a070ef8710220a7bec839302f004c93a3d294f
MD5 32eb8e99df36322923c1b967bef301ba
BLAKE2b-256 cc706ae4da994e1ddc1e20a55bf609a63b57987074853e36d4b1e6734e6e3a0f

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp38-cp38-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 4b02ad9d8762c704315e6fab663973554a9872b1bc8d0771e6794f23acb1090d
MD5 3deb9f81605826f2de86fc6f645332df
BLAKE2b-256 54fa6f334178141d34321d499c39517227c522b35a6f2f8a721f4da961e4eb2a

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp37-cp37m-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 453b8a454691336325f41a1e8ead1050440b9b8c0f980fee192d96eaa1f3ec74
MD5 3ee84c7dbd81282c23ceacf8c3ca66da
BLAKE2b-256 9aa0b70c9e267b48570a5b2b5181a923ce18c9f1f335a543d12d5f0d3bf33a65

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 dcbf65b659a5b868d38a8399b05b2a187d8f4bbbecb0366491586d678205f1b5
MD5 5a8fa18d46e03551723da18aab4ff263
BLAKE2b-256 55f298ceb849c92bed7987b961f74fc679b65eaf12889516e0653a18e5092ad2

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp37-cp37m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 67d74aa0806136286db2c041047b3ea526efa1eef12e02051b1f085d5a5d8a7a
MD5 9086f7587077ccdc8936aeeefdca7b25
BLAKE2b-256 e2aea8a1769e3c6263adc7d3c68bf3b0aa939b5e33a355a2f806aa3d6e0216f0

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp36-cp36m-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 babe25154ed9e6c4993e223586bfb95a4ab410ad865a846dfc4f4d7c5b8e80cf
MD5 a12dbd460ab6f77fe0f3c4a40f5fef5f
BLAKE2b-256 39c838daaafc01c44b97482682a47a4ac114311ddb3c3984d6d88e70a1754c2f

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 e221c8e4a0fc9fc00da05111df6499f106c2775ece76747c6eb2ccb5f1cfa593
MD5 b8807b67e7ad2201ea041d977775f79f
BLAKE2b-256 17e0be7a636c65d82170723b51ebd19e5d9729ffd7e8e4179d5a556df159839d

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210826002402-cp36-cp36m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210826002402-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 3d8f24d08ff12a57cc87c625386b57328e6ab18d4e119d5516ed0faae1229d34
MD5 07301162802d6e198dfb399eebcff7fd
BLAKE2b-256 aa3529adb821a8a328c328adb5d26677c1d939f5e5aba881683f50ec3a4792ec
