TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing portion replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "http://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and decompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
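
The automatic decompression described in the note can be pictured as a check on a file's leading bytes. The sketch below is not tensorflow-io's implementation; it is a standard-library illustration, using the hypothetical helper name read_maybe_gzip, of how a reader might detect the gzip magic number (0x1f 0x8b) and decompress only when needed.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def read_maybe_gzip(raw: bytes) -> bytes:
    """Return decompressed bytes if `raw` is gzip data, else `raw` unchanged.

    Hypothetical helper for illustration only; not part of tensorflow-io.
    """
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


payload = b"idx3-ubyte image bytes"
assert read_maybe_gzip(gzip.compress(payload)) == payload  # compressed input
assert read_maybe_gzip(payload) == payload  # plain input passes through
```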

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version | TensorFlow Compatibility | Release Date
0.17.0 | 2.4.x | Dec 14, 2020
0.16.0 | 2.3.x | Oct 23, 2020
0.15.0 | 2.3.x | Aug 03, 2020
0.14.0 | 2.2.x | Jul 08, 2020
0.13.0 | 2.2.x | May 10, 2020
0.12.0 | 2.1.x | Feb 28, 2020
0.11.0 | 2.1.x | Jan 10, 2020
0.10.0 | 2.0.x | Dec 05, 2019
0.9.1 | 2.0.x | Nov 15, 2019
0.9.0 | 2.0.x | Oct 18, 2019
0.8.1 | 1.15.x | Nov 15, 2019
0.8.0 | 1.15.x | Oct 17, 2019
0.7.2 | 1.14.x | Nov 15, 2019
0.7.1 | 1.14.x | Oct 18, 2019
0.7.0 | 1.14.x | Jul 14, 2019
0.6.0 | 1.13.x | May 29, 2019
0.5.0 | 1.13.x | Apr 12, 2019
0.4.0 | 1.13.x | Mar 01, 2019
0.3.0 | 1.12.0 | Feb 15, 2019
0.2.0 | 1.12.0 | Jan 29, 2019
0.1.0 | 1.12.0 | Dec 16, 2018
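
As a rough illustration of how the compatibility table might be used in a setup script, the sketch below maps a TensorFlow version string to the highest TensorFlow I/O version listed for that release series. The dictionary is copied from the table; the helper name recommended_tfio is hypothetical and not part of the package.

```python
# Mapping copied from the compatibility table: TensorFlow release series ->
# highest TensorFlow I/O version listed for that series.
TF_TO_TFIO = {
    "2.4": "0.17.0",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",  # the table lists 1.12.0 specifically
}


def recommended_tfio(tf_version: str) -> str:
    """Return the highest listed tensorflow-io version for a TensorFlow version.

    Hypothetical helper; raises KeyError for series that are not in the table.
    """
    major, minor = tf_version.split(".")[:2]
    return TF_TO_TFIO[f"{major}.{minor}"]


print(recommended_tfio("2.4.1"))   # 0.17.0
print(recommended_tfio("1.15.0"))  # 0.8.1
```

The returned version can then be pinned at install time, for example pip install tensorflow-io==0.17.0 for TensorFlow 2.4.x.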

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build | Status
Linux CPU Python 2 | Status
Linux CPU Python 3 | Status
Linux GPU Python 2 | Status
Linux GPU Python 3 | Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

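#!/usr/bin/env bash

# List the wheels found in dist/.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the working tree back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
# List the repaired wheels.
ls wheelhouse/*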

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches the version of that python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
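
Before building on macOS, it can help to confirm which interpreter python resolves to in the shell, since the resulting wheel will match it. The sketch below is a minimal check using only the standard library; the helper name cpython_tag is hypothetical, and the tag format (for example cp38) is the one that appears in the wheel file names listed under Built Distributions.

```python
import sys


def cpython_tag(version_info=sys.version_info) -> str:
    """Return the CPython wheel tag (e.g. 'cp38') for a version tuple."""
    return f"cp{version_info[0]}{version_info[1]}"


# Show the tag and the path of the interpreter that is currently running,
# i.e. the one a build started from this shell would target.
print(cpython_tag())
print(sys.executable)
```

Running this with the same python command the build script will invoke shows which wheel tag to expect.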

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are done on a variety of systems with different python3 versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019
2.7 | ✔ | ✔ | ✔ | N/A
3.7 | ✔ | ✔ | ✔ | ✔
3.8 | ✔ | ✔ | ✔ | ✔

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or non-official emulators. Offline tests are also performed whenever possible, though systems covered only through offline tests may not have the same level of coverage as those tested with live systems or emulators.

System | Live System | Emulator | CI Integration | Offline
Apache Kafka | ✔ | | ✔ |
Apache Ignite | ✔ | | ✔ |
Prometheus | ✔ | | ✔ |
Google PubSub | | ✔ | ✔ |
Azure Storage | | ✔ | ✔ |
AWS Kinesis | | ✔ | ✔ |
Alibaba Cloud OSS | | | | ✔
Google BigTable/BigQuery | | to be added | |
Elasticsearch (experimental) | ✔ | | ✔ |
MongoDB (experimental) | ✔ | | ✔ |

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

File | Size | Python | Platform
tensorflow_io_nightly-0.18.0.dev20210406165640-cp39-cp39-win_amd64.whl | 21.3 MB | CPython 3.9 | Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp39-cp39-manylinux2010_x86_64.whl | 25.4 MB | CPython 3.9 | manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp39-cp39-macosx_10_14_x86_64.whl | 21.5 MB | CPython 3.9 | macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp38-cp38-win_amd64.whl | 21.3 MB | CPython 3.8 | Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp38-cp38-manylinux2010_x86_64.whl | 25.4 MB | CPython 3.8 | manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp38-cp38-macosx_10_14_x86_64.whl | 21.5 MB | CPython 3.8 | macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp37-cp37m-win_amd64.whl | 21.3 MB | CPython 3.7m | Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp37-cp37m-manylinux2010_x86_64.whl | 25.4 MB | CPython 3.7m | manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp37-cp37m-macosx_10_14_x86_64.whl | 21.5 MB | CPython 3.7m | macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp36-cp36m-win_amd64.whl | 21.3 MB | CPython 3.6m | Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp36-cp36m-manylinux2010_x86_64.whl | 25.4 MB | CPython 3.6m | manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210406165640-cp36-cp36m-macosx_10_14_x86_64.whl | 21.5 MB | CPython 3.6m | macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp39-cp39-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 65d890fc930d2d764384f2b2252de52efa56f3cbb80edb6fe7075d0653bdc7b5
MD5 d062f27a4da73ec577a886458879fc6f
BLAKE2b-256 cb2369ec4329b2134e06be67bfdb8f6909ba5f80048741110a808a28eb80521a

See more details on using hashes here.
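
One common use of the published digests is to verify a downloaded wheel before installing it. The following is a minimal sketch using Python's hashlib; the helper names sha256_of and matches_published_hash are hypothetical, and the expected digest would be the SHA256 value listed on this page for the file in question.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large wheels are never loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with a published digest (case-insensitive)."""
    return sha256_of(path) == expected_hex.strip().lower()
```

Calling matches_published_hash with the local path of a downloaded wheel and its listed SHA256 value returns True only when the file is byte-for-byte identical to the published one.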

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp39-cp39-manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp39-cp39-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 55f3e1210f5f795ef81298700811ce60774451df6bb022ff4d73d88eaa398325
MD5 345a430ce3ba6b984d4cbad578d2d426
BLAKE2b-256 b265c0c37c96b36f9de94c6a3521445d6afb61b4410398a3ceb4785a0ec143bd

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp39-cp39-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 2f1e1bc5b11e4d7b809004fa34cef11152a210931d43dc6d7b13618260540754
MD5 7ac2fc877474caf3a1922a233d143ac2
BLAKE2b-256 55c1899f18163df6a333d6dc923ba026cc76c658906e5ac1d7cb30de397600ec

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp38-cp38-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 1f5c6573c6d75a1031e0b574ea9f6344f26ba33db30eb131e09ad8a21ff47fd7
MD5 b4bba18a38ec9e450cc1923152983391
BLAKE2b-256 0117718c4515144f4ad3808763d9f1857b69d2d7731f7144b3d8a1ece8e66811

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp38-cp38-manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp38-cp38-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 89010a5a186e42ed07a477b9104b295c9e1d69117038159aba5a34647be8dd8a
MD5 42492d591a8f8694084777ec32aca236
BLAKE2b-256 9a02bd03b963e6d165a3d701db0e058eaa3938e23961d084cd0d4217b2e6c11d

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp38-cp38-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 912bdb9322ec7e3c6112cdce99f19f20ad616e059ae8c4a268d5e7165846fbd8
MD5 7ee9f6ddcc728aef6f128edd61509093
BLAKE2b-256 e7ab8904a20ad51065ceb8b870c0ba2bf6c53dde39619147fae4612ef178b6ba

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp37-cp37m-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 38dc0957c8cb3a5017dbb5a3717713976a2efcfa54f43c153e5c75fce8a2921a
MD5 4cf823825735b90790338c8b094aa722
BLAKE2b-256 2079393871e5bc69a8091f47e487664fa1d0ccf6d3ecd2ff602a47a0b00462f8

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp37-cp37m-manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp37-cp37m-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 e25b5ad28337fc50e36e8512b57c837cd9c8daa44529555aadd8ab792f989c0a
MD5 d521101da8207d0f75a073eb7910967a
BLAKE2b-256 b0d2c7b472a48e888ee506d1512feb93cdcf059da4f946113819f3e1e4d04ca3

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp37-cp37m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 f4bb31ae8617c05fa90c28fa6ea6e660e2fab9f8f1d07beb0095a6e9ce2bc02a
MD5 29917eb0e56a1a54bedad82a2223d7ff
BLAKE2b-256 db0e526858f4df1f23c2901b000a54e66c7d214ce30db771ea963498730696fa

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp36-cp36m-win_amd64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 064794edf0e602ced0ffcf844adff192e6515af70f7e160f7125a49704f6aeb3
MD5 9554f1bbd7caa478b135bbb929cdd57b
BLAKE2b-256 24226075513d25d22b886d6e7dfd880314806e089065fee78a54a7f375b28e96

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp36-cp36m-manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp36-cp36m-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 cd371864070bb3501195bdcad0aca98b1acd408b9965c183e5ccf6626f42f93c
MD5 ab11a16ace98cc2906db9452ff4336c7
BLAKE2b-256 908fa6da32fe0033ec385c22219429c01a802c5652f132da551cd738c07c8be2

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210406165640-cp36-cp36m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210406165640-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 b8748e2d9fb19ea2cc3ebb4e953c3570d18dc878ab5a4f2a96f667e520b56e38
MD5 00d9b36e431baa3d4968be0521eee721
BLAKE2b-256 363eada9c6857b1f00a714e5306a30c15bb0b46927762bb1bbd097a2252933e8
