TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
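To make the automatic detection more concrete, here is a minimal sketch of how a reader could recognize gzip data and then parse the IDX header used by the MNIST files. It is an illustration built on the Python standard library only, not the tensorflow-io implementation; the helper names `maybe_decompress` and `read_idx_header` are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def maybe_decompress(raw: bytes) -> bytes:
    """Return decompressed bytes if `raw` looks like gzip, else `raw` unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_header(raw: bytes):
    """Parse the IDX header: two zero bytes, a dtype code, the number of
    dimensions, then one big-endian uint32 per dimension."""
    data = maybe_decompress(raw)
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, shape


# Synthetic IDX payload for the demo: 2 images of 28x28 uint8 (dtype code 0x08).
header = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

plain = read_idx_header(payload)
zipped = read_idx_header(gzip.compress(payload))
print(plain, zipped)
```

Both calls return the same header, which is the behavior the note describes: the caller does not need to know whether the file is compressed.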

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version | TensorFlow Compatibility | Release Date
0.19.0 | 2.5.x | Jun 25, 2021
0.18.0 | 2.5.x | May 13, 2021
0.17.1 | 2.4.x | Apr 16, 2021
0.17.0 | 2.4.x | Dec 14, 2020
0.16.0 | 2.3.x | Oct 23, 2020
0.15.0 | 2.3.x | Aug 03, 2020
0.14.0 | 2.2.x | Jul 08, 2020
0.13.0 | 2.2.x | May 10, 2020
0.12.0 | 2.1.x | Feb 28, 2020
0.11.0 | 2.1.x | Jan 10, 2020
0.10.0 | 2.0.x | Dec 05, 2019
0.9.1 | 2.0.x | Nov 15, 2019
0.9.0 | 2.0.x | Oct 18, 2019
0.8.1 | 1.15.x | Nov 15, 2019
0.8.0 | 1.15.x | Oct 17, 2019
0.7.2 | 1.14.x | Nov 15, 2019
0.7.1 | 1.14.x | Oct 18, 2019
0.7.0 | 1.14.x | Jul 14, 2019
0.6.0 | 1.13.x | May 29, 2019
0.5.0 | 1.13.x | Apr 12, 2019
0.4.0 | 1.13.x | Mar 01, 2019
0.3.0 | 1.12.0 | Feb 15, 2019
0.2.0 | 1.12.0 | Jan 29, 2019
0.1.0 | 1.12.0 | Dec 16, 2018
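
As a rough illustration of how the table can be used, the sketch below maps a TensorFlow version string to a pip requirement for tensorflow-io. The helper `suggest_tfio` is hypothetical and not part of the package; the data is copied from the table above (highest tensorflow-io version per TensorFlow series), and matching on major.minor only is a simplifying assumption.

```python
# (tensorflow-io version, TensorFlow series), copied from the table above.
# Only the highest tensorflow-io version for each TensorFlow series is kept.
COMPAT = [
    ("0.19.0", "2.5"),
    ("0.17.1", "2.4"),
    ("0.16.0", "2.3"),
    ("0.14.0", "2.2"),
    ("0.12.0", "2.1"),
    ("0.10.0", "2.0"),
    ("0.8.1", "1.15"),
    ("0.7.2", "1.14"),
    ("0.6.0", "1.13"),
    ("0.3.0", "1.12"),
]


def suggest_tfio(tf_version: str):
    """Return a pip requirement string for the given TensorFlow version,
    or None if the series is not listed in the table."""
    # Assumption: compare on "major.minor" only, ignoring the patch level.
    series = ".".join(tf_version.split(".")[:2])
    for tfio_version, tf_series in COMPAT:
        if tf_series == series:
            return f"tensorflow-io=={tfio_version}"
    return None


print(suggest_tfio("2.4.1"))
```

The returned string can be passed to pip install. For a TensorFlow series that is not in the table, the function returns None rather than guessing.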

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

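#!/usr/bin/env bash

# List the wheels that were built.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Restore ownership of files created by the container.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*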

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
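
One way to check which Python versions the resulting packages cover is to read the tags from the whl filenames, which follow the name-version-python-abi-platform layout of the wheel format. The sketch below is a hedged illustration: the helper names and the example filenames are hypothetical, and an optional build tag in the filename is tolerated but ignored.

```python
from pathlib import PurePath


def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its name, version, and compatibility tags."""
    base = PurePath(filename).name
    if not base.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = base[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def python_versions(filenames) -> list:
    """Return the sorted CPython versions covered by a set of wheels."""
    versions = set()
    for name in filenames:
        tag = wheel_tags(name)["python"]
        if tag.startswith("cp"):
            versions.add(f"{tag[2]}.{tag[3:]}")  # "cp37" -> "3.7"
    return sorted(versions, key=lambda v: tuple(int(p) for p in v.split(".")))


# Hypothetical filenames, shaped like the contents of a wheelhouse directory.
example = [
    "wheelhouse/tensorflow_io-0.19.0-cp35-cp35m-manylinux2010_x86_64.whl",
    "wheelhouse/tensorflow_io-0.19.0-cp36-cp36m-manylinux2010_x86_64.whl",
    "wheelhouse/tensorflow_io-0.19.0-cp37-cp37m-manylinux2010_x86_64.whl",
]
print(python_versions(example))
```

In practice the list of filenames could come from something like `pathlib.Path("wheelhouse").glob("*.whl")`.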

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
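
To see which Python version a build in the current shell would target, you can print the interpreter's tag before running the script. This is a small sketch that assumes a CPython interpreter and the usual cpXY tag naming; `current_cpython_tag` is a hypothetical helper, not part of the build tooling.

```python
import sys


def current_cpython_tag(version_info=sys.version_info) -> str:
    """Return the CPython tag (for example "cp37") for a version tuple."""
    return f"cp{version_info[0]}{version_info[1]}"


# Tag of the interpreter that `python` resolves to in this shell.
print(current_cpython_tag())
```

If the printed tag is not the one you want, the python alias in the shell points at a different interpreter than intended.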

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019
2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A
3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:
3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered by offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0

