TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)
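
To make the map() step concrete: for uint8 input, tf.image.convert_image_dtype rescales pixel values from [0, 255] to [0, 1] when converting to float32. The plain-Python sketch below reproduces only that arithmetic on a tiny made-up image. `scale_uint8_image` is a hypothetical helper for illustration, not part of TensorFlow or tensorflow-io, and it uses Python floats rather than float32.

```python
# Illustrative only: mirrors the uint8 -> [0, 1] rescaling, not TensorFlow's implementation.
UINT8_MAX = 255


def scale_uint8_image(image):
    """Rescale a nested list of uint8 pixel values to floats in [0, 1]."""
    return [[pixel / UINT8_MAX for pixel in row] for row in image]


# A made-up 2x2 "image" standing in for a 28x28 MNIST digit.
tiny_image = [[0, 255], [51, 204]]
scaled = scale_uint8_image(tiny_image)
print(scaled)
```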

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
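
As a rough illustration of what "detect and uncompress if needed" can mean, the sketch below checks for the gzip magic bytes and then reads the header of an IDX file, the format used by the MNIST files. This is not tensorflow-io's implementation; `maybe_decompress` and `read_idx_header` are hypothetical names, and the payload is a small in-memory stand-in rather than the real dataset.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # First two bytes of every gzip stream.


def maybe_decompress(raw: bytes) -> bytes:
    """Return gunzipped bytes if raw looks like gzip, otherwise raw unchanged."""
    return gzip.decompress(raw) if raw[:2] == GZIP_MAGIC else raw


def read_idx_header(raw: bytes):
    """Parse an IDX header: two zero bytes, a dtype code, the number of
    dimensions, then one big-endian uint32 per dimension."""
    data = maybe_decompress(raw)
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file: leading bytes must be zero")
    dims = struct.unpack(f">{ndim}I", data[4 : 4 + 4 * ndim])
    return dtype_code, dims


# Stand-in for an images file: dtype 0x08 (unsigned byte), shape (2, 28, 28).
payload = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">3I", 2, 28, 28) + bytes(2 * 28 * 28)

header_plain = read_idx_header(payload)
header_gzip = read_idx_header(gzip.compress(payload))
print(header_plain, header_gzip)
```

Both calls return the same header, which is the behavior the note describes from the caller's point of view: compressed and uncompressed inputs are handled through the same entry point.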

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
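
After installing, it can help to confirm which distributions are actually present in the current environment, since the stable and nightly packages have different names. The following is a minimal sketch using the standard library's importlib.metadata (Python 3.8+); `installed_version` is a hypothetical helper, not something tensorflow-io provides.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if the distribution is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Report the packages mentioned in this section, plus TensorFlow itself.
for name in ("tensorflow-io", "tensorflow-io-nightly", "tensorflow"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```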

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
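
The compatibility table can also be checked programmatically. The sketch below encodes the table as a dictionary and tests whether a given TensorFlow version matches the series listed for a TensorFlow I/O release. `TFIO_TO_TF` and `is_compatible` are hypothetical names for illustration; neither is provided by tensorflow-io.

```python
# TensorFlow I/O version -> compatible TensorFlow version pattern, copied from the table.
# A trailing ".x" means any patch release in that series; otherwise the match is exact.
TFIO_TO_TF = {
    "0.17.1": "2.4.x", "0.17.0": "2.4.x",
    "0.16.0": "2.3.x", "0.15.0": "2.3.x",
    "0.14.0": "2.2.x", "0.13.0": "2.2.x",
    "0.12.0": "2.1.x", "0.11.0": "2.1.x",
    "0.10.0": "2.0.x", "0.9.1": "2.0.x", "0.9.0": "2.0.x",
    "0.8.1": "1.15.x", "0.8.0": "1.15.x",
    "0.7.2": "1.14.x", "0.7.1": "1.14.x", "0.7.0": "1.14.x",
    "0.6.0": "1.13.x", "0.5.0": "1.13.x", "0.4.0": "1.13.x",
    "0.3.0": "1.12.0", "0.2.0": "1.12.0", "0.1.0": "1.12.0",
}


def is_compatible(tfio_version, tf_version):
    """Return True if tf_version matches the pattern listed for tfio_version."""
    pattern = TFIO_TO_TF.get(tfio_version)
    if pattern is None:
        raise KeyError(f"TensorFlow I/O {tfio_version} is not in the table")
    if pattern.endswith(".x"):
        # "2.4.x" matches "2.4.0", "2.4.1", ... but not "2.40.0".
        return tf_version.startswith(pattern[:-1])
    return tf_version == pattern


print(is_compatible("0.17.1", "2.4.1"))
print(is_compatible("0.17.1", "2.5.0"))
```

In a real environment the two version strings could come from `tensorflow.__version__` and the installed tensorflow-io distribution metadata; they are passed in explicitly here so the sketch runs without either package.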

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance with respect to commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI


Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following script will automatically build a manylinux2010-compatible whl package:

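#!/usr/bin/env bash

# List the built wheels, then repair each one inside the manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Give ownership of the files written by the container back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*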
It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
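
Which Python version a whl package targets is encoded in its filename, following the wheel naming convention `{distribution}-{version}(-{build})?-{python tag}-{abi tag}-{platform tag}.whl`. The sketch below splits a nightly wheel name into those fields and computes the tag of the interpreter running the script. `parse_wheel_filename` and `current_python_tag` are hypothetical helpers, and the `cp` prefix assumes CPython.

```python
import sys
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, ver, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, ver, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in: {filename}")
    return WheelTags(dist, ver, build, py, abi, plat)


def current_python_tag():
    """Tag for the running interpreter, assuming CPython (for example cp38)."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


example = parse_wheel_filename(
    "tensorflow_io_nightly-0.18.0.dev20210504011133-cp37-cp37m-manylinux2010_x86_64.whl"
)
print(example.python_tag, example.abi_tag, example.platform_tag)
print("matches this interpreter:", example.python_tag == current_python_tag())
```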

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as Kinesis, PubSub, and Azure Storage, are done through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

|  | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | :heavy_check_mark: |  | :heavy_check_mark: |  |
| Apache Ignite | :heavy_check_mark: |  | :heavy_check_mark: |  |
| Prometheus | :heavy_check_mark: |  | :heavy_check_mark: |  |
| Google PubSub |  | :heavy_check_mark: | :heavy_check_mark: |  |
| Azure Storage |  | :heavy_check_mark: | :heavy_check_mark: |  |
| AWS Kinesis |  | :heavy_check_mark: | :heavy_check_mark: |  |
| Alibaba Cloud OSS |  |  |  | :heavy_check_mark: |
| Google BigTable/BigQuery |  | to be added |  |  |
| Elasticsearch (experimental) | :heavy_check_mark: |  | :heavy_check_mark: |  |
| MongoDB (experimental) | :heavy_check_mark: |  | :heavy_check_mark: |  |

License

Apache License 2.0


