TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats that TensorFlow I/O supports can be found here.

Using tensorflow-io with Keras is straightforward. The example below follows the "Get Started with TensorFlow" tutorial, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
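
The idea behind automatic decompression can be pictured with a small sketch. This is not tensorflow-io's actual implementation; it only illustrates the general technique of checking for the two-byte gzip magic number (0x1f 0x8b) and decompressing when it is present. The helper name read_maybe_gzipped is hypothetical.

```python
import gzip

# Every gzip stream starts with these two bytes.
GZIP_MAGIC = b"\x1f\x8b"


def read_maybe_gzipped(raw: bytes) -> bytes:
    """Return the decompressed payload if `raw` looks like gzip, else `raw` unchanged.

    Hypothetical helper for illustration only; not part of tensorflow-io.
    """
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


# Stand-in payload (not real MNIST data).
payload = b"\x00\x00\x08\x03" + bytes(12)

# Both the compressed and the uncompressed form yield the same bytes.
assert read_maybe_gzipped(gzip.compress(payload)) == payload
assert read_maybe_gzipped(payload) == payload
```

With a check like this, a caller does not need to know in advance whether a file is compressed, which is the convenience the note describes.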

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
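
After installing, it can be useful to confirm which distributions are actually present in the environment. The following is a minimal sketch that uses only the standard library (importlib.metadata, Python 3.8+); the helper name installed_version is an assumption made for this example.

```python
from importlib import metadata


def installed_version(distribution: str):
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration only.
    """
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# The stable and nightly builds are published under different distribution names.
for name in ("tensorflow", "tensorflow-io", "tensorflow-io-nightly"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```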

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
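
As a rough illustration of how the table can be used, the sketch below maps a TensorFlow version to the highest-numbered TensorFlow I/O release listed for its series and builds a matching pip requirement string. The dictionary is transcribed from the table above; the function names recommended_tfio and pip_requirement are hypothetical and not part of tensorflow-io.

```python
# TensorFlow minor series -> highest-numbered TensorFlow I/O release
# listed for that series in the compatibility table.
LATEST_TFIO_FOR_TF = {
    "2.6": "0.20.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def recommended_tfio(tf_version: str):
    """Return the listed TensorFlow I/O release for a TensorFlow version, or None."""
    series = ".".join(tf_version.split(".")[:2])  # "2.5.1" -> "2.5"
    return LATEST_TFIO_FOR_TF.get(series)


def pip_requirement(tf_version: str) -> str:
    """Build a pinned pip requirement for the matching TensorFlow I/O release."""
    tfio_version = recommended_tfio(tf_version)
    if tfio_version is None:
        raise ValueError(f"TensorFlow {tf_version} is not covered by the table")
    return f"tensorflow-io=={tfio_version}"


print(pip_requirement("2.5.0"))  # tensorflow-io==0.19.1
```

Versions that are not in the table raise an error rather than guessing, since compatibility outside the listed pairs is not documented here.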

Performance Benchmarking

We use GitHub Pages to publish the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

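#!/usr/bin/env bash

# List the wheels that were built into dist/.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the files written by the container back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*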

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
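
Since each generated wheel targets one Python version, it can help to read the target off the file name. The sketch below splits a wheel file name into its components following the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The names WheelTags, parse_wheel_filename, and targets_python are hypothetical helpers written for this example, and the check only handles CPython tags such as cp39. The file name used is a real tensorflow-io-nightly wheel name.

```python
from typing import NamedTuple, Tuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally, six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def targets_python(tags: WheelTags, version: Tuple[int, int]) -> bool:
    """Return True if the wheel's CPython tag matches (major, minor)."""
    return tags.python_tag == f"cp{version[0]}{version[1]}"


tags = parse_wheel_filename(
    "tensorflow_io_nightly-0.20.0.dev20210825201203-cp39-cp39-win_amd64.whl"
)
print(tags.python_tag, tags.platform_tag)  # cp39 win_amd64
print(targets_python(tags, (3, 9)))  # True
```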

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
TensorFlow I/O integrates with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run against official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
