TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed files (gzip) to the API call as is.
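
As a rough illustration of what this kind of automatic detection can involve, the sketch below checks for the gzip magic bytes and then reads an IDX header, the format used by the MNIST files. It is not how tensorflow-io implements the feature; the helper names `maybe_gunzip` and `read_idx_header` are hypothetical, and only the Python standard library is used.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if `data` is gzip-compressed, else `data` unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header and return (dtype_code, shape).

    An IDX file starts with two zero bytes, one data-type byte
    (0x08 means unsigned byte), and one byte giving the number of
    dimensions, followed by one big-endian uint32 per dimension.
    """
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, shape


# Build a tiny fake "image" file in memory: 2 images of 3x3 uint8 pixels.
raw = struct.pack(">HBBIII", 0, 0x08, 3, 2, 3, 3) + bytes(2 * 3 * 3)
compressed = gzip.compress(raw)

# Both the compressed and the plain bytes yield the same header.
header_from_gz = read_idx_header(maybe_gunzip(compressed))
header_from_raw = read_idx_header(maybe_gunzip(raw))
print(header_from_gz)
```

Because the check looks at the content rather than the file extension, the same code path handles both `.gz` files and already-uncompressed files.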

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
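
One way to apply the compatibility list in a setup script is a small lookup from the TensorFlow series to the newest TensorFlow I/O release listed for it. The sketch below is only an illustration: the dictionary is copied by hand from the entries above, and the helper names `recommended_tfio` and `pip_requirement` are hypothetical, not part of tensorflow-io.

```python
# Newest tensorflow-io release listed for each TensorFlow series (hand-copied).
LATEST_TFIO_FOR_TF = {
    "2.5": "0.19.0",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def recommended_tfio(tf_version: str) -> str:
    """Return the newest tensorflow-io version listed for a TensorFlow version.

    Only the major.minor part of `tf_version` is used, e.g. "2.4.1" -> "2.4".
    """
    series = ".".join(tf_version.split(".")[:2])
    try:
        return LATEST_TFIO_FOR_TF[series]
    except KeyError:
        raise ValueError(
            f"no tensorflow-io release listed for TensorFlow {series}.x"
        ) from None


def pip_requirement(tf_version: str) -> str:
    """Build a pinned pip requirement string for the matching tensorflow-io release."""
    return f"tensorflow-io=={recommended_tfio(tf_version)}"


print(pip_requirement("2.4.1"))  # prints "tensorflow-io==0.17.1"
```

The resulting string can be passed to `pip install` alongside the pinned TensorFlow version.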

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If Docker is installed on the system, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

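# List the wheels produced by the build.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Restore ownership of the files written by the container.
sudo chown -R "$(id -nu):$(id -ng)" .
# List the repaired wheels.
ls wheelhouse/*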

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
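
To check which Python version a generated whl package targets, the Python tag can be read from the wheel filename and compared with the running interpreter. The sketch below is a minimal illustration that assumes CPython and standard wheel filenames; the helper names are hypothetical and not part of the build scripts.

```python
import sys


def current_python_tag() -> str:
    """Return the CPython tag of the running interpreter, e.g. "cp38"."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def wheel_python_tag(filename: str) -> str:
    """Extract the Python tag from a wheel filename.

    Wheel filenames end in "-{python tag}-{abi tag}-{platform tag}.whl",
    so the Python tag is the third field from the end.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    return filename[: -len(".whl")].split("-")[-3]


def matches_interpreter(filename: str) -> bool:
    """Check whether a wheel was built for the interpreter running this script."""
    return wheel_python_tag(filename) == current_python_tag()


wheel = "tensorflow_io_nightly-0.20.0.dev20210722075142-cp38-cp38-macosx_10_14_x86_64.whl"
print(wheel_python_tag(wheel), current_python_tag(), matches_interpreter(wheel))
```

Running this with the same python that the shell resolves to shows whether the alias points at the intended version.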

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✔ | ✔ | ✔ | N/A |
| 3.7 | ✔ | ✔ | ✔ | ✔ |
| 3.8 | ✔ | ✔ | ✔ | ✔ |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, run against official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

tensorflow_io_nightly-0.20.0.dev20210722075142-cp39-cp39-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210722075142-cp39-cp39-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210722075142-cp38-cp38-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210722075142-cp38-cp38-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210722075142-cp37-cp37m-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210722075142-cp37-cp37m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.7m macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210722075142-cp36-cp36m-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.6m Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210722075142-cp36-cp36m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.6m macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 70f021ecadf3a139122a01d2ed4aec143ff4e9057bf387094adefec40dd71f89
MD5 694b503d9f1c0ea61de824104c1c2446
BLAKE2b-256 30eef0adbfedb206bc442648552956e5af5b330a6454f724c706af4bd27ae611

See more details on using hashes here.
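
A downloaded wheel can be checked against a published SHA256 value with the Python standard library. The sketch below is a minimal illustration: the helper names `sha256_of_file` and `verify_sha256` are hypothetical, and a small temporary file stands in for a real wheel so the example runs without a download.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path: str, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the published one."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Demonstration on a small temporary file standing in for a downloaded wheel.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example wheel contents")
    demo_path = tmp.name

expected = hashlib.sha256(b"example wheel contents").hexdigest()
ok = verify_sha256(demo_path, expected)    # digest matches
bad = verify_sha256(demo_path, "0" * 64)   # digest does not match
os.remove(demo_path)
print(ok, bad)  # prints "True False"
```

For a real download, pass the path of the wheel and the SHA256 value listed for that exact filename.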

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 120d2861e7ac1eeaf22451917d8984b5c8403a5ea29a8e7a9ee3b3c21fd2ef29
MD5 2ab19fcb12a627c8210cb96b5aafc93c
BLAKE2b-256 4cadc3444f648cfeca9d9d56ffa996810ee5415bab04d8bf17e781763363d83a

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 668631cf3456084982d21be4df1f98dca5ab02cd935a93d898299fd4b9b8d577
MD5 084fb13a90e2db6c5635670a4374e7f0
BLAKE2b-256 acf4344a253e3030760882c54efd02f55cd9442f770e37ea508a4c3cfcc0ed3c

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 08506b9249f862f0dda73bf10436b46ee558b7bc4f6cfe4f5758440fdc921da6
MD5 e5c63371fa6cdad9f167ed11c7af14f9
BLAKE2b-256 55b1de6713bffbc991cd272a2b5da2c9cc5ecb7fd2c7e1e2b7828cefdc29ed4c

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 c87386a9e385ac07ef0bd490adfb0f92716479565847c8a1878b865a8469e001
MD5 8e76c3185bb6f45abc37769267563a11
BLAKE2b-256 8c7b2412e7871c532070f612bd215d63a8f03c7b5002f5d3251509d8e6dc8fbc

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 186d4276fedcd1ac55154de4efd79267082d2eb3d3c442d90763f26f62911448
MD5 35ca895569357d0b034e12aab2cf12d0
BLAKE2b-256 d9a18b901d408b32490223bc93dc37c053ecb4cd11339101077f394019dfd7f8

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp37-cp37m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 6959b8c18f4e57839dc044c230a721e18017812ffa874890b1abec4c5f59d82a
MD5 ca4c5fb5476925382c19176e1e18d0ff
BLAKE2b-256 f78647d2d8b7c130c2e74c0371eae1056d8c21079eaa270b1b4535107fb986c7

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 9066db827cd3443c78bda26d9540275367a97e517e2ef74ea67e68447c4dc60f
MD5 36bfaf7a7d5f6c626ac1f5c744a25f1b
BLAKE2b-256 944cb0678d4c50beb503367adb72f489218e33f70577f3b57b016352b9861a9e

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 070bc60c394f5be846302ea4aa060ea7c3e218ff5025ca8459a0ee36791dc780
MD5 cb82fe0edc54c1c7b648d960ce5f2549
BLAKE2b-256 8796ea34602d2a74a27478809efb30efe02f229b8e639dabd01b4593249bfc68

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp36-cp36m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 3ca1c7a03eddc5bb200862c310286ba927acd9bdad560190ad4db62067156fb3
MD5 28b6235a098c2002fef1dbb871b4b2a5
BLAKE2b-256 c59733635ad09cad285fc5220dfe449d13e95c6fd906f9c55717bf8258d86ebd

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 87fd837e6a69ef768a413105f9d97415620ba4ef2b60181e32b90d1c226f3542
MD5 cad619352779ac24c26a0f47c241845d
BLAKE2b-256 8c0a47c79fd758acf2bc498728d1aff07d0ec7aa2b7586edd4a6481e386f1ee5

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210722075142-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210722075142-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 d17881fbab0de572d814a3a69fa5a76793343e7b4b5c344b65e8a851b7e541e2
MD5 8dc7fbed6aa231262496b5d14b320d92
BLAKE2b-256 48c4bf7e5044a437018c6c5aed318b3f31b1477ca1898a9387d27e69b0d8bcc1

See more details on using hashes here.
