TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory first.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
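To make the note above more concrete, here is a minimal sketch of how gzip detection and MNIST file parsing could be done by hand with the Python standard library. It is not tensorflow-io's implementation. The helper names maybe_gunzip and parse_idx_images are hypothetical, and the sketch assumes the standard MNIST IDX image layout: the magic number 2051, followed by the image count, rows, and columns as big-endian 32-bit integers. A tiny synthetic file is used so that nothing has to be downloaded.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream
IDX_IMAGE_MAGIC = 2051    # magic number of an MNIST IDX3 image file


def maybe_gunzip(data: bytes) -> bytes:
    """Decompress data if it looks gzip-compressed, else return it unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def parse_idx_images(data: bytes):
    """Parse an IDX3 image file into (count, rows, cols, per-image bytes)."""
    raw = maybe_gunzip(data)
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IDX_IMAGE_MAGIC:
        raise ValueError(f"unexpected IDX magic number: {magic}")
    size = rows * cols
    pixels = raw[16:]
    if len(pixels) != count * size:
        raise ValueError("pixel payload does not match the header")
    images = [pixels[i * size:(i + 1) * size] for i in range(count)]
    return count, rows, cols, images


# Synthetic stand-in for an MNIST file: two 2x2 uint8 images.
header = struct.pack(">IIII", IDX_IMAGE_MAGIC, 2, 2, 2)
payload = bytes([0, 255, 128, 64, 1, 2, 3, 4])
plain = header + payload
compressed = gzip.compress(plain)

# The same parser accepts both the compressed and the uncompressed bytes.
count, rows, cols, images = parse_idx_images(compressed)
print(count, rows, cols, list(images[1]))  # 2 2 2 [1, 2, 3, 4]
```

With tfio.IODataset.from_mnist none of this is needed: the call takes the URLs directly, as shown in the example above.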

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
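After installing, it can be handy to confirm which of the packages are actually present in the current environment. The sketch below uses only the standard library (importlib.metadata, available from Python 3.8). The helper name installed_version is hypothetical and is not part of tensorflow-io.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the packages that matter for a tensorflow-io setup.
for name in ("tensorflow", "tensorflow-io", "tensorflow-io-nightly"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```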

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
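The compatibility table above can also be checked programmatically. The following is a small sketch, not an official tool: the COMPATIBILITY mapping is copied from the table, while the function names is_compatible and tfio_versions_for are hypothetical. An entry such as 2.5.x is treated as "any 2.5 patch release", and 1.12.0 as an exact match.

```python
# TensorFlow I/O version -> compatible TensorFlow version, as listed in the table.
COMPATIBILITY = {
    "0.18.0": "2.5.x", "0.17.1": "2.4.x", "0.17.0": "2.4.x",
    "0.16.0": "2.3.x", "0.15.0": "2.3.x", "0.14.0": "2.2.x",
    "0.13.0": "2.2.x", "0.12.0": "2.1.x", "0.11.0": "2.1.x",
    "0.10.0": "2.0.x", "0.9.1": "2.0.x", "0.9.0": "2.0.x",
    "0.8.1": "1.15.x", "0.8.0": "1.15.x", "0.7.2": "1.14.x",
    "0.7.1": "1.14.x", "0.7.0": "1.14.x", "0.6.0": "1.13.x",
    "0.5.0": "1.13.x", "0.4.0": "1.13.x", "0.3.0": "1.12.0",
    "0.2.0": "1.12.0", "0.1.0": "1.12.0",
}


def is_compatible(tfio_version: str, tf_version: str) -> bool:
    """Check a (tensorflow-io, tensorflow) version pair against the table."""
    pattern = COMPATIBILITY.get(tfio_version)
    if pattern is None:
        raise KeyError(f"tensorflow-io {tfio_version} is not in the table")
    if pattern.endswith(".x"):
        # "2.5.x" matches any version that starts with "2.5."
        return tf_version.startswith(pattern[:-1])
    return tf_version == pattern


def tfio_versions_for(tf_version: str) -> list:
    """List the tensorflow-io releases matching a TensorFlow version, newest first."""
    return [v for v in COMPATIBILITY if is_compatible(v, tf_version)]


print(is_compatible("0.18.0", "2.5.0"))  # True
print(tfio_versions_for("2.4.1"))        # ['0.17.1', '0.17.0']
```

Versions that do not appear in the table raise a KeyError rather than returning a guess, since the table says nothing about them.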

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following script will automatically build a manylinux2010-compatible whl package:

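#!/usr/bin/env bash

# List the wheels that were built, then repair each one inside the
# manylinux2010 container so that it gets the manylinux2010 platform tag.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the files written from the container back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*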

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
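To see which Python versions and platforms the repaired packages target, the tags can be read from the wheel file names. The sketch below assumes the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag, separated by hyphens). The names WheelTags, parse_wheel_name, and list_wheels are hypothetical, and the file name in the example is made up for illustration rather than taken from an actual build.

```python
from pathlib import Path
from typing import List, NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its distribution, version, and tags."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = name[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
    elif len(parts) == 6:
        # A sixth component means an optional build tag is present; skip it.
        dist, version, _build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return WheelTags(dist, version, py, abi, plat)


def list_wheels(directory: str = "wheelhouse") -> List[WheelTags]:
    """Parse every wheel found in the given directory, sorted by file name."""
    return [parse_wheel_name(p.name) for p in sorted(Path(directory).glob("*.whl"))]


# Illustrative file name only; real builds may be named differently.
tags = parse_wheel_name(
    "wheelhouse/tensorflow_io-0.18.0-cp37-cp37m-manylinux2010_x86_64.whl"
)
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
# cp37 cp37m manylinux2010_x86_64
```

Calling list_wheels() after the build script has finished would give one entry per package in the wheelhouse directory.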

On macOS, the same command can be used. However, the script expects python to be available in the shell and only generates a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test, and Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | :heavy_check_mark: | | :heavy_check_mark: | |
| Apache Ignite | :heavy_check_mark: | | :heavy_check_mark: | |
| Prometheus | :heavy_check_mark: | | :heavy_check_mark: | |
| Google PubSub | | :heavy_check_mark: | :heavy_check_mark: | |
| Azure Storage | | :heavy_check_mark: | :heavy_check_mark: | |
| AWS Kinesis | | :heavy_check_mark: | :heavy_check_mark: | |
| Alibaba Cloud OSS | | | | :heavy_check_mark: |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |
| MongoDB (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |

License

Apache License 2.0


