TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory first.

NOTE: Since tensorflow-io can detect and uncompress the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
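To make the idea of automatic detection concrete, here is a minimal standard-library sketch. It is an illustration only and not how tensorflow-io is implemented: it assumes gzip data can be recognized by its two leading magic bytes, and it parses the IDX header layout used by the MNIST files. The helper names maybe_gunzip and read_idx_header are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header (hypothetical helper): returns (dtype_code, shape)."""
    payload = maybe_gunzip(data)
    # Bytes 0-1 are zero, byte 2 is the data type code, byte 3 the dimension count.
    zero, dtype_code, ndim = struct.unpack(">HBB", payload[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    shape = struct.unpack(">" + "I" * ndim, payload[4 : 4 + 4 * ndim])
    return dtype_code, shape


# A tiny synthetic IDX payload: two 2x3 uint8 "images" (type code 0x08).
raw = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 2, 3) + bytes(12)
print(read_idx_header(raw))                 # plain bytes
print(read_idx_header(gzip.compress(raw)))  # gzip-compressed bytes, same header
```

Both calls report the same header, which is why a caller does not need to care whether the file behind a URL is compressed.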

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
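
The table can also be turned into a small lookup. The sketch below is a convenience under stated assumptions, not part of the tensorflow-io package: it hard-codes the highest TensorFlow I/O version listed above for each TensorFlow series, and the names LATEST_TFIO_FOR_TF, matching_tfio, and pip_requirement are hypothetical.

```python
# Highest TensorFlow I/O version listed in the table for each TensorFlow series.
# Note: the table lists TensorFlow 1.12.0 exactly, not the whole 1.12.x series.
LATEST_TFIO_FOR_TF = {
    "2.6": "0.20.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def matching_tfio(tf_version: str) -> str:
    """Return the listed tensorflow-io version for a TensorFlow version string."""
    major_minor = ".".join(tf_version.split(".")[:2])
    try:
        return LATEST_TFIO_FOR_TF[major_minor]
    except KeyError:
        raise ValueError(
            f"TensorFlow {tf_version} is not covered by the table"
        ) from None


def pip_requirement(tf_version: str) -> str:
    """Build a pinned requirement string suitable for pip install."""
    return f"tensorflow-io=={matching_tfio(tf_version)}"


print(pip_requirement("2.5.1"))
print(pip_requirement("1.15.0"))
```

TensorFlow versions released after the last row of the table are not covered and raise a ValueError rather than guessing.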

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If Docker is installed on the system, the following command will automatically build a manylinux2010-compatible whl package:

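#!/usr/bin/env bash

# List the wheels that were built.
ls dist/*
# Repair each wheel inside the manylinux2010 container so it carries the manylinux2010 platform tag.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container are owned by root; hand them back to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*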

The build takes some time, but once it completes, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
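
To check which Python versions and platforms the generated whl packages target, the tags encoded in each filename can be read back. The following is a minimal sketch that relies only on the standard wheel filename layout (name-version[-build]-python-abi-platform); the helper name wheel_tags is hypothetical, and the example filename is one of the nightly wheels published for this release.

```python
from pathlib import Path


def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Layout: name-version[-build]-python_tag-abi_tag-platform_tag
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        # A wheel may declare several platform tags separated by dots.
        "platforms": platform_tag.split("."),
    }


example = wheel_tags(
    "tensorflow_io_nightly-0.20.0.dev20210910015824-cp39-cp39-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
print(example["python"], example["platforms"])

# If a wheelhouse directory exists, summarize the wheels it contains.
wheelhouse = Path("wheelhouse")
if wheelhouse.is_dir():
    for path in sorted(wheelhouse.glob("*.whl")):
        print(path.name, wheel_tags(path.name)["python"])
```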

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python Ubuntu 18.04 Ubuntu 20.04 macOS + osx9 Windows-2019
2.7 :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: N/A
3.7 :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark:
3.8 :heavy_check_mark: :heavy_check_mark: :heavy_check_mark: :heavy_check_mark:

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | :heavy_check_mark: | | :heavy_check_mark: | |
| Apache Ignite | :heavy_check_mark: | | :heavy_check_mark: | |
| Prometheus | :heavy_check_mark: | | :heavy_check_mark: | |
| Google PubSub | | :heavy_check_mark: | :heavy_check_mark: | |
| Azure Storage | | :heavy_check_mark: | :heavy_check_mark: | |
| AWS Kinesis | | :heavy_check_mark: | :heavy_check_mark: | |
| Alibaba Cloud OSS | | | | :heavy_check_mark: |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |
| MongoDB (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

tensorflow_io_nightly-0.20.0.dev20210910015824-cp39-cp39-win_amd64.whl (21.4 MB view details)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210910015824-cp39-cp39-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210910015824-cp38-cp38-win_amd64.whl (21.4 MB view details)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210910015824-cp38-cp38-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210910015824-cp37-cp37m-win_amd64.whl (21.4 MB view details)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210910015824-cp37-cp37m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.7m macOS 10.14+ x86-64

tensorflow_io_nightly-0.20.0.dev20210910015824-cp36-cp36m-win_amd64.whl (21.4 MB view details)

Uploaded CPython 3.6m Windows x86-64

tensorflow_io_nightly-0.20.0.dev20210910015824-cp36-cp36m-macosx_10_14_x86_64.whl (22.8 MB view details)

Uploaded CPython 3.6m macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 51eb30286857728aa6bdc3ecaab25647c940ebcb6beca64dd4cf2b128802077e
MD5 ccc054fc3a8d9e3c99733b060988ae38
BLAKE2b-256 47905c1eda0ab2fffdd4066913bcbf33b826df38ec2e99a73e586b7f3db627b6

See more details on using hashes here.
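
A downloaded file can be checked against a listed digest before installation. The sketch below is one way to do that with the Python standard library; the function names sha256_of and verify_sha256 are hypothetical, and the demonstration uses a throwaway file rather than a real wheel.

```python
import hashlib
import hmac
import os
import tempfile


def sha256_of(path, chunk_size=1 << 20) -> str:
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected_hex: str) -> bool:
    """Return True if the file's SHA256 digest equals the expected hex digest."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())


# Demonstration with a throwaway file standing in for a downloaded wheel.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example wheel contents")
    demo_path = tmp.name

expected = hashlib.sha256(b"example wheel contents").hexdigest()
print(verify_sha256(demo_path, expected))  # digest matches
print(verify_sha256(demo_path, "0" * 64))  # digest does not match
os.remove(demo_path)
```

For a real download, pass the path of the whl file and the SHA256 value listed for that exact filename.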

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 71e92faa539f66ade6379069d4fe2a7c740141fa1ae873f75cecbd156b48bd72
MD5 6345d00d0b9eb7b899ad859fd92b664f
BLAKE2b-256 a8f5b925182aec662e06ff5dc2dd162fd64666855d70f35bf269001d05e2341b

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 9cb6fe9928b78e52c1e34e419a6863328d67355ddf5eaa84029f3baba093bcd4
MD5 1ce8216e9fe593f8d4045c3d5555295f
BLAKE2b-256 012b24999205cb82f616fab4c447814e2ee15adb6fb67665c6922c5dd68fe9d1

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 456384512b1844fe1848f019d75bf63f21a45c1cbe8c17e816ef00a4ef0d83a2
MD5 30c4313b76046513abf805cd1e633713
BLAKE2b-256 3b470571b028fb18f636775c9e06a481c8cf5d8127a45db021006716041bed44

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 c92c37c0f845674ee48b279d39e67fb803263d1bb8a2e8c65582fd37ceaa01c6
MD5 dc7002832b57d6ca9ba97205510e9638
BLAKE2b-256 b109bb20893079930763fb4e8e873c6e34da2e3d14d9f1c01b2f750c4064e60e

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 f610c82050901659e08115e134d45be7525b08cdd90df8c2e2cbf72f815e5b66
MD5 ee3c8e7325539b1d782d94a85625e919
BLAKE2b-256 d8e9121e2008e1f95f0ffedcee0385ceb253732558ee09d19df297a159ed4bfd

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp37-cp37m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 f37f8a4f7a8914a437f4bba5f3675d7bb3f137861fe56a61f29eb53fe657e7a7
MD5 d479f5d059bb92c277d3cc1a9faece96
BLAKE2b-256 a78ff174c622aee79e1cfdac62dc9abcda4a3931ba23f6f10568826ff0f55e95

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 d738c9c607ee8ebb636618c915a18949ecb86e25b8134b0af1196500ffd1a560
MD5 5e23e4cd878819ec3d5e01b403189c56
BLAKE2b-256 4e4e8a6899da430bc7aa1a0874b38dca1ed509afacde01b8c0a925a5399da5af

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 dcc51ebace1c709c1355b8f2af7d7a4faca26edfd8fa59bdd7bddbb0a731b499
MD5 9288d27d28e2f8da5efe76811b5af5ed
BLAKE2b-256 4e058cd9391e0afe32396d4d49215af93327089c061b2cdee72f7b3093446cd5

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp36-cp36m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 3e4dbb76d5804fb6858488f63abe4a8c49ebe1cfcbc12c85719acced8953daa7
MD5 9ab3e9cfa9aa29240ebd45fec9d4ed0d
BLAKE2b-256 02d6f512791c35a32434b506c2f60a62582e0015c88783df32ccd961412cfcb9

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 b2512c8b2dabfbccf13aa1076c5c1da4fe8e4bd38508a33c24730f46692e2392
MD5 c0c15439ef1d630d2a5cd9f239dcbdb9
BLAKE2b-256 7b6daf0dc410a61ed8eaf5669dea993ca56084b2206678d64e7f546398399045

See more details on using hashes here.

File details

Details for the file tensorflow_io_nightly-0.20.0.dev20210910015824-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.20.0.dev20210910015824-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 eb7b1eeac308ba9655c94174742c04451804ee3ed0530a86b096e99a40824803
MD5 48a4019d01343d11801ab53b44f1279f
BLAKE2b-256 14a2fe1d18c5b6cf725fddfe29f30facc5f3bb21a2c5ddc7ad5a270bfda55eab

See more details on using hashes here.
