TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
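
To illustrate what automatic detection of compressed input can look like, here is a minimal standard-library sketch. It is not tensorflow-io's actual implementation. It assumes that gzip data is recognized by its two-byte magic number (0x1f 0x8b) and that MNIST image files use the IDX layout (a big-endian magic number 0x00000803 followed by the image count, rows, and columns). The helper names maybe_gunzip and parse_idx3_header are hypothetical, and the payload is synthetic.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if data looks gzip-compressed, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def parse_idx3_header(data: bytes) -> tuple:
    """Parse the 16-byte header of an IDX3 (image) file into (count, rows, cols)."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 0x00000803:
        raise ValueError(f"not an IDX3 ubyte file: magic={magic:#010x}")
    return count, rows, cols


# Synthetic IDX3 payload (an assumption for illustration): 2 images of 28x28 zero pixels.
raw = struct.pack(">IIII", 0x00000803, 2, 28, 28) + bytes(2 * 28 * 28)
compressed = gzip.compress(raw)

# The compressed and the uncompressed form yield the same header.
print(parse_idx3_header(maybe_gunzip(compressed)))
print(parse_idx3_header(maybe_gunzip(raw)))
```

Checking the magic number rather than the file extension means the same code path handles both forms, which is the behavior the note describes from the caller's point of view.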

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
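
After installing, it can be useful to confirm which of the packages are present in the current environment. The following is a small sketch using only the standard library's importlib.metadata (Python 3.8+); the helper name installed_version is hypothetical.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the stable and nightly packages alongside TensorFlow itself.
for name in ("tensorflow", "tensorflow-io", "tensorflow-io-nightly"):
    print(name, installed_version(name) or "not installed")
```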

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version | TensorFlow Compatibility | Release Date
0.17.1 | 2.4.x | Apr 16, 2021
0.17.0 | 2.4.x | Dec 14, 2020
0.16.0 | 2.3.x | Oct 23, 2020
0.15.0 | 2.3.x | Aug 03, 2020
0.14.0 | 2.2.x | Jul 08, 2020
0.13.0 | 2.2.x | May 10, 2020
0.12.0 | 2.1.x | Feb 28, 2020
0.11.0 | 2.1.x | Jan 10, 2020
0.10.0 | 2.0.x | Dec 05, 2019
0.9.1 | 2.0.x | Nov 15, 2019
0.9.0 | 2.0.x | Oct 18, 2019
0.8.1 | 1.15.x | Nov 15, 2019
0.8.0 | 1.15.x | Oct 17, 2019
0.7.2 | 1.14.x | Nov 15, 2019
0.7.1 | 1.14.x | Oct 18, 2019
0.7.0 | 1.14.x | Jul 14, 2019
0.6.0 | 1.13.x | May 29, 2019
0.5.0 | 1.13.x | Apr 12, 2019
0.4.0 | 1.13.x | Mar 01, 2019
0.3.0 | 1.12.0 | Feb 15, 2019
0.2.0 | 1.12.0 | Jan 29, 2019
0.1.0 | 1.12.0 | Dec 16, 2018
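
The table lookup can also be done programmatically. The following is a minimal sketch, assuming the rows are copied by hand from the table above; the names COMPATIBILITY and compatible_tfio_versions are hypothetical and not part of tensorflow-io. A pattern ending in ".x" is treated as matching any patch release of that series, and any other pattern is matched exactly.

```python
# (TensorFlow I/O version, TensorFlow compatibility) pairs copied from the table.
COMPATIBILITY = [
    ("0.17.1", "2.4.x"), ("0.17.0", "2.4.x"),
    ("0.16.0", "2.3.x"), ("0.15.0", "2.3.x"),
    ("0.14.0", "2.2.x"), ("0.13.0", "2.2.x"),
    ("0.12.0", "2.1.x"), ("0.11.0", "2.1.x"),
    ("0.10.0", "2.0.x"), ("0.9.1", "2.0.x"), ("0.9.0", "2.0.x"),
    ("0.8.1", "1.15.x"), ("0.8.0", "1.15.x"),
    ("0.7.2", "1.14.x"), ("0.7.1", "1.14.x"), ("0.7.0", "1.14.x"),
    ("0.6.0", "1.13.x"), ("0.5.0", "1.13.x"), ("0.4.0", "1.13.x"),
    ("0.3.0", "1.12.0"), ("0.2.0", "1.12.0"), ("0.1.0", "1.12.0"),
]


def _matches(pattern: str, version: str) -> bool:
    """Match "2.4.x" against any 2.4 release; match other patterns exactly."""
    if pattern.endswith(".x"):
        # "2.4.x" -> prefix "2.4."; the appended dot lets a bare "2.4" match too.
        return (version + ".").startswith(pattern[:-1])
    return version == pattern


def compatible_tfio_versions(tf_version: str) -> list:
    """Return the TensorFlow I/O versions listed for a TensorFlow version, newest first."""
    return [tfio for tfio, pattern in COMPATIBILITY if _matches(pattern, tf_version)]


print(compatible_tfio_versions("2.4.1"))
print(compatible_tfio_versions("1.12.0"))
```

An empty list means the table has no entry for that TensorFlow version, in which case the list of releases is the place to check.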

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

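#!/usr/bin/env bash

# List the wheels that are about to be processed.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Restore ownership of files the container created as root.
sudo chown -R "$(id -nu):$(id -ng)" .
# List the repaired wheels.
ls wheelhouse/*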

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
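
To check which Python version and platform a generated whl package targets, the tags can be read from its file name. The following is a minimal sketch, assuming the standard wheel naming convention (distribution, version, optional build tag, Python tag, ABI tag, platform tag, separated by hyphens). The function name wheel_tags and the example file name are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def wheel_tags(filename: str) -> dict:
    """Split a wheel file name into its name, version, and compatibility tags."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {name}")
    parts = name[: -len(".whl")].split("-")
    # 5 parts without a build tag, 6 parts with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {name}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# Illustrative file name, not taken from an actual build.
example = "wheelhouse/tensorflow_io-0.17.1-cp37-cp37m-manylinux2010_x86_64.whl"
print(wheel_tags(example))
```

Splitting on hyphens is safe here because the wheel convention replaces hyphens inside the distribution name with underscores.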

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019
2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A
3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:
3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, are run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
