TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
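The automatic decompression described in the note can be pictured with a small standard-library sketch. This is not how tensorflow-io implements it internally; it only assumes the gzip magic bytes (0x1f 0x8b) and the IDX header layout used by the MNIST files (two zero bytes, a type code, a dimension count, then big-endian 32-bit sizes). The helper names maybe_gunzip and read_idx_header are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header and return (type_code, shape)."""
    raw = maybe_gunzip(data)
    zero, type_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, raw[4 : 4 + 4 * ndim])
    return type_code, shape


# Build a tiny fake "images" file: 2 images of 28x28 uint8 (type code 0x08).
header = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

plain = read_idx_header(payload)
compressed = read_idx_header(gzip.compress(payload))
print(plain, compressed)
```

Both calls report the same type code and shape, which is the behaviour the note relies on: the caller does not need to know whether the file was compressed.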

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images are available to help you get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
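As a minimal sketch of how the table above might be used in a setup script, the lookup below maps a TensorFlow version to the newest tensorflow-io release listed for its minor series. The dictionary is copied from the table; the function name tfio_requirement is hypothetical and not part of tensorflow-io.

```python
# Newest tensorflow-io release per TensorFlow minor series, copied from the table.
# Note: the table lists exactly 1.12.0 for the oldest releases; treating it as a
# "1.12" series is a simplifying assumption of this sketch.
TFIO_FOR_TF = {
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def tfio_requirement(tf_version: str) -> str:
    """Return a pip requirement string for the given TensorFlow version."""
    series = ".".join(tf_version.split(".")[:2])
    try:
        return "tensorflow-io==" + TFIO_FOR_TF[series]
    except KeyError:
        raise ValueError(
            f"no tensorflow-io release listed for TensorFlow {tf_version}"
        ) from None


print(tfio_requirement("2.4.1"))   # tensorflow-io==0.17.1
print(tfio_requirement("1.15.0"))  # tensorflow-io==0.8.1
```

The resulting string can be passed to pip install as is. Versions outside the table raise a ValueError rather than guessing.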

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

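#!/usr/bin/env bash

# List the wheels that were built, then repair each one inside the manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container are owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*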

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
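One way to check which Python versions the repaired wheels target is to read the compatibility tags from their filenames. The sketch below assumes the standard wheel naming scheme, in which the last three dash-separated fields are the Python, ABI, and platform tags; the helper names wheel_tags and python_tags_in are hypothetical, and the example filename is a nightly wheel name used purely as sample input.

```python
from pathlib import Path


def wheel_tags(filename: str):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    stem = filename[: -len(".whl")] if filename.endswith(".whl") else filename
    # The last three dash-separated fields are the compatibility tags.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def python_tags_in(directory: str = "wheelhouse"):
    """Return the sorted Python tags found among the wheels in `directory`."""
    return sorted({wheel_tags(p.name)[0] for p in Path(directory).glob("*.whl")})


example = (
    "tensorflow_io_nightly-0.18.0.dev20210502094850"
    "-cp37-cp37m-manylinux2010_x86_64.whl"
)
print(wheel_tags(example))
```

Calling python_tags_in() after the build would list tags such as cp36 or cp37 for whatever wheels actually ended up in wheelhouse.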

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific Python version, you have to alias that version of Python to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are done on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✓ | ✓ | ✓ | N/A |
| 3.7 | ✓ | ✓ | ✓ | ✓ |
| 3.8 | ✓ | ✓ | ✓ | ✓ |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are done through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.
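When running such tests locally, it can help to check whether the live system or emulator is reachable before starting. The following is a minimal sketch, not the project's actual CI code: the helper service_reachable is hypothetical, and localhost:9092 is only assumed here because 9092 is Kafka's conventional default port.

```python
import socket


def service_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical usage: decide whether live Kafka tests can run on this machine.
if service_reachable("localhost", 9092):
    print("Kafka broker reachable, live tests can run")
else:
    print("Kafka broker not reachable, skipping live tests")
```

A successful TCP connection only shows that something is listening on the port; it does not prove the service is healthy, so a test suite may still want a protocol-level check.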

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✓ | | ✓ | |
| Apache Ignite | ✓ | | ✓ | |
| Prometheus | ✓ | | ✓ | |
| Google PubSub | | ✓ | ✓ | |
| Azure Storage | | ✓ | ✓ | |
| AWS Kinesis | | ✓ | ✓ | |
| Alibaba Cloud OSS | | | | ✓ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✓ | | ✓ | |
| MongoDB (experimental) | ✓ | | ✓ | |

References for emulators:

Community

Additional Information

License

Apache License 2.0


