TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which removes the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
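
As a rough illustration of what automatic detection involves, the sketch below uses only the Python standard library to check for the gzip magic bytes and to read the header of an IDX file (the format of the MNIST files). It is not how tensorflow-io implements this internally; the helper names maybe_gunzip and read_idx_header are hypothetical, and the payload is synthetic rather than downloaded.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if data is gzip-compressed, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header and return (dtype_code, shape).

    IDX files start with two zero bytes, a one-byte type code
    (0x08 means unsigned byte) and a one-byte dimension count,
    followed by one big-endian 32-bit size per dimension.
    """
    zero1, zero2, dtype_code, ndim = struct.unpack(">BBBB", data[:4])
    if zero1 != 0 or zero2 != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, shape


# Build a tiny synthetic IDX payload: two 28x28 uint8 images, all zeros.
raw = struct.pack(">BBBB", 0, 0, 0x08, 3) + struct.pack(">III", 2, 28, 28)
raw += bytes(2 * 28 * 28)

# The same header is recovered whether or not the payload is compressed.
header_plain = read_idx_header(maybe_gunzip(raw))
header_gzip = read_idx_header(maybe_gunzip(gzip.compress(raw)))
print(header_plain, header_gzip)
```

Checking the first bytes rather than the file extension means the same code path handles both compressed and uncompressed inputs.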

Please see the official documentation for more detailed usage examples.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
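
To confirm which versions ended up installed, a small check with the standard library's importlib.metadata (Python 3.8 or later) can help. This is a minimal sketch; the helper name installed_version is hypothetical, and note that the nightly build is published under the separate distribution name tensorflow-io-nightly.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report what is installed in the current environment.
for name in ("tensorflow", "tensorflow-io"):
    print(name, installed_version(name) or "not installed")
```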

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version  TensorFlow Compatibility  Release Date
0.25.0                  2.8.x                     Apr 19, 2022
0.24.0                  2.8.x                     Feb 04, 2022
0.23.1                  2.7.x                     Dec 15, 2021
0.23.0                  2.7.x                     Dec 14, 2021
0.22.0                  2.7.x                     Nov 10, 2021
0.21.0                  2.6.x                     Sep 12, 2021
0.20.0                  2.6.x                     Aug 11, 2021
0.19.1                  2.5.x                     Jul 25, 2021
0.19.0                  2.5.x                     Jun 25, 2021
0.18.0                  2.5.x                     May 13, 2021
0.17.1                  2.4.x                     Apr 16, 2021
0.17.0                  2.4.x                     Dec 14, 2020
0.16.0                  2.3.x                     Oct 23, 2020
0.15.0                  2.3.x                     Aug 03, 2020
0.14.0                  2.2.x                     Jul 08, 2020
0.13.0                  2.2.x                     May 10, 2020
0.12.0                  2.1.x                     Feb 28, 2020
0.11.0                  2.1.x                     Jan 10, 2020
0.10.0                  2.0.x                     Dec 05, 2019
0.9.1                   2.0.x                     Nov 15, 2019
0.9.0                   2.0.x                     Oct 18, 2019
0.8.1                   1.15.x                    Nov 15, 2019
0.8.0                   1.15.x                    Oct 17, 2019
0.7.2                   1.14.x                    Nov 15, 2019
0.7.1                   1.14.x                    Oct 18, 2019
0.7.0                   1.14.x                    Jul 14, 2019
0.6.0                   1.13.x                    May 29, 2019
0.5.0                   1.13.x                    Apr 12, 2019
0.4.0                   1.13.x                    Mar 01, 2019
0.3.0                   1.12.0                    Feb 15, 2019
0.2.0                   1.12.0                    Jan 29, 2019
0.1.0                   1.12.0                    Dec 16, 2018
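
The table can also be consulted programmatically. The sketch below copies only the six most recent release series from the table into a dictionary and checks a version pair against it; the names COMPATIBILITY, minor_series, and is_compatible are hypothetical and not part of tensorflow-io.

```python
# A few rows copied from the table above:
# TensorFlow I/O release series -> compatible TensorFlow release series.
COMPATIBILITY = {
    "0.25": "2.8",
    "0.24": "2.8",
    "0.23": "2.7",
    "0.22": "2.7",
    "0.21": "2.6",
    "0.20": "2.6",
}


def minor_series(version: str) -> str:
    """Reduce a version such as '0.23.1' to its 'major.minor' series, '0.23'."""
    return ".".join(version.split(".")[:2])


def is_compatible(tfio_version: str, tf_version: str) -> bool:
    """Check a version pair against the table; unknown releases raise KeyError."""
    return COMPATIBILITY[minor_series(tfio_version)] == minor_series(tf_version)


print(is_compatible("0.23.1", "2.7.0"))
print(is_compatible("0.25.0", "2.7.3"))
```

Raising KeyError for releases missing from the dictionary is a deliberate choice here: an unknown release should be looked up in the full table rather than silently treated as incompatible.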

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following script will automatically build a manylinux2010-compatible whl package:

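#!/usr/bin/env bash

# List the built wheels, then repair each one inside the manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the files written by the container back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*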

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
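
To see which Python version and platform a given whl file targets, its filename can be split into the standard wheel naming fields. The sketch below assumes the usual layout {distribution}-{version}(-{build})-{python}-{abi}-{platform}.whl; the helper parse_wheel_filename and the sample filename are illustrative only, and the files actually produced in wheelhouse may be named differently.

```python
def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into the fields of the wheel naming convention.

    Format: {distribution}-{version}(-{build})?-{python}-{abi}-{platform}.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    # The last three fields are always the python, abi and platform tags.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# Illustrative filename only; the real output of the build may differ.
info = parse_wheel_filename("tensorflow_io-0.25.0-cp37-cp37m-manylinux2010_x86_64.whl")
print(info["python"], info["platform"])
```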

On macOS, the same command can be used. However, the script expects python to be available in the shell and only generates a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python  Ubuntu 18.04  Ubuntu 20.04  macOS + osx9  Windows-2019
2.7     ✔             ✔             ✔             N/A
3.7     ✔             ✔             ✔             ✔
3.8     ✔             ✔             ✔             ✔

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested against live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:
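
One common way to let the same test run both where a live system or emulator is available and where it is not is to probe the service port first and skip the test when nothing answers. The sketch below shows such a probe using only the standard library. It is an illustration rather than code from the TensorFlow I/O test suite; the helper service_reachable is hypothetical, and a throwaway local listener stands in for a real service.

```python
import socket


def service_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Demonstrate with a throwaway local listener instead of a real service.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

up = service_reachable("127.0.0.1", port)  # listener is accepting connections
listener.close()
down = service_reachable("127.0.0.1", port)  # nothing is listening any more
print(up, down)
```

In a test suite, the result could feed a decorator such as unittest.skipUnless, with the host and port of the live system or emulator supplied by the CI configuration.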

License

Apache License 2.0
