TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
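To make the idea of automatic detection concrete, here is a minimal pure-Python sketch of how a reader could recognize gzip data by its magic bytes and then parse the header of an IDX file (the container format used by MNIST). It is only an illustration and not tensorflow-io's implementation; the helper names maybe_decompress and parse_idx_header are hypothetical, and the input is a tiny synthetic file rather than real MNIST data.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def maybe_decompress(data: bytes) -> bytes:
    """Return the decompressed payload if data is gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def parse_idx_header(data: bytes):
    """Parse an IDX header and return (type_code, shape).

    The header is two zero bytes, one type-code byte, one byte holding the
    number of dimensions, then one big-endian uint32 per dimension.
    """
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return type_code, shape


# Synthetic "images" file: 2 images of 3x3 uint8 pixels (type code 0x08).
raw = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 3, 3) + bytes(18)
compressed = gzip.compress(raw)

# Both the compressed and the plain bytes yield the same header.
print(parse_idx_header(maybe_decompress(compressed)))  # (8, (2, 3, 3))
print(parse_idx_header(maybe_decompress(raw)))         # (8, (2, 3, 3))
```

Because the check looks only at the leading bytes, the caller does not need to know in advance whether a file is compressed, which is the convenience the note above describes.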

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install "tensorflow-io[tensorflow]"

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version   TensorFlow Compatibility   Release Date
0.26.0                   2.9.x                      May 17, 2022
0.25.0                   2.8.x                      Apr 19, 2022
0.24.0                   2.8.x                      Feb 04, 2022
0.23.1                   2.7.x                      Dec 15, 2021
0.23.0                   2.7.x                      Dec 14, 2021
0.22.0                   2.7.x                      Nov 10, 2021
0.21.0                   2.6.x                      Sep 12, 2021
0.20.0                   2.6.x                      Aug 11, 2021
0.19.1                   2.5.x                      Jul 25, 2021
0.19.0                   2.5.x                      Jun 25, 2021
0.18.0                   2.5.x                      May 13, 2021
0.17.1                   2.4.x                      Apr 16, 2021
0.17.0                   2.4.x                      Dec 14, 2020
0.16.0                   2.3.x                      Oct 23, 2020
0.15.0                   2.3.x                      Aug 03, 2020
0.14.0                   2.2.x                      Jul 08, 2020
0.13.0                   2.2.x                      May 10, 2020
0.12.0                   2.1.x                      Feb 28, 2020
0.11.0                   2.1.x                      Jan 10, 2020
0.10.0                   2.0.x                      Dec 05, 2019
0.9.1                    2.0.x                      Nov 15, 2019
0.9.0                    2.0.x                      Oct 18, 2019
0.8.1                    1.15.x                     Nov 15, 2019
0.8.0                    1.15.x                     Oct 17, 2019
0.7.2                    1.14.x                     Nov 15, 2019
0.7.1                    1.14.x                     Oct 18, 2019
0.7.0                    1.14.x                     Jul 14, 2019
0.6.0                    1.13.x                     May 29, 2019
0.5.0                    1.13.x                     Apr 12, 2019
0.4.0                    1.13.x                     Mar 01, 2019
0.3.0                    1.12.0                     Feb 15, 2019
0.2.0                    1.12.0                     Jan 29, 2019
0.1.0                    1.12.0                     Dec 16, 2018
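As a small illustration of how the table can be used, the sketch below turns a TensorFlow version string into a pinned pip requirement for tensorflow-io. The mapping is a hand-copied subset of the table (the newest TensorFlow I/O release listed for each TensorFlow 2.3 to 2.9 series), and the helper name tfio_requirement is hypothetical; it is not part of the package.

```python
# Subset of the compatibility table: TensorFlow minor series mapped to the
# newest TensorFlow I/O release listed for that series.
LATEST_TFIO_FOR_TF = {
    "2.9": "0.26.0",
    "2.8": "0.25.0",
    "2.7": "0.23.1",
    "2.6": "0.21.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
}


def tfio_requirement(tf_version: str) -> str:
    """Return a pip requirement pinning tensorflow-io for a TensorFlow version."""
    major, minor = tf_version.split(".")[:2]
    series = f"{major}.{minor}"
    if series not in LATEST_TFIO_FOR_TF:
        raise ValueError(f"TensorFlow {series}.x is not in this table subset")
    return f"tensorflow-io=={LATEST_TFIO_FOR_TF[series]}"


print(tfio_requirement("2.8.1"))  # tensorflow-io==0.25.0
```

The resulting string can be passed to pip install as is, for example when TensorFlow is already pinned in an environment and only tensorflow-io needs to be matched to it.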

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

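#!/usr/bin/env bash

# List the wheels produced by the build.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the files written by the container back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*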

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
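One way to confirm which Python version and platform a built whl package targets is to read the tags encoded in its file name. The sketch below splits a wheel file name into the components defined by the wheel file name convention (PEP 427); the WheelTags type and parse_wheel_name helper are hypothetical names used only for illustration, and the example file name is a tensorflow-io-nightly wheel for Python 3.7.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag may sit between the version and the Python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return WheelTags(*parts)


name = (
    "tensorflow_io_nightly-0.26.0.dev20220517052758-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
tags = parse_wheel_name(name)
print(tags.python_tag)               # cp37
print(tags.platform_tag.split("."))  # ['manylinux_2_12_x86_64', 'manylinux2010_x86_64']
```

A repaired Linux wheel should carry a manylinux platform tag like the one above, while a macOS build carries a macosx tag that matches the python found in the shell.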

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python   Ubuntu 18.04   Ubuntu 20.04   macOS + osx9   Windows-2019
2.7      ✓              ✓              ✓              N/A
3.7      ✓              ✓              ✓              ✓
3.8      ✓              ✓              ✓              ✓
TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

System                         Live System   Emulator      CI Integration   Offline
Apache Kafka                   ✓             -             ✓                -
Apache Ignite                  ✓             -             ✓                -
Prometheus                     ✓             -             ✓                -
Google PubSub                  -             ✓             ✓                -
Azure Storage                  -             ✓             ✓                -
AWS Kinesis                    -             ✓             ✓                -
Alibaba Cloud OSS              -             -             -                ✓
Google BigTable/BigQuery       -             to be added   -                -
Elasticsearch (experimental)   ✓             -             ✓                -
MongoDB (experimental)         ✓             -             ✓                -
References for emulators:

Community

Additional Information

License

Apache License 2.0



