TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and decompresses the MNIST dataset automatically when needed, the URLs of the gzip-compressed files can be passed to the API call as is.
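To make the note above concrete, here is a minimal sketch of how gzip input can be detected from its leading bytes before the MNIST (IDX) header is read. It uses only the Python standard library and is not tensorflow-io's actual implementation; the helper names maybe_decompress and read_idx_header are hypothetical, and the payload is synthetic rather than real MNIST data.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def maybe_decompress(data: bytes) -> bytes:
    """Return data unchanged unless it starts with the gzip magic number."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header and return (dtype_code, shape).

    An IDX file starts with two zero bytes, a one-byte data type code
    (0x08 means unsigned byte) and a one-byte dimension count, followed by
    one big-endian 32-bit size per dimension.
    """
    raw = maybe_decompress(data)
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, raw[4 : 4 + 4 * ndim])
    return dtype_code, shape


# Synthetic IDX3 payload (hypothetical data): two 28x28 images of zeros.
payload = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)

print(read_idx_header(payload))                 # uncompressed input
print(read_idx_header(gzip.compress(payload)))  # gzip-compressed input
```

Both calls report the same header, which is the property the note relies on: the caller does not need to know whether the file behind a URL is compressed.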

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow I/O, you can specify the tensorflow extra requirement during installation:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
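After installing, it can be useful to confirm which versions actually ended up in the environment. The following is a small sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_version is hypothetical and is not part of tensorflow-io.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions that pip resolved for the two packages of interest.
for package in ("tensorflow", "tensorflow-io"):
    print(f"{package}: {installed_version(package) or 'not installed'}")
```

The reported pair can then be compared against the version compatibility table in this document.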

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version    TensorFlow Compatibility    Release Date
0.23.0                    2.7.x                       Dec 14, 2021
0.22.0                    2.7.x                       Nov 10, 2021
0.21.0                    2.6.x                       Sep 12, 2021
0.20.0                    2.6.x                       Aug 11, 2021
0.19.1                    2.5.x                       Jul 25, 2021
0.19.0                    2.5.x                       Jun 25, 2021
0.18.0                    2.5.x                       May 13, 2021
0.17.1                    2.4.x                       Apr 16, 2021
0.17.0                    2.4.x                       Dec 14, 2020
0.16.0                    2.3.x                       Oct 23, 2020
0.15.0                    2.3.x                       Aug 03, 2020
0.14.0                    2.2.x                       Jul 08, 2020
0.13.0                    2.2.x                       May 10, 2020
0.12.0                    2.1.x                       Feb 28, 2020
0.11.0                    2.1.x                       Jan 10, 2020
0.10.0                    2.0.x                       Dec 05, 2019
0.9.1                     2.0.x                       Nov 15, 2019
0.9.0                     2.0.x                       Oct 18, 2019
0.8.1                     1.15.x                      Nov 15, 2019
0.8.0                     1.15.x                      Oct 17, 2019
0.7.2                     1.14.x                      Nov 15, 2019
0.7.1                     1.14.x                      Oct 18, 2019
0.7.0                     1.14.x                      Jul 14, 2019
0.6.0                     1.13.x                      May 29, 2019
0.5.0                     1.13.x                      Apr 12, 2019
0.4.0                     1.13.x                      Mar 01, 2019
0.3.0                     1.12.0                      Feb 15, 2019
0.2.0                     1.12.0                      Jan 29, 2019
0.1.0                     1.12.0                      Dec 16, 2018
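The table can also be used programmatically, for example to pick a release to pin in a requirements file. The sketch below hard-codes only the most recent rows of the table; the names COMPATIBILITY, compatible_tfio_versions, and newest_tfio_for are hypothetical helpers and not part of the tensorflow-io package.

```python
# Subset of the table above: TensorFlow I/O release -> compatible TensorFlow series.
COMPATIBILITY = {
    "0.23.0": "2.7",
    "0.22.0": "2.7",
    "0.21.0": "2.6",
    "0.20.0": "2.6",
    "0.19.1": "2.5",
    "0.19.0": "2.5",
    "0.18.0": "2.5",
    "0.17.1": "2.4",
    "0.17.0": "2.4",
}


def compatible_tfio_versions(tf_version: str) -> list:
    """Return the TensorFlow I/O releases listed for a TensorFlow version."""
    series = ".".join(tf_version.split(".")[:2])  # "2.7.0" -> "2.7"
    return [tfio for tfio, tf in COMPATIBILITY.items() if tf == series]


def newest_tfio_for(tf_version: str):
    """Return the newest listed release, or None if the series is not listed."""
    matches = compatible_tfio_versions(tf_version)
    # Compare numerically so that "0.19.1" sorts after "0.19.0".
    return max(
        matches,
        key=lambda v: tuple(int(part) for part in v.split(".")),
        default=None,
    )


print(newest_tfio_for("2.5.1"))
print(compatible_tfio_versions("2.6.0"))
```

A lookup that returns None signals a TensorFlow series outside the hard-coded subset, in which case the full table should be consulted.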

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following script will automatically build a manylinux2010-compatible whl package:

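#!/usr/bin/env bash

# List the input wheels, then repair each one inside the manylinux2010 container.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# The container runs as root, so hand the generated files back to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*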

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
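To check which Python versions and platforms the generated packages target, the wheel filenames themselves can be inspected. The sketch below splits a filename into its components following the wheel naming convention (PEP 427); the helper names parse_wheel_filename and wheels_in are hypothetical, and the sample filename is only an illustration of the naming scheme.

```python
from pathlib import Path


def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into its name, version, and tag components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:  # optional build tag between version and Python tag
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename: {filename}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        # A wheel may declare several platform tags joined by dots.
        "platforms": platform_tag.split("."),
    }


def wheels_in(directory: str = "wheelhouse") -> list:
    """Parse every wheel found in a directory (empty list if there are none)."""
    return [parse_wheel_filename(p.name) for p in sorted(Path(directory).glob("*.whl"))]


sample = (
    "tensorflow_io_nightly-0.23.0.dev20211215004050-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
info = parse_wheel_filename(sample)
print(info["python"], info["abi"], info["platforms"])
```

Calling wheels_in("wheelhouse") after the build script finishes gives one dictionary per repaired wheel, which makes it easy to confirm that the expected Python tags are present.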

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python    Ubuntu 18.04    Ubuntu 20.04    macOS + osx9    Windows-2019
2.7       ✔               ✔               ✔               N/A
3.7       ✔               ✔               ✔               ✔
3.8       ✔               ✔               ✔               ✔

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

System                          Live System    Emulator       CI Integration    Offline
Apache Kafka                    ✔              -              ✔                 -
Apache Ignite                   ✔              -              ✔                 -
Prometheus                      ✔              -              ✔                 -
Google PubSub                   -              ✔              ✔                 -
Azure Storage                   -              ✔              ✔                 -
AWS Kinesis                     -              ✔              ✔                 -
Alibaba Cloud OSS               -              -              -                 ✔
Google BigTable/BigQuery        -              to be added    -                 -
Elasticsearch (experimental)    ✔              -              ✔                 -
MongoDB (experimental)          ✔              -              ✔                 -
References for emulators:

Community

Additional Information

License

Apache License 2.0
