TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
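
To illustrate what "detect and uncompress automatically" can mean in practice, here is a minimal sketch that uses only the Python standard library. It is not tensorflow-io's implementation: the helper names maybe_gunzip and read_idx_header are hypothetical, and the sketch assumes detection through the two-byte gzip magic number and the IDX header layout used by the MNIST files.

```python
import gzip
import struct


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if `data` starts with the gzip magic number."""
    if data[:2] == b"\x1f\x8b":
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header and return (dtype_code, shape).

    Layout: two zero bytes, one dtype byte (0x08 = unsigned byte),
    one byte giving the number of dimensions, then one big-endian
    32-bit size per dimension.
    """
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, shape


# Build a tiny fake "images" file: 2 images of 28x28 zero pixels.
raw = struct.pack(">HBB3I", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)
compressed = gzip.compress(raw)

# Both the compressed and the plain bytes yield the same header.
print(read_idx_header(maybe_gunzip(compressed)))
print(read_idx_header(maybe_gunzip(raw)))
```

Checking the leading bytes rather than the file extension means the same code path handles both compressed and uncompressed inputs.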

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
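
After installing, it can be handy to confirm which of the two distributions is present in the current environment. The following is a small sketch that relies on the standard library's importlib.metadata (Python 3.8 or later); the helper name installed_tfio_versions is hypothetical.

```python
from importlib import metadata


def installed_tfio_versions():
    """Map each tensorflow-io distribution name to its installed version or None."""
    versions = {}
    for name in ("tensorflow-io", "tensorflow-io-nightly"):
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


for name, version in installed_tfio_versions().items():
    print(f"{name}: {version or 'not installed'}")
```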

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version    TensorFlow Compatibility    Release Date
0.17.1                    2.4.x                       Apr 16, 2021
0.17.0                    2.4.x                       Dec 14, 2020
0.16.0                    2.3.x                       Oct 23, 2020
0.15.0                    2.3.x                       Aug 03, 2020
0.14.0                    2.2.x                       Jul 08, 2020
0.13.0                    2.2.x                       May 10, 2020
0.12.0                    2.1.x                       Feb 28, 2020
0.11.0                    2.1.x                       Jan 10, 2020
0.10.0                    2.0.x                       Dec 05, 2019
0.9.1                     2.0.x                       Nov 15, 2019
0.9.0                     2.0.x                       Oct 18, 2019
0.8.1                     1.15.x                      Nov 15, 2019
0.8.0                     1.15.x                      Oct 17, 2019
0.7.2                     1.14.x                      Nov 15, 2019
0.7.1                     1.14.x                      Oct 18, 2019
0.7.0                     1.14.x                      Jul 14, 2019
0.6.0                     1.13.x                      May 29, 2019
0.5.0                     1.13.x                      Apr 12, 2019
0.4.0                     1.13.x                      Mar 01, 2019
0.3.0                     1.12.0                      Feb 15, 2019
0.2.0                     1.12.0                      Jan 29, 2019
0.1.0                     1.12.0                      Dec 16, 2018
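
The table above can also be consulted programmatically. The sketch below copies the highest TensorFlow I/O version listed for each TensorFlow series into a dictionary and looks up a match; the names LATEST_TFIO_FOR_TF and matching_tfio_version are hypothetical, and the data only covers the releases listed here.

```python
# Highest tensorflow-io version listed for each TensorFlow series,
# copied from the compatibility table above.
LATEST_TFIO_FOR_TF = {
    "2.4.x": "0.17.1",
    "2.3.x": "0.16.0",
    "2.2.x": "0.14.0",
    "2.1.x": "0.12.0",
    "2.0.x": "0.10.0",
    "1.15.x": "0.8.1",
    "1.14.x": "0.7.2",
    "1.13.x": "0.6.0",
    "1.12.0": "0.3.0",
}


def matching_tfio_version(tf_version: str):
    """Return the listed tensorflow-io version for `tf_version`, or None."""
    parts = tf_version.split(".")
    if len(parts) < 3:
        raise ValueError("expected a version like '2.4.1'")
    major, minor, patch = parts[:3]
    for pattern, tfio_version in LATEST_TFIO_FOR_TF.items():
        p_major, p_minor, p_patch = pattern.split(".")
        # "x" matches any patch release; otherwise the patch must be equal.
        if (major, minor) == (p_major, p_minor) and p_patch in ("x", patch):
            return tfio_version
    return None


print(matching_tfio_version("2.4.1"))
```

A result of None means the table has no entry for that TensorFlow version, in which case the list of releases is the place to look.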

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch and makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

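#!/usr/bin/env bash

# List the wheels that were built.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container are owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*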
It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
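
Since each whl package targets one Python version, it can help to check a wheel's tags against the interpreter at hand. The sketch below splits a wheel filename into its standard dash-separated components and compares the Python tag with the running interpreter. The helper names are hypothetical, the tag helper assumes CPython, and the example filename is one of the nightly wheels published for this project.

```python
import sys


def parse_wheel_filename(filename: str):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError("unexpected wheel filename layout")
    # An optional build tag may sit between the version and the Python tag,
    # so the last three parts are always the python, abi, and platform tags.
    name, version = parts[0], parts[1]
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def current_cpython_tag() -> str:
    """Python tag, such as 'cp37', for the interpreter running this script."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


wheel = "tensorflow_io_nightly-0.18.0.dev20210430205854-cp37-cp37m-manylinux2010_x86_64.whl"
info = parse_wheel_filename(wheel)
print(info["python"], info["platform"])
print("matches this interpreter:", info["python"] == current_cpython_tag())
```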

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different python3 versions to ensure good coverage:

Python    Ubuntu 18.04          Ubuntu 20.04          macOS + osx9          Windows-2019
2.7       :heavy_check_mark:    :heavy_check_mark:    :heavy_check_mark:    N/A
3.7       :heavy_check_mark:    :heavy_check_mark:    :heavy_check_mark:    :heavy_check_mark:
3.8       :heavy_check_mark:    :heavy_check_mark:    :heavy_check_mark:    :heavy_check_mark:

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:
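
As an illustration of the live-system approach, a test can probe for the service first and skip itself when nothing is listening. This is only a sketch of the general idea, not the project's actual test code: the helper is_reachable, the class KafkaLiveTest, and the localhost:9092 address (Kafka's default port) are assumptions made for the example.

```python
import socket
import unittest


def is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical address of a Kafka broker started by the CI job.
KAFKA_HOST, KAFKA_PORT = "localhost", 9092


class KafkaLiveTest(unittest.TestCase):
    # The probe runs once, when the class is defined; the test is skipped
    # on machines where no broker was started.
    @unittest.skipUnless(
        is_reachable(KAFKA_HOST, KAFKA_PORT), "no live Kafka broker available"
    )
    def test_broker_accepts_connections(self):
        self.assertTrue(is_reachable(KAFKA_HOST, KAFKA_PORT))
```

Running the file with python -m unittest reports the test as skipped on a machine without a broker instead of failing.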

References for emulators:

Community

Additional Information

License

Apache License 2.0
