TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "http://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP file system, which eliminates the need to download and save the datasets to a local directory.

NOTE: Because tensorflow-io detects and decompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
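To illustrate what "detect and uncompress automatically" can mean in practice, here is a minimal standard-library sketch. It is not how tensorflow-io implements the feature; it only assumes the well-known gzip magic bytes and the IDX header layout used by the MNIST files. The helper names maybe_decompress and read_idx_header are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(raw: bytes) -> bytes:
    """Return decompressed bytes if `raw` is gzip data, else `raw` unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_header(raw: bytes):
    """Parse an IDX header and return (dtype_code, shape).

    IDX layout: two zero bytes, one dtype byte (0x08 = uint8), one byte
    holding the number of dimensions, then one big-endian uint32 per dimension.
    """
    data = maybe_decompress(raw)
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, shape


# Build a tiny synthetic "image" file: 2 images of 28x28 uint8 pixels.
header = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

# The same header is recovered from the plain and the gzip-compressed bytes.
plain = read_idx_header(payload)
zipped = read_idx_header(gzip.compress(payload))
print(plain, zipped)
```

The caller never has to say whether the input is compressed, which is the convenience the note above describes.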

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
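As a rough sketch of how the table could be used programmatically, the snippet below maps a TensorFlow version to the newest TensorFlow I/O release listed for that series and formats a pinned pip requirement. The dictionary is copied by hand from the table above, and the function names are hypothetical helpers, not part of tensorflow-io.

```python
# TensorFlow minor series -> newest TensorFlow I/O release listed for it
# in the compatibility table. The table lists 1.12.0 exactly; it is keyed
# here as "1.12" for simplicity.
TFIO_FOR_TF = {
    "2.4": "0.17.0",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def recommended_tfio(tf_version: str) -> str:
    """Return the newest listed tensorflow-io release for a TensorFlow version."""
    series = ".".join(tf_version.split(".")[:2])
    try:
        return TFIO_FOR_TF[series]
    except KeyError:
        raise ValueError(f"TensorFlow {tf_version} is not in the table") from None


def pip_requirement(tf_version: str) -> str:
    """Format a pinned requirement string that can be passed to pip install."""
    return f"tensorflow-io=={recommended_tfio(tf_version)}"


print(pip_requirement("2.3.1"))  # tensorflow-io==0.16.0
```

Versions that are missing from the table raise an error instead of guessing, since compatibility outside the listed pairs is not stated here.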

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command automatically builds a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
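Which Python version and platform a given whl file targets is encoded in its file name. The sketch below splits a wheel file name into its fields, assuming the standard five-field layout with no optional build tag. The parse_wheel_filename helper and the sample file name are illustrative only.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel file name that has no optional build tag into its fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected number of fields in {filename}")
    return WheelTags(*parts)


# Sample file name, used only for illustration.
tags = parse_wheel_filename("tensorflow_io-0.17.0-cp37-cp37m-manylinux2010_x86_64.whl")
print(tags.python_tag, tags.platform_tag)  # cp37 manylinux2010_x86_64
```

Running this over the files in wheelhouse is a quick way to confirm that a wheel was produced for each expected Python version.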

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
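Before building, it can help to check which interpreter python resolves to in the current shell. The following sketch prints the interpreter path and the CPython tag a wheel built with it would usually carry. It assumes a CPython interpreter, and cpython_tag is a hypothetical helper.

```python
import sys


def cpython_tag(version_info=sys.version_info) -> str:
    """Return the CPython wheel tag (for example 'cp37') for a version tuple."""
    return f"cp{version_info[0]}{version_info[1]}"


# Run this with the same `python` the build script would pick up.
print("Interpreter:", sys.executable)
print("Wheel tag this interpreter would target:", cpython_tag())
```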

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against these systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.
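One common way to combine live-system tests with offline tests in a single suite is to skip the live ones unless the CI job signals that the service is running. The sketch below shows that pattern with the standard unittest module. The EXAMPLE_KAFKA_LIVE environment variable and the test class are hypothetical and do not reflect the project's actual CI setup.

```python
import os
import unittest

# Hypothetical switch that a CI job would set after starting a broker.
KAFKA_AVAILABLE = os.environ.get("EXAMPLE_KAFKA_LIVE") == "1"


class KafkaExampleTest(unittest.TestCase):
    @unittest.skipUnless(KAFKA_AVAILABLE, "needs a live Kafka broker")
    def test_live(self):
        # A real test would talk to the broker here.
        self.assertTrue(KAFKA_AVAILABLE)

    def test_offline(self):
        # Offline tests only use data that ships with the test suite.
        record = b"key:value"
        self.assertEqual(record.split(b":"), [b"key", b"value"])


def run_suite() -> unittest.TestResult:
    """Load and run the example tests, returning the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(KafkaExampleTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_suite()
print("run:", result.testsRun, "skipped:", len(result.skipped))
```

On a machine without the switch set, the live test is reported as skipped while the offline test still runs, which mirrors the coverage difference described above.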

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0


