TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets in a local directory.

NOTE: Since tensorflow-io can detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
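
To make the note above more concrete, here is a minimal standard-library sketch of the two ideas involved: checking for the gzip magic bytes before decompressing, and reading the big-endian IDX header used by the MNIST files. It is an illustration only and is not how tensorflow-io implements this internally; the helper names `maybe_gunzip` and `parse_idx` are hypothetical, and the data is a tiny synthetic buffer rather than the real dataset.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(raw: bytes) -> bytes:
    """Return decompressed bytes if `raw` is gzip data, else `raw` unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def parse_idx(raw: bytes):
    """Parse an unsigned-byte IDX buffer into (dims, payload).

    IDX header: two zero bytes, a type code (0x08 = unsigned byte),
    the number of dimensions, then one big-endian uint32 per dimension.
    """
    data = maybe_gunzip(raw)
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    dims = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    payload = data[4 + 4 * ndim :]
    return dims, payload


# Build a tiny synthetic "images" buffer: 2 images of 2x2 pixels.
header = struct.pack(">HBBIII", 0, 0x08, 3, 2, 2, 2)
pixels = bytes(range(8))
plain = header + pixels
compressed = gzip.compress(plain)

# The same parser handles both the plain and the gzip-compressed buffer.
dims_plain, payload_plain = parse_idx(plain)
dims_gz, payload_gz = parse_idx(compressed)
print(dims_plain, dims_gz)
```

The point of the sketch is that the caller never states whether the input is compressed; the reader inspects the first bytes and decides.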

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
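
After installing, it can be useful to confirm which version pip actually placed in the current environment. The sketch below uses only the standard library (`importlib.metadata`, available from Python 3.8); the helper name `installed_version` is hypothetical, and the two distribution names are the pip package names shown above.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("tensorflow-io", "tensorflow-io-nightly"):
    found = installed_version(name)
    print(f"{name}: {found if found else 'not installed'}")
```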

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
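
For scripted environments, the table above can be turned into a small lookup. The following is a sketch, not part of the package: `COMPATIBILITY` simply copies the first two columns of the table, and `tfio_versions_for` is a hypothetical helper that treats `2.4.x` as "any 2.4 patch release" and `1.12.0` as an exact match.

```python
# (TensorFlow I/O version, compatible TensorFlow version), newest first,
# copied from the compatibility table above.
COMPATIBILITY = [
    ("0.18.0", "2.5.x"),
    ("0.17.1", "2.4.x"),
    ("0.17.0", "2.4.x"),
    ("0.16.0", "2.3.x"),
    ("0.15.0", "2.3.x"),
    ("0.14.0", "2.2.x"),
    ("0.13.0", "2.2.x"),
    ("0.12.0", "2.1.x"),
    ("0.11.0", "2.1.x"),
    ("0.10.0", "2.0.x"),
    ("0.9.1", "2.0.x"),
    ("0.9.0", "2.0.x"),
    ("0.8.1", "1.15.x"),
    ("0.8.0", "1.15.x"),
    ("0.7.2", "1.14.x"),
    ("0.7.1", "1.14.x"),
    ("0.7.0", "1.14.x"),
    ("0.6.0", "1.13.x"),
    ("0.5.0", "1.13.x"),
    ("0.4.0", "1.13.x"),
    ("0.3.0", "1.12.0"),
    ("0.2.0", "1.12.0"),
    ("0.1.0", "1.12.0"),
]


def matches(tf_version: str, pattern: str) -> bool:
    """Check a TensorFlow version against a table entry such as '2.4.x'."""
    if pattern.endswith(".x"):
        # Compare only the major and minor components.
        return tf_version.split(".")[:2] == pattern.split(".")[:2]
    return tf_version == pattern


def tfio_versions_for(tf_version: str) -> list:
    """Return matching TensorFlow I/O versions, newest first."""
    return [tfio for tfio, pattern in COMPATIBILITY if matches(tf_version, pattern)]


versions = tfio_versions_for("2.4.1")
if versions:
    print(f"pip install tensorflow-io=={versions[0]}")
```

An empty result means the table lists no matching release for that TensorFlow version; the sketch makes no attempt to guess one.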

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

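#!/usr/bin/env bash

# Show the wheels in dist/ before repair.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the generated files back to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*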

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that Python version. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
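
Because each generated wheel targets one Python version and platform, it helps to read the tags encoded in its file name. The sketch below splits a wheel file name following the naming convention from PEP 427 (distribution, version, optional build tag, Python tag, ABI tag, platform tag). `parse_wheel_name` and `WheelTags` are hypothetical names introduced for illustration, and the two file names are examples of tensorflow-io-nightly wheels.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python: str
    abi: str
    platform: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel file name into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, python, abi, platform = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, python, abi, platform = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelTags(dist, version, build, python, abi, platform)


examples = [
    "tensorflow_io_nightly-0.18.0.dev20210619183659-cp39-cp39-win_amd64.whl",
    "tensorflow_io_nightly-0.18.0.dev20210619183659-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl",
]
for name in examples:
    tags = parse_wheel_name(name)
    # A platform field may hold several dot-separated tags.
    print(tags.python, tags.abi, tags.platform.split("."))
```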

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✔ | ✔ | ✔ | N/A |
| 3.7 | ✔ | ✔ | ✔ | ✔ |
| 3.8 | ✔ | ✔ | ✔ | ✔ |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only through offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka ✔ ✔
Apache Ignite ✔ ✔
Prometheus ✔ ✔
Google PubSub ✔ ✔
Azure Storage ✔ ✔
AWS Kinesis ✔ ✔
Alibaba Cloud OSS ✔
Google BigTable/BigQuery to be added
Elasticsearch (experimental) ✔ ✔
MongoDB (experimental) ✔ ✔
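
One common way to structure tests against live systems or emulators is to skip them when the service is not reachable, so the same test file can run on CI and on a developer machine. The following sketch is not taken from the project's test suite: `service_available`, the test class, and the host are assumptions for illustration, and port 9092 is the default Kafka listener port.

```python
import socket
import unittest


def service_available(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical location of a locally installed broker.
KAFKA_HOST, KAFKA_PORT = "localhost", 9092


class KafkaLiveTest(unittest.TestCase):
    # The check runs once, when the class is defined.
    @unittest.skipUnless(
        service_available(KAFKA_HOST, KAFKA_PORT), "Kafka broker not reachable"
    )
    def test_broker_reachable(self):
        self.assertTrue(service_available(KAFKA_HOST, KAFKA_PORT))
```

A TCP probe only shows that something is listening on the port; it does not verify that the service behind it is healthy.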

References for emulators:

Community

Additional Information

License

Apache License 2.0
