TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
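
To illustrate what "detect and uncompress automatically" can mean in practice, here is a minimal standard-library sketch. It is not tensorflow-io's implementation: the helper names maybe_gunzip and read_idx_header are hypothetical, and the sketch assumes the gzip magic bytes (0x1f 0x8b) and the IDX header layout used by the MNIST files (two zero bytes, a type code, a dimension count, then big-endian 32-bit dimension sizes).

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header and return (type_code, shape)."""
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return type_code, shape


# Synthetic IDX payload (an assumption for illustration, not real MNIST data):
# two blank 28x28 images stored as unsigned bytes (type code 0x08).
raw = (
    struct.pack(">HBB", 0, 0x08, 3)
    + struct.pack(">III", 2, 28, 28)
    + bytes(2 * 28 * 28)
)
compressed = gzip.compress(raw)

# The same code path handles both the compressed and the plain payload.
for payload in (compressed, raw):
    type_code, shape = read_idx_header(maybe_gunzip(payload))
    print(hex(type_code), shape)
```

Checking the leading bytes rather than the file extension is one common way to make such detection robust to renamed files; whether tensorflow-io does the same is not stated here.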

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
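
As a convenience, the lookup in the table above can be sketched in a few lines of Python. This is only an illustration: the dictionary LATEST_TFIO_FOR_TF and the function matching_tfio_version are hypothetical names, and the mapping simply records the highest TensorFlow I/O version listed in the table for each TensorFlow series.

```python
# Highest TensorFlow I/O version listed in the compatibility table
# for each TensorFlow series (major.minor).
LATEST_TFIO_FOR_TF = {
    "2.5": "0.18.0",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",  # the table lists exactly 1.12.0 for this series
}


def matching_tfio_version(tf_version: str) -> str:
    """Map a TensorFlow version such as '2.4.1' to a tensorflow-io version."""
    series = ".".join(tf_version.split(".")[:2])
    try:
        return LATEST_TFIO_FOR_TF[series]
    except KeyError:
        raise ValueError(
            f"TensorFlow {tf_version} is not covered by the table"
        ) from None


# Print a pip command pinned to the matching release.
print("pip install tensorflow-io==" + matching_tfio_version("2.4.1"))
```

Versions outside the table raise a ValueError rather than guessing, since compatibility for newer releases is not documented here.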

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

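#!/usr/bin/env bash

# List the wheels that were built, then repair each one inside the manylinux2010 image.
ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Restore ownership of the files written by the container to the current user.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*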

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
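
To check which interpreters the repaired wheels target, the tags can be read straight from the file names. The sketch below assumes the standard wheel naming scheme, name-version-pythontag-abitag-platformtag.whl, without an optional build tag. The helper names wheel_tags and summarize, as well as the sample file names, are hypothetical.

```python
from pathlib import Path


def wheel_tags(filename: str):
    """Split a wheel filename into (python_tag, abi_tag, platform_tag)."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    # The last three dash-separated fields are the python, ABI and platform tags.
    python_tag, abi_tag, platform_tag = name[: -len(".whl")].split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def summarize(filenames):
    """Return the sorted set of python tags found among the given wheels."""
    return sorted({wheel_tags(name)[0] for name in filenames})


# Hypothetical file names for illustration. For a real build, use:
#   [p.name for p in Path("wheelhouse").glob("*.whl")]
sample = [
    "example_pkg-0.0.0-cp36-cp36m-manylinux2010_x86_64.whl",
    "example_pkg-0.0.0-cp37-cp37m-manylinux2010_x86_64.whl",
]
print(summarize(sample))
```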

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the command above is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test, and Kokoro for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✓ | ✓ | ✓ | N/A |
| 3.7 | ✓ | ✓ | ✓ | ✓ |
| 3.8 | ✓ | ✓ | ✓ | ✓ |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We do our best to test against those systems in continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the tests run. Others, such as those for Kinesis, PubSub, and Azure Storage, are run through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

| System | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✓ | | ✓ | |
| Apache Ignite | ✓ | | ✓ | |
| Prometheus | ✓ | | ✓ | |
| Google PubSub | | ✓ | ✓ | |
| Azure Storage | | ✓ | ✓ | |
| AWS Kinesis | | ✓ | ✓ | |
| Alibaba Cloud OSS | | | | ✓ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✓ | | ✓ | |
| MongoDB (experimental) | ✓ | | ✓ | |

References for emulators:

Community

Additional Information

License

Apache License 2.0
