TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
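To make the note above concrete, here is a minimal standard-library sketch of one way compressed input can be detected and an MNIST-style IDX header read. It is an illustration only, not tensorflow-io's actual implementation: the helper names `maybe_decompress` and `read_idx_header` are hypothetical, and the payload is synthetic rather than downloaded.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def maybe_decompress(data: bytes) -> bytes:
    """Return the gunzipped bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header: two zero bytes, a dtype code, a rank,
    then one big-endian uint32 per dimension."""
    zero, dtype_code, rank = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    dims = struct.unpack(">" + "I" * rank, data[4 : 4 + 4 * rank])
    return dtype_code, dims


# Synthetic "images" payload: 2 images of 28x28 uint8 pixels (dtype code 0x08).
raw = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 28, 28) + bytes(2 * 28 * 28)
compressed = gzip.compress(raw)

# The same header is recovered whether or not the payload was gzipped.
for payload in (raw, compressed):
    dtype_code, dims = read_idx_header(maybe_decompress(payload))
    print(hex(dtype_code), dims)
```

Checking the leading magic bytes rather than the file extension is a common design choice, because a URL or file name may not carry a reliable `.gz` suffix.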

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
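After installing, it can be useful to confirm which of the packages are present in the current environment. The following is a small sketch using only the standard library (Python 3.8+); the helper name `installed_version` is hypothetical and not part of tensorflow-io.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the state of the packages discussed in this section.
for name in ("tensorflow", "tensorflow-io", "tensorflow-io-nightly"):
    print(name, installed_version(name) or "not installed")
```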

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version | TensorFlow Compatibility | Release Date
0.17.0 | 2.4.x | Dec 14, 2020
0.16.0 | 2.3.x | Oct 23, 2020
0.15.0 | 2.3.x | Aug 03, 2020
0.14.0 | 2.2.x | Jul 08, 2020
0.13.0 | 2.2.x | May 10, 2020
0.12.0 | 2.1.x | Feb 28, 2020
0.11.0 | 2.1.x | Jan 10, 2020
0.10.0 | 2.0.x | Dec 05, 2019
0.9.1 | 2.0.x | Nov 15, 2019
0.9.0 | 2.0.x | Oct 18, 2019
0.8.1 | 1.15.x | Nov 15, 2019
0.8.0 | 1.15.x | Oct 17, 2019
0.7.2 | 1.14.x | Nov 15, 2019
0.7.1 | 1.14.x | Oct 18, 2019
0.7.0 | 1.14.x | Jul 14, 2019
0.6.0 | 1.13.x | May 29, 2019
0.5.0 | 1.13.x | Apr 12, 2019
0.4.0 | 1.13.x | Mar 01, 2019
0.3.0 | 1.12.0 | Feb 15, 2019
0.2.0 | 1.12.0 | Jan 29, 2019
0.1.0 | 1.12.0 | Dec 16, 2018
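The table can also be used programmatically. The sketch below copies its rows into a list and looks up the newest listed tensorflow-io release for a given TensorFlow version. The function name `latest_tfio_for` is hypothetical, and treating the exact `1.12.0` entries as the `1.12` series is a simplifying assumption.

```python
# Rows copied from the compatibility table, in the table's order:
# (tensorflow-io version, TensorFlow major.minor series).
COMPAT = [
    ("0.17.0", "2.4"),
    ("0.16.0", "2.3"),
    ("0.15.0", "2.3"),
    ("0.14.0", "2.2"),
    ("0.13.0", "2.2"),
    ("0.12.0", "2.1"),
    ("0.11.0", "2.1"),
    ("0.10.0", "2.0"),
    ("0.9.1", "2.0"),
    ("0.9.0", "2.0"),
    ("0.8.1", "1.15"),
    ("0.8.0", "1.15"),
    ("0.7.2", "1.14"),
    ("0.7.1", "1.14"),
    ("0.7.0", "1.14"),
    ("0.6.0", "1.13"),
    ("0.5.0", "1.13"),
    ("0.4.0", "1.13"),
    ("0.3.0", "1.12"),  # the table lists exactly 1.12.0 for these three rows
    ("0.2.0", "1.12"),
    ("0.1.0", "1.12"),
]


def latest_tfio_for(tf_version: str):
    """Return the first (newest) listed tensorflow-io version for a TensorFlow
    version such as "2.4.1", or None if the series is not in the table."""
    series = ".".join(tf_version.split(".")[:2])
    for tfio_version, tf_series in COMPAT:
        if tf_series == series:
            return tfio_version
    return None


print(latest_tfio_for("2.4.1"))
print(latest_tfio_for("1.15.2"))
```

The result could then be used to pin the install, for example `pip install tensorflow-io==<version>`.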

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following script will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
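To check which Python versions a set of built wheels covers, the compatibility tags can be read from the file names. This is a minimal sketch assuming standard wheel file names, whose last three dash-separated fields are the Python, ABI, and platform tags; the helper names `wheel_tags` and `python_tags` are hypothetical.

```python
def wheel_tags(filename: str):
    """Split a wheel file name into (python tag, abi tag, platform tag)."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: " + filename)
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the compatibility tags;
    # everything before them is the name, version, and optional build tag.
    _name_version, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    return python_tag, abi_tag, platform_tag


def python_tags(filenames):
    """Return the sorted set of Python tags found in a list of wheel file names."""
    return sorted({wheel_tags(name)[0] for name in filenames})


# File names taken from the Built Distributions list of this release.
wheels = [
    "tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-manylinux2010_x86_64.whl",
    "tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-manylinux2010_x86_64.whl",
    "tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-win_amd64.whl",
]
print(python_tags(wheels))
```

To inspect a real build, the list could be produced from the wheelhouse directory, for example with `os.listdir("wheelhouse")`.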

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test, and Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019
2.7 | ✔ | ✔ | ✔ | N/A
3.7 | ✔ | ✔ | ✔ | ✔
3.8 | ✔ | ✔ | ✔ | ✔

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We do our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Other tests, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

System | Live System | Emulator | CI Integration | Offline
Apache Kafka | ✔ | - | ✔ | -
Apache Ignite | ✔ | - | ✔ | -
Prometheus | ✔ | - | ✔ | -
Google PubSub | - | ✔ | ✔ | -
Azure Storage | - | ✔ | ✔ | -
AWS Kinesis | - | ✔ | ✔ | -
Alibaba Cloud OSS | - | - | - | ✔
Google BigTable/BigQuery | - | to be added | - | -
Elasticsearch (experimental) | ✔ | - | ✔ | -
MongoDB (experimental) | ✔ | - | ✔ | -

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-manylinux2010_x86_64.whl (24.6 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.12+ x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-macosx_10_14_x86_64.whl (21.3 MB view details)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp38-cp38-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp38-cp38-manylinux2010_x86_64.whl (24.6 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.12+ x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp38-cp38-macosx_10_14_x86_64.whl (21.3 MB view details)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-manylinux2010_x86_64.whl (24.6 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.12+ x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-macosx_10_14_x86_64.whl (21.3 MB view details)

Uploaded CPython 3.7m macOS 10.14+ x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-win_amd64.whl (21.3 MB view details)

Uploaded CPython 3.6m Windows x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-manylinux2010_x86_64.whl (24.6 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.12+ x86-64

tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-macosx_10_14_x86_64.whl (21.3 MB view details)

Uploaded CPython 3.6m macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 5df08dc74a1a398008b129c8c61afb06e2a1f3942605c073d983d48dabe835b2
MD5 36024e261dc51c70e8215686ecd9322d
BLAKE2b-256 03790f4551ebed103a6b9b79a717ffe074f4cb2dccea4fa52f8ade8e995a86c0

See more details on using hashes here.
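A downloaded wheel can be checked against the SHA256 digest listed for it. The following is a minimal sketch using only the standard library; the helper names `sha256_of` and `matches` are hypothetical, and the demo hashes a small temporary file rather than a real wheel.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demo on a temporary file; for a real check, pass the wheel's path and
# the SHA256 value listed for that file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

print(matches(demo_path, hashlib.sha256(b"hello").hexdigest()))
os.remove(demo_path)
```

Reading in chunks keeps memory use flat, which matters for wheels of 20 MB or more like the ones listed here.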

File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 092df10922e846709484de3a8b09110fb2f3d3fee61fcb3131c0752dbb9f3af6
MD5 0a907106f36bd851de00da66200a400c
BLAKE2b-256 f4f3ec22874b9ab0d23ff7120deaa9048baf3a9de9129fecbf181f38e81cc3fe


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 ab07885cc8d8c27ff0f32e68247af117629381f14fdaa396b0bdbb162fa13c25
MD5 14511e70f556962e7a30fe727c44ed8c
BLAKE2b-256 cb3deef3a60877baf2482cd3914f105533b04c8831baae2803da126241226a63


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp38-cp38-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 84171a612efb99a5cc22a0e901a2c0a7d4519c30c9ecb94f2db4a26b7b696c78
MD5 5786259e7ecccf66cb621235a740d0d2
BLAKE2b-256 fa625a47a16fbb829e138a83f97b55fe236404a015f497d3eb9a99f7eb6dc4d8


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp38-cp38-manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp38-cp38-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 1be39f3b864338e3943692637b46510c2edda0e4f00e88063313efeaf011c39e
MD5 93c72260e90bef24b0520365c906899d
BLAKE2b-256 b1cb0811760a294bab779a44a173a6ccc194e4722a687e8ae37d6cc2da3070b0


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 634c439633b3cd24964aaf7ab5ae5e81aabb0c10942fcd26f2a063cde0f50c18
MD5 db6ea31182a5e02d5949ebc4a756e23c
BLAKE2b-256 033f4a72b6543ee6c7e27d23c7599e16361a7e0de29096c7ba7c49cfc9a4affd


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 41e8537a85d4c625cd81cc116f66356073d3e9bea14b4a3932451569b1412ee9
MD5 1f0029c2c9d0407caeeeef03e549efb1
BLAKE2b-256 973d1a03da2ce5e2aa8a94a706581275e28fa1f97c5e486adb056669fbf9de46


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 db6399f86bf9cdba801885b21a3197b7623a00cd78e0d211d4ed07b1589a46a3
MD5 ea10a83905fcc8ff66d79c38019a2c03
BLAKE2b-256 a6991013949d57a9f6b6770c4a283698d6ef3e7dbbd8bb4addcfb0528a986f41


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 881e9da5a91c388056b743243c7147e360890f1b0521b1ea8f9bb458ad262f58
MD5 18f92600c8d0b3e20582e9a0e27767d0
BLAKE2b-256 17856c2e82cd137d49efa8bcde06f0f080a919ce55b395b4846b6c64eb5822e3


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-win_amd64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 7d5d3d4b9192675ce2ee48d434500ba8a86cb6be6b972351ab77180815bb6357
MD5 9507b86390077131f31cfe69b64771ae
BLAKE2b-256 7f1391767453e0632c5595baf54e6f20a2886cb5dc6bbf8697f9ca98f872f194


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 ae361307a6857401c89ac284edf27ac515d3e38f3c5f77eedff2db9d2d858a41
MD5 56cd10b295d9e8f08fb82785d180d00f
BLAKE2b-256 05496761d71088d40df23c43f75b7b608c9b915330d5395f9d85deeb141567f8


File details

Details for the file tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_nightly-0.18.0.dev20210413234936-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 a70a0bbba8f39318d73f5b411037a2ad59453b2bec9780aa3ca8e8f6952433d2
MD5 3aa0b57e83cc6a6f54854f102586092b
BLAKE2b-256 8845e050253a44b2394776efbcff9a6dcb97dee83c58196bf5b26612e94d1710

