TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, so there is no need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and decompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
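To make the note above more concrete, here is a minimal standard-library sketch of how a reader could tell a gzip-compressed MNIST file from an uncompressed one and then read the IDX header. It is an illustration only and is not taken from the tensorflow-io implementation; the helper names `maybe_decompress` and `parse_idx_header` are hypothetical, and the data is a tiny synthetic stand-in for a labels file.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream


def maybe_decompress(data: bytes) -> bytes:
    """Return the decompressed bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def parse_idx_header(data: bytes):
    """Parse an IDX header and return (type_code, shape)."""
    # Bytes 0-1 are zero, byte 2 is the element type, byte 3 is the number of dimensions.
    zero, type_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    shape = struct.unpack(">" + "I" * ndim, data[4:4 + 4 * ndim])
    return type_code, shape


# Synthetic stand-in for a labels file: three uint8 labels (type code 0x08, one dimension).
raw = struct.pack(">HBBI", 0, 0x08, 1, 3) + bytes([5, 0, 4])
compressed = gzip.compress(raw)

# Both the plain and the gzip-compressed input yield the same header.
for blob in (raw, compressed):
    payload = maybe_decompress(blob)
    print(parse_idx_header(payload))
```

Checking the magic bytes rather than the file extension means the same code path handles both `.gz` files and already-decompressed files.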

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
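As a convenience, the compatibility list above can be turned into a small lookup that suggests a pinned pip requirement for a given TensorFlow version. The sketch below is illustrative: the helper name `tfio_requirement` is hypothetical, and the mapping keeps only the highest tensorflow-io version listed above for each TensorFlow series.

```python
# Highest tensorflow-io version listed above for each TensorFlow series.
LATEST_TFIO_FOR_TF = {
    "2.6": "0.20.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def tfio_requirement(tf_version: str) -> str:
    """Return a pinned pip requirement string for the given TensorFlow version."""
    # Keep only the major.minor part, for example "2.5.1" -> "2.5".
    series = ".".join(tf_version.split(".")[:2])
    if series not in LATEST_TFIO_FOR_TF:
        raise ValueError(f"no tensorflow-io release listed for TensorFlow {tf_version}")
    return f"tensorflow-io=={LATEST_TFIO_FOR_TF[series]}"


print(tfio_requirement("2.5.1"))  # tensorflow-io==0.19.1
```

The returned string can be passed to pip directly, for example `pip install tensorflow-io==0.19.1`.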

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following script will automatically build a manylinux2010 compatible whl package:

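#!/usr/bin/env bash

# List the wheels that have been built.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files created by the container may be owned by root, so restore ownership.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*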

It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
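Because each whl package targets one Python version, it can be handy to check which interpreter a given file is built for before installing it. The sketch below reads the Python tag from a wheel file name, relying on the standard wheel naming scheme (name, version, optional build tag, Python tag, ABI tag, platform tag). The helper names and the example file name are hypothetical, and only CPython tags such as cp38 are considered.

```python
import sys


def wheel_python_tag(filename: str) -> str:
    """Return the Python tag (for example 'cp38') from a wheel file name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # name-version[-build]-python-abi-platform: the last three fields are the tags.
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel file name: {filename}")
    return parts[-3]


def matches_interpreter(filename: str, version_info=sys.version_info) -> bool:
    """Return True if the wheel's Python tag matches the given CPython version."""
    expected = f"cp{version_info[0]}{version_info[1]}"
    return wheel_python_tag(filename) == expected


# Illustrative file name; substitute a file from the wheelhouse directory.
name = "tensorflow_io-0.20.0-cp38-cp38-macosx_10_14_x86_64.whl"
print(wheel_python_tag(name))             # cp38
print(matches_interpreter(name, (3, 8)))  # True
print(matches_interpreter(name, (3, 9)))  # False
```

Calling `matches_interpreter(name)` without a version checks the wheel against the interpreter that is running the script.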

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test, and Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as those tested with live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
