TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. The example below follows Get Started with TensorFlow, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, so the datasets do not need to be downloaded and saved to a local directory first.

NOTE: Because tensorflow-io detects and decompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
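The idea of detecting compression automatically can be pictured with a small standard-library sketch. This is not how tensorflow-io implements it; the helper name maybe_gunzip is hypothetical, and the sketch assumes detection by the two-byte gzip magic number, returning the input unchanged otherwise.

```python
import gzip

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(data: bytes) -> bytes:
    """Return the decompressed bytes if `data` looks like gzip, else `data` unchanged.

    Hypothetical helper for illustration only; not part of tensorflow-io.
    """
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


if __name__ == "__main__":
    payload = b"example payload"
    # Compressed and uncompressed inputs both yield the original bytes.
    assert maybe_gunzip(gzip.compress(payload)) == payload
    assert maybe_gunzip(payload) == payload
```

With a check of this kind, a caller can hand over either the compressed or the uncompressed file without saying which one it is.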

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images are available to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version    TensorFlow Compatibility    Release Date
0.17.1                    2.4.x                       Apr 16, 2021
0.17.0                    2.4.x                       Dec 14, 2020
0.16.0                    2.3.x                       Oct 23, 2020
0.15.0                    2.3.x                       Aug 03, 2020
0.14.0                    2.2.x                       Jul 08, 2020
0.13.0                    2.2.x                       May 10, 2020
0.12.0                    2.1.x                       Feb 28, 2020
0.11.0                    2.1.x                       Jan 10, 2020
0.10.0                    2.0.x                       Dec 05, 2019
0.9.1                     2.0.x                       Nov 15, 2019
0.9.0                     2.0.x                       Oct 18, 2019
0.8.1                     1.15.x                      Nov 15, 2019
0.8.0                     1.15.x                      Oct 17, 2019
0.7.2                     1.14.x                      Nov 15, 2019
0.7.1                     1.14.x                      Oct 18, 2019
0.7.0                     1.14.x                      Jul 14, 2019
0.6.0                     1.13.x                      May 29, 2019
0.5.0                     1.13.x                      Apr 12, 2019
0.4.0                     1.13.x                      Mar 01, 2019
0.3.0                     1.12.0                      Feb 15, 2019
0.2.0                     1.12.0                      Jan 29, 2019
0.1.0                     1.12.0                      Dec 16, 2018
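The table can also be used programmatically. The sketch below copies it into a dictionary and looks up the TensorFlow I/O releases listed for a given TensorFlow version. The names COMPATIBLE_TFIO and compatible_tfio_versions are hypothetical, and the sketch makes one simplifying assumption: the rows listed for exactly 1.12.0 are treated as covering the whole 1.12 series.

```python
# Compatibility data copied from the table above:
# TensorFlow major.minor series -> TensorFlow I/O releases, newest first.
COMPATIBLE_TFIO = {
    "2.4": ["0.17.1", "0.17.0"],
    "2.3": ["0.16.0", "0.15.0"],
    "2.2": ["0.14.0", "0.13.0"],
    "2.1": ["0.12.0", "0.11.0"],
    "2.0": ["0.10.0", "0.9.1", "0.9.0"],
    "1.15": ["0.8.1", "0.8.0"],
    "1.14": ["0.7.2", "0.7.1", "0.7.0"],
    "1.13": ["0.6.0", "0.5.0", "0.4.0"],
    # Assumption: the table lists 1.12.0 exactly; treated here as the 1.12 series.
    "1.12": ["0.3.0", "0.2.0", "0.1.0"],
}


def compatible_tfio_versions(tf_version: str) -> list:
    """Return the TensorFlow I/O releases listed for `tf_version`, or [] if none."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return COMPATIBLE_TFIO.get(major_minor, [])


if __name__ == "__main__":
    print(compatible_tfio_versions("2.4.1"))
```

For TensorFlow 2.4.1 the lookup yields 0.17.1 and 0.17.0, so a matching install would be pip install tensorflow-io==0.17.1. Versions not present in the table yield an empty list rather than a guess.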

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following script will automatically build a manylinux2010-compatible whl package:

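#!/usr/bin/env bash

# List the wheels produced by the build.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Hand ownership of the files written by the container back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
# List the repaired wheels.
ls wheelhouse/*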
The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and only generates a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.
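Which Python version a generated wheel targets can be read from its file name. The sketch below splits a wheel file name into the dash-separated fields of the wheel naming convention (PEP 427). The names WheelName and parse_wheel_name are hypothetical, and only the plain five-field and six-field (with build tag) forms are handled.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated fields.

    Hypothetical helper for illustration; raises ValueError on unexpected input.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


if __name__ == "__main__":
    name = "tensorflow_io_nightly-0.18.0.dev20210502151436-cp37-cp37m-manylinux2010_x86_64.whl"
    wheel = parse_wheel_name(name)
    print(wheel.python_tag, wheel.abi_tag, wheel.platform_tag)
```

For the nightly wheel name used in the example, the Python tag is cp37 and the platform tag is manylinux2010_x86_64, which is a quick way to confirm that a build picked up the intended interpreter.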

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for the macOS build and test. Kokoro is used for the Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python    Ubuntu 18.04    Ubuntu 20.04    macOS + osx9    Windows-2019
2.7       ✓               ✓               ✓               N/A
3.7       ✓               ✓               ✓               ✓
3.8       ✓               ✓               ✓               ✓

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are run through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:
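Tests against a live system only make sense when that system is actually running. The sketch below shows one way such a precondition could be checked with the standard library. It is not the project's actual test setup: the names service_available and KafkaLiveTest are hypothetical, and localhost:9092 is assumed as the broker address (9092 is Kafka's default port).

```python
import socket
import unittest


def service_available(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Hypothetical live test: skipped unless something is listening on the assumed address.
@unittest.skipUnless(
    service_available("localhost", 9092), "no Kafka broker on localhost:9092"
)
class KafkaLiveTest(unittest.TestCase):
    def test_broker_reachable(self):
        self.assertTrue(service_available("localhost", 9092))


if __name__ == "__main__":
    unittest.main()
```

On a CI machine where the service was installed before the test run, the class runs; elsewhere it is reported as skipped instead of failing.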

References for emulators:

Community

Additional Information

License

Apache License 2.0


Built Distributions

tensorflow_io_nightly-0.18.0.dev20210502151436-cp39-cp39-win_amd64.whl (20.6 MB), CPython 3.9, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp39-cp39-manylinux2010_x86_64.whl (24.0 MB), CPython 3.9, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp39-cp39-macosx_10_14_x86_64.whl (21.0 MB), CPython 3.9, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp38-cp38-win_amd64.whl (20.6 MB), CPython 3.8, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp38-cp38-manylinux2010_x86_64.whl (24.0 MB), CPython 3.8, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp38-cp38-macosx_10_14_x86_64.whl (21.0 MB), CPython 3.8, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp37-cp37m-win_amd64.whl (20.6 MB), CPython 3.7m, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp37-cp37m-manylinux2010_x86_64.whl (24.0 MB), CPython 3.7m, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp37-cp37m-macosx_10_14_x86_64.whl (21.0 MB), CPython 3.7m, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp36-cp36m-win_amd64.whl (20.6 MB), CPython 3.6m, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp36-cp36m-manylinux2010_x86_64.whl (24.0 MB), CPython 3.6m, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210502151436-cp36-cp36m-macosx_10_14_x86_64.whl (21.0 MB), CPython 3.6m, macOS 10.14+ x86-64

