TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
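To make the automatic decompression concrete, here is a minimal standard-library sketch of the general idea: sniff the gzip magic bytes, decompress if they are present, then parse the IDX header used by the MNIST files. It illustrates the technique only and is not tensorflow-io's actual implementation; the helper names `maybe_gunzip` and `parse_idx_ubyte` and the sample buffer are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_gunzip(raw: bytes) -> bytes:
    """Decompress raw if it starts with the gzip magic bytes, else return it unchanged."""
    return gzip.decompress(raw) if raw[:2] == GZIP_MAGIC else raw


def parse_idx_ubyte(raw: bytes):
    """Parse an (optionally gzipped) IDX buffer of unsigned bytes.

    Returns (shape, values): shape is a tuple of ints, values is the flat payload.
    """
    data = maybe_gunzip(raw)
    # IDX header: two zero bytes, a dtype code (0x08 = unsigned byte), then the rank.
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    # Each dimension is stored as a big-endian 32-bit unsigned integer.
    shape = struct.unpack(">%dI" % ndim, data[4:4 + 4 * ndim])
    values = data[4 + 4 * ndim:]
    expected = 1
    for dim in shape:
        expected *= dim
    if len(values) != expected:
        raise ValueError("payload size does not match the header shape")
    return shape, values


# Hypothetical 2x3 buffer: parsing gives the same result plain or gzip-compressed.
plain = struct.pack(">HBBII", 0, 0x08, 2, 2, 3) + bytes(range(6))
assert parse_idx_ubyte(plain) == parse_idx_ubyte(gzip.compress(plain))
```

Checking the magic bytes rather than the `.gz` suffix means the same code path handles both compressed and uncompressed inputs, whatever the file is called.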

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
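After installing, it can be handy to confirm which of the two distributions is present. The following is a small sketch that assumes Python 3.8 or later (for `importlib.metadata`); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(dist_name: str):
    """Return the installed version string of dist_name, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report on the stable and nightly distributions named above.
for name in ("tensorflow-io", "tensorflow-io-nightly"):
    print(name, "->", installed_version(name) or "not installed")
```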

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
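The table above can be turned into a small lookup for picking a matching release. The sketch below keeps only the highest TensorFlow I/O version listed for each TensorFlow series; the helper name `tfio_requirement` is hypothetical, and treating 1.12.0 as a "1.12" series is a simplifying assumption.

```python
# Highest TensorFlow I/O release listed for each TensorFlow series in the table above.
LATEST_TFIO_FOR_TF = {
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",  # the table lists exactly 1.12.0 for this row
}


def tfio_requirement(tf_version: str) -> str:
    """Map a TensorFlow version such as '2.3.1' to a pip requirement string."""
    series = ".".join(tf_version.split(".")[:2])
    if series not in LATEST_TFIO_FOR_TF:
        raise LookupError("TensorFlow %s is not covered by the table" % tf_version)
    return "tensorflow-io==" + LATEST_TFIO_FOR_TF[series]


print(tfio_requirement("2.3.1"))  # tensorflow-io==0.16.0
```

The returned string can be passed straight to `pip install`.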

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
for f in dist/*.whl; do
  # Quote paths so that directories or wheel names containing spaces survive word splitting.
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.
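To check which Python versions the resulting wheels target, the filenames can be split according to the wheel naming convention from PEP 427 (`{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`). This is a minimal sketch; the helper names `wheel_tags` and `python_tags` and the sample filename are hypothetical.

```python
from pathlib import Path


def wheel_tags(filename: str) -> dict:
    """Split a wheel filename into its components (PEP 427 naming convention)."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError("not a wheel: " + filename)
    parts = name[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 parts when the optional build tag is present
        raise ValueError("unexpected wheel name: " + filename)
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def python_tags(wheelhouse: str = "wheelhouse") -> list:
    """List the distinct Python tags (for example 'cp37') of the wheels in a directory."""
    return sorted({wheel_tags(p.name)["python"] for p in Path(wheelhouse).glob("*.whl")})


# Hypothetical filename, used only to show the shape of the result.
print(wheel_tags("tensorflow_io-0.17.1-cp37-cp37m-manylinux2010_x86_64.whl"))
```

Calling `python_tags()` from the repository root after the build should then list one tag per Python version that was built.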

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are done on a variety of systems with different python3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We tried our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are done through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
