TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

tensorflow-io works directly with Keras. The example below follows the Get Started with TensorFlow tutorial, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, so there is no need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io detects and decompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
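
To illustrate what automatic detection of compressed input can look like, here is a minimal standard-library sketch. It is not tensorflow-io's actual implementation: it only checks for the two-byte gzip magic number and then reads the header of an IDX file (the format the MNIST files use). The helper names maybe_decompress and read_idx_header are hypothetical.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(data: bytes) -> bytes:
    """Return the decompressed payload if data is gzip-compressed, else data."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_header(data: bytes):
    """Parse an IDX header and return (dtype_code, shape).

    IDX layout: two zero bytes, one data-type byte (0x08 = unsigned byte),
    one byte holding the number of dimensions, then one big-endian uint32
    per dimension.
    """
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    shape = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, shape


# Build a tiny in-memory IDX file holding two 28x28 images of zeros.
raw = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 28, 28)
raw += bytes(2 * 28 * 28)

# The same parsing code handles both the plain and the gzip-compressed file.
plain_header = read_idx_header(maybe_decompress(raw))
gzip_header = read_idx_header(maybe_decompress(gzip.compress(raw)))
```

Because the check looks only at the leading bytes, the caller does not need to know in advance whether a file is compressed.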

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, Docker images are available for getting started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
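
The compatibility table can also be consulted programmatically. The following is a minimal sketch, assuming a hand-copied subset of the rows above (the latest listed release for each TensorFlow 2.x series). The LATEST_TFIO_FOR_TF mapping and both helper functions are hypothetical and are not part of tensorflow-io.

```python
# Latest TensorFlow I/O release listed in the table for each TensorFlow 2.x series.
LATEST_TFIO_FOR_TF = {
    "2.7": "0.23.1",
    "2.6": "0.21.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
}


def recommended_tfio(tf_version: str) -> str:
    """Map a TensorFlow version such as '2.6.2' to a tensorflow-io version."""
    major, minor = tf_version.split(".")[:2]
    series = f"{major}.{minor}"
    try:
        return LATEST_TFIO_FOR_TF[series]
    except KeyError:
        raise ValueError(f"TensorFlow {series}.x is not in the table") from None


def pip_requirement(tf_version: str) -> str:
    """Build a pinned requirement string that can be passed to pip install."""
    return f"tensorflow-io=={recommended_tfio(tf_version)}"
```

For example, pip_requirement("2.6.2") returns "tensorflow-io==0.21.0", which can then be passed to pip install.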

Performance Benchmarking

We use GitHub Pages to publish the results of API performance benchmarks. The benchmark job runs on every commit to the master branch, which makes it possible to track performance from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI

Because of the manylinux2010 requirement, TensorFlow I/O is built on Linux with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3). Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If Docker is installed on the system, the following script automatically builds a manylinux2010-compatible whl package:

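#!/usr/bin/env bash

# List the wheels that will be repaired.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Give ownership of the working tree back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*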

The build takes some time, but once it completes, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.
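
For readers who prefer to drive this step from Python, the loop in the script can be sketched as follows. This is only an illustration that assembles the same docker arguments as the script above. The function names repair_command and repair_commands are hypothetical, and nothing is executed until a command is passed to subprocess.

```python
import os
from pathlib import Path

IMAGE = "quay.io/pypa/manylinux2010_x86_64"
PLATFORM = "manylinux2010_x86_64"


def repair_command(wheel: str, workdir: str) -> list:
    """Return the docker argv that repairs one wheel, mirroring the shell script."""
    return [
        "docker", "run", "-i", "--rm",
        "-v", f"{workdir}:/v", "-w", "/v", "--net=host",
        IMAGE,
        "bash", "-x", "-e", "/v/tools/build/auditwheel",
        "repair", "--plat", PLATFORM, wheel,
    ]


def repair_commands(dist_dir: str = "dist", workdir: str = None) -> list:
    """Return one docker argv per wheel found in dist_dir, in sorted order."""
    workdir = workdir or os.getcwd()
    wheels = sorted(Path(dist_dir).glob("*.whl"))
    return [repair_command(wheel.as_posix(), workdir) for wheel in wheels]
```

Each returned list can be handed to subprocess.run(argv, check=True). Building the argument list without a shell avoids quoting problems with paths that contain spaces.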

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of that python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
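
To check which interpreter a generated wheel targets, the Python tag in the wheel filename can be compared with the running interpreter. The sketch below assumes the standard wheel naming scheme (name-version-pythontag-abitag-platformtag.whl) and CPython. The helper names are hypothetical, and the example filename is only illustrative.

```python
import sys


def cpython_tag(version_info=sys.version_info) -> str:
    """Return the CPython wheel tag (for example 'cp38') for an interpreter."""
    return f"cp{version_info[0]}{version_info[1]}"


def wheel_matches_interpreter(wheel_name: str, version_info=sys.version_info) -> bool:
    """Check whether a wheel filename carries the interpreter's Python tag.

    The Python tag is the third field from the end of the filename, which
    stays true even when an optional build tag is present.
    """
    fields = wheel_name[: -len(".whl")].split("-")
    return fields[-3] == cpython_tag(version_info)


example = "tensorflow_io_nightly-0.24.0.dev20220103232431-cp38-cp38-macosx_10_14_x86_64.whl"
matches_38 = wheel_matches_interpreter(example, (3, 8))
matches_39 = wheel_matches_interpreter(example, (3, 9))
```

Calling wheel_matches_interpreter(name) without a version argument checks against the python that is currently running, which is the interpreter the script would pick up from the shell.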

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✓ | ✓ | ✓ | N/A |
| 3.7 | ✓ | ✓ | ✓ | ✓ |
| 3.8 | ✓ | ✓ | ✓ | ✓ |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0
