TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing part replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
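
To make the note above concrete, here is a minimal sketch of how gzip detection and IDX header parsing can work in general. It is not tensorflow-io's implementation; the helper names maybe_decompress and read_idx_images are hypothetical, and the input is a tiny synthetic file rather than the real MNIST data.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream
IDX3_IMAGE_MAGIC = 0x00000803  # magic number of an IDX file holding uint8 images


def maybe_decompress(data: bytes) -> bytes:
    """Return the decompressed bytes if data looks like gzip, else data unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_images(data: bytes):
    """Parse an IDX3 image file (optionally gzipped) into (shape, pixel bytes)."""
    raw = maybe_decompress(data)
    # Header: magic, image count, rows, cols, each a big-endian uint32.
    magic, count, rows, cols = struct.unpack(">IIII", raw[:16])
    if magic != IDX3_IMAGE_MAGIC:
        raise ValueError(f"not an IDX3 image file: magic={magic:#x}")
    pixels = raw[16:16 + count * rows * cols]
    return (count, rows, cols), pixels


# Synthetic stand-in for MNIST: 2 images of 2x2 pixels.
fake_idx = struct.pack(">IIII", IDX3_IMAGE_MAGIC, 2, 2, 2) + bytes(range(8))
shape_plain, pixels_plain = read_idx_images(fake_idx)
shape_gz, pixels_gz = read_idx_images(gzip.compress(fake_idx))
print(shape_plain, shape_gz, pixels_plain == pixels_gz)
```

The same reader accepts both the plain and the compressed bytes, which is the behavior the note describes from the caller's point of view.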

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.23.1 | 2.7.x | Dec 15, 2021 |
| 0.23.0 | 2.7.x | Dec 14, 2021 |
| 0.22.0 | 2.7.x | Nov 10, 2021 |
| 0.21.0 | 2.6.x | Sep 12, 2021 |
| 0.20.0 | 2.6.x | Aug 11, 2021 |
| 0.19.1 | 2.5.x | Jul 25, 2021 |
| 0.19.0 | 2.5.x | Jun 25, 2021 |
| 0.18.0 | 2.5.x | May 13, 2021 |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
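
As an illustration of how the table above might be used in a script, the sketch below checks a TensorFlow I/O version against a TensorFlow version. The COMPATIBILITY dictionary copies only a few rows from the table, and the is_compatible helper is hypothetical; it is not part of tensorflow-io.

```python
# A few rows copied from the table above: tensorflow-io release -> TensorFlow series.
COMPATIBILITY = {
    "0.23.1": "2.7",
    "0.23.0": "2.7",
    "0.22.0": "2.7",
    "0.21.0": "2.6",
    "0.20.0": "2.6",
    "0.19.1": "2.5",
}


def is_compatible(tfio_version: str, tf_version: str) -> bool:
    """Return True if tf_version falls in the series listed for tfio_version."""
    series = COMPATIBILITY.get(tfio_version)
    if series is None:
        raise KeyError(f"tensorflow-io {tfio_version} is not in the local table")
    # Compare only major.minor, since the table lists series such as 2.7.x.
    return ".".join(tf_version.split(".")[:2]) == series


print(is_compatible("0.23.1", "2.7.0"))
print(is_compatible("0.21.0", "2.7.0"))
```

In a real environment, the two version strings could be read with importlib.metadata.version("tensorflow-io") and importlib.metadata.version("tensorflow") from the Python standard library (3.8 or later).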

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance with respect to individual commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

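#!/usr/bin/env bash

# List the wheels produced by the build.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written from inside the container may be owned by root; hand them back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*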

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
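
One way to confirm which Python versions the repaired wheels cover is to read the tags out of their filenames, which follow the standard wheel naming convention (name-version[-build]-python-abi-platform.whl). The sketch below is only an illustration: parse_wheel_name is a hypothetical helper, and the filenames are made-up examples rather than actual build output.

```python
def parse_wheel_name(filename: str) -> dict:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # Five fields, or six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


# Hypothetical contents of the wheelhouse directory after a build.
wheels = [
    "tensorflow_io-0.23.1-cp35-cp35m-manylinux2010_x86_64.whl",
    "tensorflow_io-0.23.1-cp36-cp36m-manylinux2010_x86_64.whl",
    "tensorflow_io-0.23.1-cp37-cp37m-manylinux2010_x86_64.whl",
]
built = {parse_wheel_name(w)["python"] for w in wheels}
missing = {"cp35", "cp36", "cp37"} - built
print(sorted(built), sorted(missing))
```

To run this against a real build, the list could be replaced with the names returned by pathlib.Path("wheelhouse").glob("*.whl").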

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different python3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | ✔ | ✔ | ✔ | N/A |
| 3.7 | ✔ | ✔ | ✔ | ✔ |
| 3.8 | ✔ | ✔ | ✔ | ✔ |

TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS, etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are run against official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | ✔ | | ✔ | |
| Apache Ignite | ✔ | | ✔ | |
| Prometheus | ✔ | | ✔ | |
| Google PubSub | | ✔ | ✔ | |
| Azure Storage | | ✔ | ✔ | |
| AWS Kinesis | | ✔ | ✔ | |
| Alibaba Cloud OSS | | | | ✔ |
| Google BigTable/BigQuery | | to be added | | |
| Elasticsearch (experimental) | ✔ | | ✔ | |
| MongoDB (experimental) | ✔ | | ✔ | |
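
Tests that depend on a live system or an emulator are often gated on whether the service is actually reachable. The sketch below shows one generic way to do that with a plain TCP probe. It is an assumption about how such a gate could look, not the project's actual test setup, and service_is_reachable is a hypothetical helper. A throwaway local listener stands in for a real service such as Kafka.

```python
import socket


def service_is_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Throwaway local listener standing in for a live system or emulator.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

reachable_while_up = service_is_reachable("127.0.0.1", port)
listener.close()
reachable_after_close = service_is_reachable("127.0.0.1", port)
print(reachable_while_up, reachable_after_close)
```

A test could then be skipped when the probe fails, for example with unittest.skipUnless(service_is_reachable(host, port), "service not available").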

References for emulators:

Community

Additional Information

License

Apache License 2.0
