TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. The project documentation provides a full list of the file systems and file formats that TensorFlow I/O supports.

Using tensorflow-io with Keras is straightforward. The example below follows the Get Started with TensorFlow tutorial, with the data processing replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which removes the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
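As a rough illustration of what "detect and uncompress automatically" can mean, the sketch below checks for the gzip magic number before parsing the IDX header used by the MNIST files. It uses only the Python standard library and is not tensorflow-io's actual implementation; the helper names `maybe_decompress` and `read_idx_header` are hypothetical, and the input is a small fake file built in memory rather than the real dataset.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(raw: bytes) -> bytes:
    """Return raw unchanged unless it starts with the gzip magic number."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def read_idx_header(raw: bytes):
    """Parse an IDX header: two zero bytes, a dtype code, and the dimensions."""
    data = maybe_decompress(raw)
    # Byte 2 is the dtype code (0x08 means uint8), byte 3 the number of dims.
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    # Each dimension size is a 4-byte big-endian unsigned integer.
    dims = struct.unpack(">" + "I" * ndim, data[4 : 4 + 4 * ndim])
    return dtype_code, dims


# Build a tiny fake "images" file in memory: 2 images of 28x28 uint8 zeros.
header = struct.pack(">HBB", 0, 0x08, 3) + struct.pack(">III", 2, 28, 28)
payload = header + bytes(2 * 28 * 28)
compressed = gzip.compress(payload)

# The same call handles both the plain and the gzip-compressed bytes.
header_plain = read_idx_header(payload)
header_gzip = read_idx_header(compressed)
print(header_plain, header_gzip)
```

Because the check looks at the content rather than the file extension, the caller does not need to say whether the input is compressed.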

Please check the official documentation for more detailed and interesting uses of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images are available for getting started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.1 | 2.4.x | Apr 16, 2021 |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
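The compatibility table can also be turned into a small lookup. The sketch below is only an illustration: the dictionary copies the newest TensorFlow I/O release listed for each TensorFlow series, and the helper name `suggest_tfio_version` is hypothetical, not part of the package.

```python
# Newest tensorflow-io release listed in the compatibility table for each
# TensorFlow series. The table lists 1.12.0 exactly rather than 1.12.x, so
# the "1.12" entry is a simplification.
LATEST_TFIO_FOR_TF = {
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def suggest_tfio_version(tf_version: str):
    """Return the newest listed tensorflow-io version for tf_version, or None."""
    # Keep only "major.minor", so "2.4.1" is looked up as "2.4".
    series = ".".join(tf_version.split(".")[:2])
    return LATEST_TFIO_FOR_TF.get(series)


suggested = suggest_tfio_version("2.4.1")
print(f"pip install tensorflow-io=={suggested}")
```

A TensorFlow version that is not in the table returns None, which signals that the list of releases should be checked instead of guessing.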

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI

Build status is tracked for the following configurations:

Linux CPU Python 2
Linux CPU Python 3
Linux GPU Python 2
Linux GPU Python 3

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Setting up Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If Docker is installed on the system, the following command automatically builds a manylinux2010-compatible whl package:

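#!/usr/bin/env bash

# List the wheels found in dist/.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Restore ownership of files created by the container.
sudo chown -R "$(id -nu):$(id -ng)" .
ls wheelhouse/*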
It takes some time to build, but once complete, whl packages compatible with Python 3.5, 3.6, and 3.7 will be available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note the above command is also the command we use when releasing packages for Linux and macOS.
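To check which Python version a built whl package targets, the tags in its filename can be inspected. The sketch below assumes the standard wheel filename layout (name, version, Python tag, ABI tag, platform tag) with no optional build tag, and assumes a CPython interpreter; the helper names are hypothetical.

```python
import sys


def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel filename into name, version, python, abi, and platform."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        # A sixth component would be the optional build tag, not handled here.
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    keys = ("name", "version", "python", "abi", "platform")
    return dict(zip(keys, parts))


def interpreter_tag() -> str:
    """Return the CPython tag of the running interpreter, e.g. 'cp38'."""
    return f"cp{sys.version_info.major}{sys.version_info.minor}"


def matches_current_python(filename: str) -> bool:
    """True when the wheel's Python tag equals the running interpreter's tag."""
    return parse_wheel_filename(filename)["python"] == interpreter_tag()


wheel = "tensorflow_io_nightly-0.18.0.dev20210417081402-cp39-cp39-manylinux2010_x86_64.whl"
info = parse_wheel_filename(wheel)
print(info["python"], info["platform"], matches_current_python(wheel))
```

A mismatch between the wheel's Python tag and the interpreter tag is a quick hint that the wrong python was active in the shell when the package was built.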

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O integrates with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against these systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release.

Built Distributions

tensorflow_io_nightly-0.18.0.dev20210417081402-cp39-cp39-win_amd64.whl (21.3 MB): CPython 3.9, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp39-cp39-manylinux2010_x86_64.whl (24.6 MB): CPython 3.9, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp39-cp39-macosx_10_14_x86_64.whl (21.3 MB): CPython 3.9, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp38-cp38-win_amd64.whl (21.3 MB): CPython 3.8, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp38-cp38-manylinux2010_x86_64.whl (24.6 MB): CPython 3.8, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp38-cp38-macosx_10_14_x86_64.whl (21.3 MB): CPython 3.8, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp37-cp37m-win_amd64.whl (21.3 MB): CPython 3.7m, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp37-cp37m-manylinux2010_x86_64.whl (24.6 MB): CPython 3.7m, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp37-cp37m-macosx_10_14_x86_64.whl (21.3 MB): CPython 3.7m, macOS 10.14+ x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp36-cp36m-win_amd64.whl (21.3 MB): CPython 3.6m, Windows x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp36-cp36m-manylinux2010_x86_64.whl (24.6 MB): CPython 3.6m, manylinux: glibc 2.12+ x86-64
tensorflow_io_nightly-0.18.0.dev20210417081402-cp36-cp36m-macosx_10_14_x86_64.whl (21.3 MB): CPython 3.6m, macOS 10.14+ x86-64
