
TensorFlow I/O

Project description





TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
d_train = tfio.IODataset.from_mnist(
    'http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz',
    'http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz')

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data, just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(512, activation=tf.nn.relu),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation=tf.nn.softmax)
])

# Compile the model.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io provides built-in support for the HTTP file system, eliminating the need to download the datasets and save them to a local directory first.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
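As a rough illustration of the kind of auto-detection the note describes (this is not tfio's actual implementation), a loader can sniff the two-byte gzip magic number before deciding whether to decompress:

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # the first two bytes of every gzip stream

def read_maybe_gzipped(raw: bytes) -> bytes:
    """Return the decompressed payload if `raw` is gzip data, else `raw` unchanged."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw

# Both the compressed and the uncompressed form yield the same payload.
payload = b"\x00" * 16  # stand-in for MNIST idx-format bytes
assert read_maybe_gzipped(gzip.compress(payload)) == payload
assert read_maybe_gzipped(payload) == payload
```

Because the decision is made from the file contents rather than the file extension, the same code path handles both `.gz` and raw files.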

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

Docker Images

In addition to the pip packages, Docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

| TensorFlow I/O Version | TensorFlow Compatibility | Release Date |
| --- | --- | --- |
| 0.17.0 | 2.4.x | Dec 14, 2020 |
| 0.16.0 | 2.3.x | Oct 23, 2020 |
| 0.15.0 | 2.3.x | Aug 03, 2020 |
| 0.14.0 | 2.2.x | Jul 08, 2020 |
| 0.13.0 | 2.2.x | May 10, 2020 |
| 0.12.0 | 2.1.x | Feb 28, 2020 |
| 0.11.0 | 2.1.x | Jan 10, 2020 |
| 0.10.0 | 2.0.x | Dec 05, 2019 |
| 0.9.1 | 2.0.x | Nov 15, 2019 |
| 0.9.0 | 2.0.x | Oct 18, 2019 |
| 0.8.1 | 1.15.x | Nov 15, 2019 |
| 0.8.0 | 1.15.x | Oct 17, 2019 |
| 0.7.2 | 1.14.x | Nov 15, 2019 |
| 0.7.1 | 1.14.x | Oct 18, 2019 |
| 0.7.0 | 1.14.x | Jul 14, 2019 |
| 0.6.0 | 1.13.x | May 29, 2019 |
| 0.5.0 | 1.13.x | Apr 12, 2019 |
| 0.4.0 | 1.13.x | Mar 01, 2019 |
| 0.3.0 | 1.12.0 | Feb 15, 2019 |
| 0.2.0 | 1.12.0 | Jan 29, 2019 |
| 0.1.0 | 1.12.0 | Dec 16, 2018 |
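The recommendation above can be encoded in a few lines. The sketch below is illustrative (the helper name is made up; the mapping simply transcribes the newest release per TensorFlow series from the table):

```python
# Newest tensorflow-io release per TensorFlow series, taken from the table above.
TFIO_FOR_TF = {
    "2.4": "0.17.0",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}

def matching_tfio_version(tf_version: str) -> str:
    """Return the recommended tensorflow-io version for a TensorFlow version string."""
    series = ".".join(tf_version.split(".")[:2])  # e.g. "2.4.1" -> "2.4"
    try:
        return TFIO_FOR_TF[series]
    except KeyError:
        raise ValueError(f"No known tensorflow-io release for TensorFlow {tf_version}")

print(matching_tfio_version("2.4.1"))  # → 0.17.0
```

One could then `pip install tensorflow-io==<version>` with the returned value to stay inside the supported matrix.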

Contributing

TensorFlow I/O is a community-led open source project. As such, it depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward, but if the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

#!/usr/bin/env bash

ls dist/*
# Repair each built wheel inside the manylinux2010 image so it is
# retagged manylinux2010_x86_64; repaired wheels land in wheelhouse/.
for f in dist/*.whl; do
  docker run -i --rm -v $PWD:/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 $f
done
# The container runs as root, so restore file ownership afterwards.
sudo chown -R $(id -nu):$(id -ng) .
ls wheelhouse/*

It takes some time to build, but once complete, Python 3.5, 3.6, and 3.7 compatible whl packages will be available in the wheelhouse directory.
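Each wheel filename in wheelhouse encodes its compatibility as PEP 425 tags (python tag, ABI tag, platform tag). As a hypothetical sketch (not a tool shipped with the project; it handles only the common five-component filename form without build tags), the tags can be pulled apart like this:

```python
from typing import NamedTuple

class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str

def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel filename into its PEP 425 compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"Not a wheel filename: {filename}")
    stem = filename[: -len(".whl")]
    distribution, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return WheelTags(distribution, version, python_tag, abi_tag, platform_tag)

tags = parse_wheel_name(
    "tensorflow_io_nightly-0.17.0.dev20210130040526-cp38-cp38-win_amd64.whl"
)
print(tags.python_tag, tags.platform_tag)  # → cp38 win_amd64
```

pip uses exactly these tags to decide whether a wheel is installable on the current interpreter and platform.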

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package matching that python's version. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell first. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test; Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, on Linux whl packages are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python 3 versions to ensure good coverage:

| Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019 |
| --- | --- | --- | --- | --- |
| 2.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | N/A |
| 3.7 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| 3.8 | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, are done through official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered only by offline tests may not have the same level of coverage as live systems or emulators.

| | Live System | Emulator | CI Integration | Offline |
| --- | --- | --- | --- | --- |
| Apache Kafka | :heavy_check_mark: | | :heavy_check_mark: | |
| Apache Ignite | :heavy_check_mark: | | :heavy_check_mark: | |
| Prometheus | :heavy_check_mark: | | :heavy_check_mark: | |
| Google PubSub | | :heavy_check_mark: | :heavy_check_mark: | |
| Azure Storage | | :heavy_check_mark: | :heavy_check_mark: | |
| AWS Kinesis | | :heavy_check_mark: | :heavy_check_mark: | |
| Alibaba Cloud OSS | | | | :heavy_check_mark: |
| Google BigTable/BigQuery | | | to be added | |
| Elasticsearch (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |
| MongoDB (experimental) | :heavy_check_mark: | | :heavy_check_mark: | |

References for emulators:

Community

Additional Information

License

Apache License 2.0


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files are available for this release. See the tutorial on generating distribution archives.

Built Distributions

| File | Size | Python | Platform |
| --- | --- | --- | --- |
| tensorflow_io_nightly-0.17.0.dev20210130040526-cp38-cp38-win_amd64.whl | 21.1 MB | CPython 3.8 | Windows x86-64 |
| tensorflow_io_nightly-0.17.0.dev20210130040526-cp38-cp38-manylinux2010_x86_64.whl | 25.5 MB | CPython 3.8 | manylinux: glibc 2.12+ x86-64 |
| tensorflow_io_nightly-0.17.0.dev20210130040526-cp38-cp38-macosx_10_13_x86_64.whl | 21.5 MB | CPython 3.8 | macOS 10.13+ x86-64 |
| tensorflow_io_nightly-0.17.0.dev20210130040526-cp37-cp37m-win_amd64.whl | 21.1 MB | CPython 3.7m | Windows x86-64 |
| tensorflow_io_nightly-0.17.0.dev20210130040526-cp37-cp37m-manylinux2010_x86_64.whl | 25.5 MB | CPython 3.7m | manylinux: glibc 2.12+ x86-64 |
| tensorflow_io_nightly-0.17.0.dev20210130040526-cp37-cp37m-macosx_10_13_x86_64.whl | 21.5 MB | CPython 3.7m | macOS 10.13+ x86-64 |
| tensorflow_io_nightly-0.17.0.dev20210130040526-cp36-cp36m-win_amd64.whl | 21.1 MB | CPython 3.6m | Windows x86-64 |
| tensorflow_io_nightly-0.17.0.dev20210130040526-cp36-cp36m-manylinux2010_x86_64.whl | 25.5 MB | CPython 3.6m | manylinux: glibc 2.12+ x86-64 |
| tensorflow_io_nightly-0.17.0.dev20210130040526-cp36-cp36m-macosx_10_13_x86_64.whl | 21.5 MB | CPython 3.6m | macOS 10.13+ x86-64 |

File details

Hashes for each built distribution:

tensorflow_io_nightly-0.17.0.dev20210130040526-cp38-cp38-win_amd64.whl
- SHA256: 0fb73697d1a87edce56c13ff9591f4f4abf4dfb1b3b2a45ca10a8323424c7b9d
- MD5: 5ba9b9e6c0d2244738adbe05e6e0ce44
- BLAKE2b-256: 28279ab476402001b5bb0f6f1c46d1c93a98f8b46ed61ec22899752a2b109e8a

tensorflow_io_nightly-0.17.0.dev20210130040526-cp38-cp38-manylinux2010_x86_64.whl
- SHA256: 4529de65d45f75ae09383125ef5c8d75c39220ccf9c23e844dc39c6e5c9146f2
- MD5: 19ee00019bff40ea8ba62e18c576a625
- BLAKE2b-256: 95329f19bf244ccb8cede135bafb0ed5271306bd4a945a77198ccacd06a18ca0

tensorflow_io_nightly-0.17.0.dev20210130040526-cp38-cp38-macosx_10_13_x86_64.whl
- SHA256: 44339bccd60a8e81ae3c5d07acf4ebca128508130688d98e08d4ea9330c410c9
- MD5: 6564c39feed3cb1cf252a9029a7f7596
- BLAKE2b-256: a1aa41cc573e952fa25eb347cd2a0134b48b7607ad2c9dbc01139b0c4b870935

tensorflow_io_nightly-0.17.0.dev20210130040526-cp37-cp37m-win_amd64.whl
- SHA256: bd7a33ccb0562ab8feef56e1d57d2da67165c9155690169fa3a2361d30edde6d
- MD5: eaa9d874a07b00a59de0e1d6857c2705
- BLAKE2b-256: f363c349f140f8a4fbf6ac1da01fc66fba5d36cfe48279f0cd7be0ee15b20274

tensorflow_io_nightly-0.17.0.dev20210130040526-cp37-cp37m-manylinux2010_x86_64.whl
- SHA256: 033cbd299d3577d1c8edcb2f069448f0a290f28399fbb21cb44c956797aa0ea2
- MD5: 172e6b03ec86efb9cf879ae66da3ad19
- BLAKE2b-256: 54706bb674ab4c0eaccc74f47c8dd400bd53115aae5a0e11f71d6168ca99cc95

tensorflow_io_nightly-0.17.0.dev20210130040526-cp37-cp37m-macosx_10_13_x86_64.whl
- SHA256: 9d08dd78674aae1850977be740ff2d298b6eb6fea6dd68afc64e7445c35f2879
- MD5: 51efd51f32aea5060cbeb238e4b7e327
- BLAKE2b-256: 0ff9253f7215d941d876b1d041519bf42e7df524e784f990ca0e095a16485072

tensorflow_io_nightly-0.17.0.dev20210130040526-cp36-cp36m-win_amd64.whl
- SHA256: 2cc4716c23c17cc3499650953d3d1d150d64e4a202afa7a10c12e7e823ce73e4
- MD5: 90849b115b134c1d82da94562a305e42
- BLAKE2b-256: 7ab1e8059a2d76f16036e89bc75e7c3d79d21b04ae216ba826b8f7f63263be7a

tensorflow_io_nightly-0.17.0.dev20210130040526-cp36-cp36m-manylinux2010_x86_64.whl
- SHA256: f6d100694ae4d433e86fe66fa332c8da636c18d9297c330b7fe425633b984661
- MD5: ef3eed73664fc70fc4f272c764bce05a
- BLAKE2b-256: 838dd3e11868aa1cf6b3b1293d47cfd03cbe0ba616a914163f738d84b1409795

tensorflow_io_nightly-0.17.0.dev20210130040526-cp36-cp36m-macosx_10_13_x86_64.whl
- SHA256: ea6a16a04ee8407fe9759f1653b70ee8870043eff73f2395c8f8dcd7d98940d0
- MD5: 1f92b33a8241652c31912e800a4c054f
- BLAKE2b-256: e614688e0cb8929249524883f551ac0d3965a72e7623ae7be388324a55ee0f23

See more details on using hashes here.
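To check a downloaded wheel against the published SHA256 digests above, the digest can be recomputed locally with the standard library (a generic sketch; the filename in the comment is illustrative):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Compute the SHA256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare the result against the published digest for the file you downloaded, e.g.:
# sha256_of("tensorflow_io_nightly-0.17.0.dev20210130040526-cp38-cp38-win_amd64.whl")
```

Reading in chunks keeps memory use constant even for the 20+ MB wheels listed above.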
