TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of supported file systems and file formats by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download and save datasets to a local directory.

NOTE: Since tensorflow-io detects and uncompresses the MNIST dataset automatically when needed, the URLs of the compressed (gzip) files can be passed to the API call as is.
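
To illustrate what "detect and uncompress automatically" can mean in practice, here is a minimal sketch using only the Python standard library. It is not how tensorflow-io is implemented internally (that is not described here); the helper name maybe_gunzip is hypothetical. It relies on the fact that every gzip stream starts with the two bytes 0x1f 0x8b.

```python
import gzip

# Every gzip stream begins with these two magic bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(data: bytes) -> bytes:
    """Return decompressed bytes if `data` looks like gzip, else `data` unchanged.

    Hypothetical helper for illustration only; not part of tensorflow-io.
    """
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


# Both the compressed and the uncompressed form yield the same payload.
payload = b"example payload"
assert maybe_gunzip(gzip.compress(payload)) == payload
assert maybe_gunzip(payload) == payload
```

A reader built this way accepts either form of the file, which is why the caller does not have to say whether the input is compressed.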

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
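
After installing, it can be useful to confirm which versions ended up in the environment. The following is a small sketch, assuming Python 3.8 or newer (for importlib.metadata); the helper name installed_version is hypothetical.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report the versions of the two packages that need to match.
for name in ("tensorflow", "tensorflow-io"):
    print(name, installed_version(name) or "not installed")
```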

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version   TensorFlow Compatibility   Release Date
0.19.1                   2.5.x                      Jul 25, 2021
0.19.0                   2.5.x                      Jun 25, 2021
0.18.0                   2.5.x                      May 13, 2021
0.17.1                   2.4.x                      Apr 16, 2021
0.17.0                   2.4.x                      Dec 14, 2020
0.16.0                   2.3.x                      Oct 23, 2020
0.15.0                   2.3.x                      Aug 03, 2020
0.14.0                   2.2.x                      Jul 08, 2020
0.13.0                   2.2.x                      May 10, 2020
0.12.0                   2.1.x                      Feb 28, 2020
0.11.0                   2.1.x                      Jan 10, 2020
0.10.0                   2.0.x                      Dec 05, 2019
0.9.1                    2.0.x                      Nov 15, 2019
0.9.0                    2.0.x                      Oct 18, 2019
0.8.1                    1.15.x                     Nov 15, 2019
0.8.0                    1.15.x                     Oct 17, 2019
0.7.2                    1.14.x                     Nov 15, 2019
0.7.1                    1.14.x                     Oct 18, 2019
0.7.0                    1.14.x                     Jul 14, 2019
0.6.0                    1.13.x                     May 29, 2019
0.5.0                    1.13.x                     Apr 12, 2019
0.4.0                    1.13.x                     Mar 01, 2019
0.3.0                    1.12.0                     Feb 15, 2019
0.2.0                    1.12.0                     Jan 29, 2019
0.1.0                    1.12.0                     Dec 16, 2018
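
The table can also be turned into a small lookup. The sketch below is illustrative only: the mapping keeps the newest TensorFlow I/O release listed above for each TensorFlow series, and the function names recommended_tfio and pip_requirement are hypothetical, not part of tensorflow-io.

```python
from typing import Optional

# Newest TensorFlow I/O release listed in the table for each TensorFlow series.
LATEST_TFIO_FOR_TF = {
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def recommended_tfio(tf_version: str) -> Optional[str]:
    """Map a TensorFlow version such as '2.5.0' to a TensorFlow I/O version."""
    major_minor = ".".join(tf_version.split(".")[:2])
    return LATEST_TFIO_FOR_TF.get(major_minor)


def pip_requirement(tf_version: str) -> Optional[str]:
    """Build a pinned pip requirement, or None if the series is not in the table."""
    tfio_version = recommended_tfio(tf_version)
    return f"tensorflow-io=={tfio_version}" if tfio_version else None


print(pip_requirement("2.5.0"))  # tensorflow-io==0.19.1
```

The returned string can be passed to pip install as is. A series missing from the table yields None rather than a guess.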

Performance Benchmarking

We use GitHub Pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

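#!/usr/bin/env bash

# List the wheels found in dist/.
ls dist/*
# Repair each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files written by the container are owned by root; restore ownership.
sudo chown -R "$(id -nu)":"$(id -ng)" .
ls wheelhouse/*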

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are done on a variety of systems with different python3 versions to ensure good coverage:

Python   Ubuntu 18.04         Ubuntu 20.04         macOS + osx9         Windows-2019
2.7      :heavy_check_mark:   :heavy_check_mark:   :heavy_check_mark:   N/A
3.7      :heavy_check_mark:   :heavy_check_mark:   :heavy_check_mark:   :heavy_check_mark:
3.8      :heavy_check_mark:   :heavy_check_mark:   :heavy_check_mark:   :heavy_check_mark:

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as Kinesis, PubSub, and Azure Storage, are done through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-win_amd64.whl (1.4 MB)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.3 MB)

Uploaded CPython 3.9 manylinux: glibc 2.12+ x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-macosx_10_14_x86_64.whl (1.6 MB)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-win_amd64.whl (1.4 MB)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.3 MB)

Uploaded CPython 3.8 manylinux: glibc 2.12+ x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-macosx_10_14_x86_64.whl (1.6 MB)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-win_amd64.whl (1.4 MB)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.3 MB)

Uploaded CPython 3.7m manylinux: glibc 2.12+ x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-macosx_10_14_x86_64.whl (1.6 MB)

Uploaded CPython 3.7m macOS 10.14+ x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-win_amd64.whl (1.4 MB)

Uploaded CPython 3.6m Windows x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.3 MB)

Uploaded CPython 3.6m manylinux: glibc 2.12+ x86-64

tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-macosx_10_14_x86_64.whl (1.6 MB)

Uploaded CPython 3.6m macOS 10.14+ x86-64
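
When choosing among the files above, it helps to know that wheel filenames follow a fixed layout: distribution, version, an optional build tag, then the Python, ABI, and platform tags, separated by hyphens. A platform tag may list several alternatives separated by dots. The sketch below splits a filename into those parts; the names WheelTags and parse_wheel_filename are hypothetical, and the parser is intentionally minimal rather than a replacement for a packaging library.

```python
from typing import List, NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tags: List[str]


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # 5 parts without a build tag, 6 parts with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    distribution, version = parts[0], parts[1]
    python_tag, abi_tag, platform = parts[-3:]
    # A compressed platform tag lists alternatives separated by dots.
    return WheelTags(distribution, version, python_tag, abi_tag, platform.split("."))


tags = parse_wheel_filename(
    "tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
print(tags.python_tag, tags.platform_tags)
```

For the filename above, the Python tag cp39 corresponds to CPython 3.9, and the two platform tags are both manylinux names for x86-64 Linux.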

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 1.4 MB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.8.10

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 47c0e35f4c591b7894ca2361184528f7b30f3cbf134634f117b0577e47aa8a43
MD5 a9917508af2765d327934f142403d3ff
BLAKE2b-256 cb86153ca2a75def9418675669c35f5ec1f64cace10cefe1da65b6ac4197df70

See more details on using hashes here.
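
A downloaded file can be checked against the SHA256 digest listed for it. The following is a minimal sketch using the standard library; the function names sha256_of_file and matches_sha256 are hypothetical, and the local path you pass in is whatever location the file was saved to.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large wheels are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_sha256(path: str, expected_hex: str) -> bool:
    """Compare a file's SHA256 digest with an expected hex string (case-insensitive)."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

For example, matches_sha256 called with the local path of tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-win_amd64.whl and the SHA256 value listed above should return True for an intact download.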

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 1ca149fa2874e768558540b3e9059b9d50cc86f2683933c165a4ca54769080a4
MD5 0cc9ad599e764afc20476cd773d1488e
BLAKE2b-256 264cfd803e64d1fe4efd5138de0cf37f726faa69b2ac27913e2e1790f69727e3

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 190a432548de5ad2ef1d5ffaafafc3297ffa8e355baa7a54d1ff9dd4bdad4223
MD5 e7b593a60dca8be9604d3267251697c2
BLAKE2b-256 3a03545ac988ba9545a30419b41370d5a1a9754a599c13842c7c365df95dbe65

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 1.4 MB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.8.10

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 b2d5fa7b8d65464cbd2b52274135a474a4d81a91c895349c7cbf1dda6d45d8d8
MD5 fb295d6a1f7ea91e3374e84dce227816
BLAKE2b-256 97778374238ca073a55cd374440bab309171014943fdac38bf99ca5497af0f17

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 c852088556a07ae0a81639bda367861efd4e46fa21740faad3d16a3c95f6f11c
MD5 e2a14df8d41f22e84466652bed361cfe
BLAKE2b-256 128b0105c12a801a70c214ad0b92aa563ade1a1d4cef506507892fbd38a76224

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 70b56e1a633493453e15d614c42c4cc7e7c4b7092496666b763a6fc3cf96a4c4
MD5 69d92bff4e1f4e96bcd73f4454b63bd3
BLAKE2b-256 adb5a961f1c56ea7b6fc1b108d6987da01de60395f2a54eb57b6a9088989a091

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 1.4 MB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.8.10

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 77186ef42fe85119ccd97356065acd1d69651200160d944bda0711b48e9cce96
MD5 13faaa54653152f795acbf7b32f3125c
BLAKE2b-256 2a56827edfff2c0d4485cafdb843b835173731e576f72cf0649f417968445662

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 9cd718af3f51d08886ca862ee0d3f1696ebea146cf9826b93edc946c053e0ddb
MD5 9f0940cca7a741d333c2547c07d0915c
BLAKE2b-256 f8a1d9daface8a565a109a71b45f385ac1fb3049d91463315152d9693acc80b2

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 ac155105ee2b1a28016aaf04a9783ab15aaebc407f7fe9886bf0405f5fa233d9
MD5 09629e5b6a880a3df2359f916caff9ad
BLAKE2b-256 7c8edd26eaafa6548cbac8095dda170829b7f3c14faa05e1ded92cfed4404941

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 1.4 MB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.2 importlib_metadata/4.6.1 pkginfo/1.7.1 requests/2.26.0 requests-toolbelt/0.9.1 tqdm/4.61.2 CPython/3.8.10

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 a61924fd6ae845974dda1f5962a9693ff14a84879a6730d0c858ea3a2045944e
MD5 c533d20b04c28d25a94d127084d65699
BLAKE2b-256 f6dd5d8fc1aa7e27501070c21d729d98bf586bb5dc88bd3671e4a84651d4c829

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 3d7dbf377d185793d5643256ef80d869b21af2be76af1117bb1bc2ab8c5d5258
MD5 c5b81bb9b404e25c12940664aa533b98
BLAKE2b-256 93af6c2e68e8569b90c08f7c06ce1fc851cc9204f724871d485a9ac01d1c48df

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-macosx_10_14_x86_64.whl.

File metadata

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.1-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 01d79022cd385f92adc7861f3d99d86c328ef116c2795e125754cac11780c607
MD5 900f980471b88fb2c3beb512020d82db
BLAKE2b-256 c8bb043332ace89c588ff11c63b6d30ba01a788ba90e38931b1c85d61e919048
