TensorFlow I/O

TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example, with the data processing step replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)
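
The shuffle, map, and batch steps used above can be pictured in plain Python. The sketch below is only an illustration of the semantics (a bounded shuffle buffer, a per-element uint8 to [0, 1] float conversion, and fixed-size batching); it is not how tf.data or tensorflow-io implement them, and the helper names `shuffle`, `to_float`, and `batch` are hypothetical.

```python
import random
from typing import Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def shuffle(items: Iterable[T], buffer_size: int, seed: int = 0) -> Iterator[T]:
    """Yield items in a randomized order using a bounded buffer."""
    rng = random.Random(seed)
    buffer: List[T] = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new one in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer


def to_float(pixels: List[int], label: int) -> Tuple[List[float], int]:
    """Scale uint8 pixel values (0..255) into floats in [0, 1]."""
    return [p / 255.0 for p in pixels], label


def batch(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group consecutive items into lists of at most batch_size elements."""
    current: List[T] = []
    for item in items:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        # The final batch may be smaller than batch_size.
        yield current


if __name__ == "__main__":
    # Tiny stand-in for (image, label) pairs; real MNIST images are 28x28.
    data = [([i % 256], i) for i in range(100)]
    pipeline = batch((to_float(x, y) for x, y in shuffle(data, buffer_size=10)), 32)
    print([len(b) for b in pipeline])
```

A small buffer only mixes elements locally, which is why the example above uses a buffer_size (1024) that is large relative to the batch size.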

In the MNIST example above, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them to a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
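
To give a feel for what "detect and uncompress if needed" can look like, here is a minimal standard-library sketch. It is an assumption-laden illustration, not tensorflow-io's actual implementation: it checks for the two-byte gzip magic number and then reads the IDX header used by the MNIST files. The function names `maybe_decompress` and `parse_idx_header` are hypothetical.

```python
import gzip
import struct
from typing import Tuple

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(raw: bytes) -> bytes:
    """Return the gzip-decompressed payload, or raw unchanged if it is not gzip."""
    if raw[:2] == GZIP_MAGIC:
        return gzip.decompress(raw)
    return raw


def parse_idx_header(data: bytes) -> Tuple[int, Tuple[int, ...]]:
    """Parse an IDX header and return (dtype_code, dimensions).

    IDX layout: two zero bytes, one dtype byte (0x08 = unsigned byte),
    one byte with the number of dimensions, then one big-endian uint32
    per dimension.
    """
    zero, dtype_code, ndim = struct.unpack(">HBB", data[:4])
    if zero != 0:
        raise ValueError("not an IDX file: first two bytes must be zero")
    dims = struct.unpack(f">{ndim}I", data[4 : 4 + 4 * ndim])
    return dtype_code, dims


if __name__ == "__main__":
    # Build a tiny fake image file: 2 images of 28x28 zero pixels.
    plain = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28) + bytes(2 * 28 * 28)
    for payload in (plain, gzip.compress(plain)):
        print(parse_idx_header(maybe_decompress(payload)))
```

Both the plain and the gzip-compressed payload produce the same header, which is the behavior the note describes from the caller's point of view.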

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly
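
After installing, it can be handy to confirm which of the packages ended up in the environment. The following is a small sketch using only the standard library (`importlib.metadata`, Python 3.8+); the helper names `installed_version` and `report` are hypothetical and not part of tensorflow-io.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def report(
    distributions: Sequence[str] = ("tensorflow", "tensorflow-io", "tensorflow-io-nightly"),
) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in distributions}


if __name__ == "__main__":
    for name, version in report().items():
        print(f"{name}: {version or 'not installed'}")
```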

Docker Images

In addition to the pip packages, Docker images can be used to get started quickly.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version | TensorFlow Compatibility | Release Date
0.19.0 | 2.5.x | Jun 25, 2021
0.18.0 | 2.5.x | May 13, 2021
0.17.1 | 2.4.x | Apr 16, 2021
0.17.0 | 2.4.x | Dec 14, 2020
0.16.0 | 2.3.x | Oct 23, 2020
0.15.0 | 2.3.x | Aug 03, 2020
0.14.0 | 2.2.x | Jul 08, 2020
0.13.0 | 2.2.x | May 10, 2020
0.12.0 | 2.1.x | Feb 28, 2020
0.11.0 | 2.1.x | Jan 10, 2020
0.10.0 | 2.0.x | Dec 05, 2019
0.9.1 | 2.0.x | Nov 15, 2019
0.9.0 | 2.0.x | Oct 18, 2019
0.8.1 | 1.15.x | Nov 15, 2019
0.8.0 | 1.15.x | Oct 17, 2019
0.7.2 | 1.14.x | Nov 15, 2019
0.7.1 | 1.14.x | Oct 18, 2019
0.7.0 | 1.14.x | Jul 14, 2019
0.6.0 | 1.13.x | May 29, 2019
0.5.0 | 1.13.x | Apr 12, 2019
0.4.0 | 1.13.x | Mar 01, 2019
0.3.0 | 1.12.0 | Feb 15, 2019
0.2.0 | 1.12.0 | Jan 29, 2019
0.1.0 | 1.12.0 | Dec 16, 2018
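
The table above can also be used programmatically. The sketch below copies its version pairs into a list and looks up the newest TensorFlow I/O release listed for a given TensorFlow version. The helper names are hypothetical, and the matching rule is an assumption: an entry such as 2.5.x matches any 2.5 patch release, while 1.12.0 is matched exactly.

```python
from typing import List, Optional, Tuple

# (TensorFlow I/O version, TensorFlow compatibility), newest first, as in the table.
COMPATIBILITY: List[Tuple[str, str]] = [
    ("0.19.0", "2.5.x"), ("0.18.0", "2.5.x"),
    ("0.17.1", "2.4.x"), ("0.17.0", "2.4.x"),
    ("0.16.0", "2.3.x"), ("0.15.0", "2.3.x"),
    ("0.14.0", "2.2.x"), ("0.13.0", "2.2.x"),
    ("0.12.0", "2.1.x"), ("0.11.0", "2.1.x"),
    ("0.10.0", "2.0.x"), ("0.9.1", "2.0.x"), ("0.9.0", "2.0.x"),
    ("0.8.1", "1.15.x"), ("0.8.0", "1.15.x"),
    ("0.7.2", "1.14.x"), ("0.7.1", "1.14.x"), ("0.7.0", "1.14.x"),
    ("0.6.0", "1.13.x"), ("0.5.0", "1.13.x"), ("0.4.0", "1.13.x"),
    ("0.3.0", "1.12.0"), ("0.2.0", "1.12.0"), ("0.1.0", "1.12.0"),
]


def matches(tf_version: str, pattern: str) -> bool:
    """Check a TensorFlow version against a table entry like '2.5.x' or '1.12.0'."""
    if pattern.endswith(".x"):
        # Compare only major and minor components.
        return tf_version.split(".")[:2] == pattern.split(".")[:2]
    return tf_version == pattern


def newest_tfio_for(tf_version: str) -> Optional[str]:
    """Return the newest listed TensorFlow I/O version, or None if none is listed."""
    for tfio_version, pattern in COMPATIBILITY:
        if matches(tf_version, pattern):
            return tfio_version
    return None


if __name__ == "__main__":
    print(newest_tfio_for("2.5.0"))
```

A result of None only means the table has no entry for that TensorFlow version; the list of releases remains the authoritative source.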

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance across commits.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation. Please see:

Build Status and CI

Build Status
Linux CPU Python 2 Status
Linux CPU Python 3 Status
Linux GPU Python 2 Status
Linux GPU Python 3 Status

Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

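#!/usr/bin/env bash

# Show the wheels that are about to be repaired.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# Files created from inside the container may be owned by root, so hand ownership back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
# The repaired wheels are written to wheelhouse/.
ls wheelhouse/*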

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.

On macOS, the same command can be used. However, the script expects python to be available in the shell and will only generate a whl package that matches that version of Python. To build a whl package for a specific Python version, alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note that the above command is also the one we use when releasing packages for Linux and macOS.
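
Since each wheel only matches one Python version, it can help to check a wheel's tags before installing it. The sketch below parses a wheel filename following the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag) and compares the Python tag with an interpreter version. The helper names are hypothetical; the filename in the usage example is taken from the built distributions listed further down this page.

```python
import sys
from typing import NamedTuple, Sequence, Tuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tags: Tuple[str, ...]
    abi_tags: Tuple[str, ...]
    platform_tags: Tuple[str, ...]


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # Five parts without a build tag, six with one.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    # Each tag may be a '.'-separated set of alternatives.
    return WheelTags(
        distribution=parts[0],
        version=parts[1],
        python_tags=tuple(python_tag.split(".")),
        abi_tags=tuple(abi_tag.split(".")),
        platform_tags=tuple(platform_tag.split(".")),
    )


def matches_cpython(tags: WheelTags, version_info: Sequence[int] = sys.version_info) -> bool:
    """Return True if the wheel carries a CPython tag for the given version."""
    wanted = f"cp{version_info[0]}{version_info[1]}"
    return wanted in tags.python_tags


if __name__ == "__main__":
    name = (
        "tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-"
        "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
    )
    tags = parse_wheel_filename(name)
    print(tags)
    print("matches running interpreter:", matches_cpython(tags))
```

This only inspects the filename. pip performs the authoritative compatibility check, including the ABI and platform tags.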

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python | Ubuntu 18.04 | Ubuntu 20.04 | macOS + osx9 | Windows-2019
2.7 | ✔ | ✔ | ✔ | N/A
3.7 | ✔ | ✔ | ✔ | ✔
3.8 | ✔ | ✔ | ✔ | ✔

TensorFlow I/O has integrations with many systems and cloud vendors, such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, and Alibaba Cloud OSS.

We do our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, run against live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Others, such as those for Kinesis, PubSub, and Azure Storage, run against official or unofficial emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.

System | Live System | Emulator | CI Integration | Offline
Apache Kafka | ✔ | | ✔ |
Apache Ignite | ✔ | | ✔ |
Prometheus | ✔ | | ✔ |
Google PubSub | | ✔ | ✔ |
Azure Storage | | ✔ | ✔ |
AWS Kinesis | | ✔ | ✔ |
Alibaba Cloud OSS | | | | ✔
Google BigTable/BigQuery | | to be added | |
Elasticsearch (experimental) | ✔ | | ✔ |
MongoDB (experimental) | ✔ | | ✔ |

References for emulators:

Community

Additional Information

License

Apache License 2.0

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-win_amd64.whl (1.7 MB view details)

Uploaded CPython 3.9 Windows x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.3 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.12+ x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-macosx_10_14_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.9 macOS 10.14+ x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-win_amd64.whl (1.7 MB view details)

Uploaded CPython 3.8 Windows x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.3 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.12+ x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-macosx_10_14_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.8 macOS 10.14+ x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-win_amd64.whl (1.7 MB view details)

Uploaded CPython 3.7m Windows x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.3 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.12+ x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-macosx_10_14_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.7m macOS 10.14+ x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-win_amd64.whl (1.7 MB view details)

Uploaded CPython 3.6m Windows x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (2.3 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.12+ x86-64

tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-macosx_10_14_x86_64.whl (1.8 MB view details)

Uploaded CPython 3.6m macOS 10.14+ x86-64

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 1.7 MB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.5

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 1135f64e57e69dc5d290869fb7183406239dce194c311edb1f2a84af9fec0d35
MD5 068afef8d4258748c341c93d84c5d728
BLAKE2b-256 ee4a26f55f3faeb6ef8e8738dfd75b6fac805169197f2dfe1787768809ab5077

See more details on using hashes here.
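
As a rough illustration of how the digests listed for each file can be used, the sketch below hashes a downloaded file in chunks and compares the result with an expected hex digest. It uses only `hashlib`; the function names `file_digest` and `verify` and the "blake2b-256" label are hypothetical conventions for this example, not part of pip or PyPI tooling.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    if algorithm == "blake2b-256":
        # BLAKE2b with a 32-byte (256-bit) digest, as listed in the hash tables.
        hasher = hashlib.blake2b(digest_size=32)
    else:
        hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def verify(path: str, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Return True if the file's digest equals the expected hex digest."""
    return file_digest(path, algorithm) == expected_hex.strip().lower()


if __name__ == "__main__":
    # Assumes the wheel was downloaded into the current directory.
    wheel = "tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-win_amd64.whl"
    expected = "1135f64e57e69dc5d290869fb7183406239dce194c311edb1f2a84af9fec0d35"
    try:
        print("SHA256 matches:", verify(wheel, expected))
    except FileNotFoundError:
        print(f"{wheel} not found; download it first")
```

pip can perform the same check during installation when hashes are pinned in a requirements file, so a manual check like this is mainly useful for files downloaded by hand.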

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 f0a851d82f356f50ef4423ac9a3a510193c981f97a039031fe8aff0deae42a46
MD5 4c1c3e3564525b1c1fa7bc20d047d756
BLAKE2b-256 7b8ab9dda6c37fde85f80f8b69dc4c1bb0fcac922899c1a4abaa97e7d747f4c9

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp39-cp39-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 6c543fe3250cf368e32abcc8066276acb4baad9c2f9fddbb312c99ecaf787c40
MD5 86a5312f91e207ae429c3ac1cd3cb3d8
BLAKE2b-256 b757319cc30c643a579e0ee014da2c3bef9a470298e81d3ffc3a8699ab7c77bd

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 1.7 MB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.5

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 e6990a3063dac3e2ca177986d6d43d1c6ada67f4fd52bcf36bb5249d895460c5
MD5 28917bc33140e948123ee31f71c3da5a
BLAKE2b-256 82bec80fce3a26743fb7ce36be1081169ce6003c2a461cc584b94e0a78753ace

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 d5cc65aec979a6329a360d8e6d10c13f13cac404892bb6df797080f406e6be51
MD5 ef0adaa83af3facfcefebf61dd77d787
BLAKE2b-256 5318d1cf54cadfdc4b64c6cab1bb6df62913c5c6f478727ef2dd4bb79d30bdca

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp38-cp38-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 eb3a7b2861f0d090880d10a6ff5a44f5da538089bcc3a2e66f69aca5d9d6ec70
MD5 041e2df3825f4ed0f05818172ebb2808
BLAKE2b-256 d38a8b489147b7fb6857dd8bec00445132a3a30556e1cf2a87c872d04af2f71b

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 1.7 MB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.5

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 515368da8ef7ae4c158ff0fab4857d831634426d0174a764bd228c8de12ff349
MD5 d6539d1e84ff24efe9a8e305c42c8a15
BLAKE2b-256 67ba0072f858251183ae41e1d3602a3c415d9a0a639a6a0dbd2850354f12bee6

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 8862604ce96aa664e0fba939e3d7f4b920567c871fa2f4aad112e1cf40866498
MD5 e5afa418545467de18ee58ad56d42938
BLAKE2b-256 5d3a5c1cc819ff1adfd47fa119a8b904a12207c64bdb1f61f2ef726f03a0cdc6

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp37-cp37m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 89e5296775c8f02c244f09f3f105eae36123c825ada88822196c9fd50a711583
MD5 43d346ad3d11113b7de2db2dfa5b1693
BLAKE2b-256 d058733a2608c556669b082a90d01ce9ecc0498add2c32cbd3eb229561d91149

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-win_amd64.whl.

File metadata

  • Download URL: tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-win_amd64.whl
  • Upload date:
  • Size: 1.7 MB
  • Tags: CPython 3.6m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.4.1 importlib_metadata/4.5.0 pkginfo/1.7.0 requests/2.25.1 requests-toolbelt/0.9.1 tqdm/4.61.1 CPython/3.8.5

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-win_amd64.whl
Algorithm Hash digest
SHA256 40a17de5e1e1f72564b0b2dbcc661b30c09c6f30bfb722ef331fd8d03fa9e31c
MD5 077dfe189645d90be9618b3303d58f54
BLAKE2b-256 f8fbc909d5724ebc96bf0e90501afada8aac73d738b5d3c6d278686ecf89235b

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl
Algorithm Hash digest
SHA256 f6c055d2e49af5e387ec9d9ff408343a52cbfd9158b553acc58ad082e1cb8f93
MD5 1dafea7e781bd865daf770bd5dfe22fd
BLAKE2b-256 cd1ce8471ece1fc4808e521e9b668b1928c44857fb4e7a8c1085c5b232aa4d66

File details

Details for the file tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-macosx_10_14_x86_64.whl.

File hashes

Hashes for tensorflow_io_gcs_filesystem-0.19.0-cp36-cp36m-macosx_10_14_x86_64.whl
Algorithm Hash digest
SHA256 f75bed42e661c7c7815629314918a5ee189d4a4ecab4a163be9e9a991aadef67
MD5 d0d107efd13503e8ad0a4908a1cf7cd9
BLAKE2b-256 9871954b99671d9a09315bcc1d81082c462a643700c16209bc0cc93795f48512
