TensorFlow I/O


TensorFlow I/O is a collection of file systems and file formats that are not available in TensorFlow's built-in support. A full list of the file systems and file formats supported by TensorFlow I/O can be found here.

Using tensorflow-io with Keras is straightforward. Below is the Get Started with TensorFlow example with the data processing aspect replaced by tensorflow-io:

import tensorflow as tf
import tensorflow_io as tfio

# Read the MNIST data into the IODataset.
dataset_url = "https://storage.googleapis.com/cvdf-datasets/mnist/"
d_train = tfio.IODataset.from_mnist(
    dataset_url + "train-images-idx3-ubyte.gz",
    dataset_url + "train-labels-idx1-ubyte.gz",
)

# Shuffle the elements of the dataset.
d_train = d_train.shuffle(buffer_size=1024)

# By default image data is uint8, so convert to float32 using map().
d_train = d_train.map(lambda x, y: (tf.image.convert_image_dtype(x, tf.float32), y))

# Batch the data just like any other tf.data.Dataset.
d_train = d_train.batch(32)

# Build the model.
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation=tf.nn.softmax),
    ]
)

# Compile the model.
model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)

# Fit the model.
model.fit(d_train, epochs=5, steps_per_epoch=200)

In the above MNIST example, the URLs of the dataset files are passed directly to the tfio.IODataset.from_mnist API call. This works because tensorflow-io has built-in support for the HTTP/HTTPS file system, which eliminates the need to download the datasets and save them in a local directory.

NOTE: Since tensorflow-io is able to detect and uncompress the MNIST dataset automatically if needed, we can pass the URLs of the compressed (gzip) files to the API call as is.
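The idea behind automatic gzip detection can be pictured with a small standard-library sketch. This is not how tensorflow-io implements it internally. It only illustrates that a gzip stream can be recognized by its first two bytes and that the IDX container used by MNIST starts with a short header describing the array shape. The helper names maybe_decompress and read_idx_shape are hypothetical, and the sample file is built in memory rather than downloaded.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def maybe_decompress(data: bytes) -> bytes:
    """Return decompressed bytes if `data` is gzip-compressed, else `data` unchanged."""
    if data[:2] == GZIP_MAGIC:
        return gzip.decompress(data)
    return data


def read_idx_shape(data: bytes) -> tuple:
    """Parse the header of an IDX file and return the shape it declares."""
    raw = maybe_decompress(data)
    # Header layout: two zero bytes, one type-code byte, one dimension-count
    # byte, then one big-endian uint32 per dimension.
    zero, type_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0:
        raise ValueError("not an IDX file")
    return struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])


# Build a tiny in-memory IDX image file: 2 images of 28x28 uint8 pixels
# (type code 0x08 means unsigned byte).
header = struct.pack(">HBBIII", 0, 0x08, 3, 2, 28, 28)
payload = header + bytes(2 * 28 * 28)

print(read_idx_shape(payload))                 # uncompressed input
print(read_idx_shape(gzip.compress(payload)))  # gzip-compressed input
```

Both calls report the same shape, which is why the caller does not need to care whether the file behind a URL is compressed.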

Please check the official documentation for more detailed and interesting usages of the package.

Installation

Python Package

The tensorflow-io Python package can be installed with pip directly using:

$ pip install tensorflow-io

People who are a little more adventurous can also try our nightly binaries:

$ pip install tensorflow-io-nightly

To ensure you have a version of TensorFlow that is compatible with TensorFlow-IO, you can specify the tensorflow extra requirement during install:

$ pip install tensorflow-io[tensorflow]

Similar extras exist for the tensorflow-gpu, tensorflow-cpu and tensorflow-rocm packages.
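After installing, it can be useful to confirm which distributions actually ended up in the environment. The following is a minimal sketch using only the standard library (importlib.metadata, Python 3.8+). The helper name installed_version is hypothetical. Note that each TensorFlow variant is reported under its own distribution name.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Distribution names taken from the installation instructions above.
PACKAGES = (
    "tensorflow",
    "tensorflow-gpu",
    "tensorflow-cpu",
    "tensorflow-rocm",
    "tensorflow-io",
    "tensorflow-io-nightly",
)

for name in PACKAGES:
    print(f"{name}: {installed_version(name) or 'not installed'}")
```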

Docker Images

In addition to the pip packages, the docker images can be used to quickly get started.

For stable builds:

$ docker pull tfsigio/tfio:latest
$ docker run -it --rm --name tfio-latest tfsigio/tfio:latest

For nightly builds:

$ docker pull tfsigio/tfio:nightly
$ docker run -it --rm --name tfio-nightly tfsigio/tfio:nightly

R Package

Once the tensorflow-io Python package has been successfully installed, you can install the development version of the R package from GitHub via the following:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("tensorflow/io", subdir = "R-package")

TensorFlow Version Compatibility

To ensure compatibility with TensorFlow, it is recommended to install a matching version of TensorFlow I/O according to the table below. You can find the list of releases here.

TensorFlow I/O Version TensorFlow Compatibility Release Date
0.28.0 2.11.x Nov 21, 2022
0.27.0 2.10.x Sep 08, 2022
0.26.0 2.9.x May 17, 2022
0.25.0 2.8.x Apr 19, 2022
0.24.0 2.8.x Feb 04, 2022
0.23.1 2.7.x Dec 15, 2021
0.23.0 2.7.x Dec 14, 2021
0.22.0 2.7.x Nov 10, 2021
0.21.0 2.6.x Sep 12, 2021
0.20.0 2.6.x Aug 11, 2021
0.19.1 2.5.x Jul 25, 2021
0.19.0 2.5.x Jun 25, 2021
0.18.0 2.5.x May 13, 2021
0.17.1 2.4.x Apr 16, 2021
0.17.0 2.4.x Dec 14, 2020
0.16.0 2.3.x Oct 23, 2020
0.15.0 2.3.x Aug 03, 2020
0.14.0 2.2.x Jul 08, 2020
0.13.0 2.2.x May 10, 2020
0.12.0 2.1.x Feb 28, 2020
0.11.0 2.1.x Jan 10, 2020
0.10.0 2.0.x Dec 05, 2019
0.9.1 2.0.x Nov 15, 2019
0.9.0 2.0.x Oct 18, 2019
0.8.1 1.15.x Nov 15, 2019
0.8.0 1.15.x Oct 17, 2019
0.7.2 1.14.x Nov 15, 2019
0.7.1 1.14.x Oct 18, 2019
0.7.0 1.14.x Jul 14, 2019
0.6.0 1.13.x May 29, 2019
0.5.0 1.13.x Apr 12, 2019
0.4.0 1.13.x Mar 01, 2019
0.3.0 1.12.0 Feb 15, 2019
0.2.0 1.12.0 Jan 29, 2019
0.1.0 1.12.0 Dec 16, 2018
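The table above can be turned into a small lookup. The sketch below keeps only the most recent TensorFlow I/O release listed for each TensorFlow series. The function name compatible_tfio is hypothetical, and the mapping is only as current as the table itself.

```python
# Latest TensorFlow I/O release listed in the table for each TensorFlow series.
COMPATIBLE_TFIO = {
    "2.11": "0.28.0",
    "2.10": "0.27.0",
    "2.9": "0.26.0",
    "2.8": "0.25.0",
    "2.7": "0.23.1",
    "2.6": "0.21.0",
    "2.5": "0.19.1",
    "2.4": "0.17.1",
    "2.3": "0.16.0",
    "2.2": "0.14.0",
    "2.1": "0.12.0",
    "2.0": "0.10.0",
    "1.15": "0.8.1",
    "1.14": "0.7.2",
    "1.13": "0.6.0",
    "1.12": "0.3.0",
}


def compatible_tfio(tf_version: str):
    """Map a TensorFlow version such as '2.9.1' to a TensorFlow I/O release.

    Returns None when the TensorFlow series is not in the table.
    """
    major_minor = ".".join(tf_version.split(".")[:2])
    return COMPATIBLE_TFIO.get(major_minor)


print(compatible_tfio("2.9.1"))
print(compatible_tfio("1.15.0"))
```

The result can be used to pin the install, for example pip install tensorflow-io==0.26.0 for TensorFlow 2.9.x.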

Performance Benchmarking

We use github-pages to document the results of API performance benchmarks. The benchmark job is triggered on every commit to the master branch, which makes it possible to track performance changes from commit to commit.

Contributing

TensorFlow I/O is a community-led open source project. As such, the project depends on public contributions, bug fixes, and documentation.

Build Status and CI


Because of the manylinux2010 requirement, TensorFlow I/O is built with Ubuntu 16.04 + Developer Toolset 7 (GCC 7.3) on Linux. Configuring Ubuntu 16.04 with Developer Toolset 7 is not exactly straightforward. If the system has Docker installed, the following command will automatically build a manylinux2010-compatible whl package:

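#!/usr/bin/env bash

# List the wheels that are about to be repaired.
ls dist/*
# Run auditwheel repair on each wheel inside the manylinux2010 container.
for f in dist/*.whl; do
  docker run -i --rm -v "$PWD":/v -w /v --net=host quay.io/pypa/manylinux2010_x86_64 bash -x -e /v/tools/build/auditwheel repair --plat manylinux2010_x86_64 "$f"
done
# The container runs as root by default, so hand the files back to the current user.
sudo chown -R "$(id -nu):$(id -ng)" .
# The repaired wheels are written to wheelhouse/.
ls wheelhouse/*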

It takes some time to build, but once complete, there will be Python 3.5, 3.6, and 3.7 compatible whl packages available in the wheelhouse directory.
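To check which interpreters and platforms a wheel targets, its file name can be split into tags. The sketch below relies only on the standard wheel file name layout (distribution, version, optional build tag, Python tag, ABI tag, platform tag). The helper name wheel_tags is hypothetical, and the sample file name is illustrative rather than the output of a particular build.

```python
from pathlib import Path


def wheel_tags(filename: str) -> dict:
    """Split a wheel file name into its distribution, version, and tags."""
    name = Path(filename).name
    if not name.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = name[: -len(".whl")].split("-")
    if len(parts) < 5:
        raise ValueError(f"unexpected wheel file name: {filename}")
    # The last three fields are always the Python, ABI, and platform tags;
    # an optional build tag may sit between the version and the Python tag.
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "distribution": parts[0],
        "version": parts[1],
        "python": python_tag,
        "abi": abi_tag,
        # A repaired wheel may carry several platform tags joined by dots.
        "platforms": platform_tag.split("."),
    }


example = (
    "tensorflow_io_nightly-0.28.0.dev20221203124204-cp37-cp37m-"
    "manylinux_2_12_x86_64.manylinux2010_x86_64.whl"
)
tags = wheel_tags(example)
print(tags["python"], tags["abi"], tags["platforms"])
```

Applying wheel_tags to every file matched by Path("wheelhouse").glob("*.whl") gives a quick summary of what the build produced.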

On macOS, the same command can be used. However, the script expects python in the shell and will only generate a whl package that matches the version of python in the shell. If you want to build a whl package for a specific Python version, you have to alias that version to python in the shell. See the Auditwheel step in .github/workflows/build.yml for instructions on how to do that.

Note the above command is also the command we use when releasing packages for Linux and macOS.

TensorFlow I/O uses both GitHub Workflows and Google CI (Kokoro) for continuous integration. GitHub Workflows is used for macOS build and test. Kokoro is used for Linux build and test. Again, because of the manylinux2010 requirement, whl packages on Linux are always built with Ubuntu 16.04 + Developer Toolset 7. Tests are run on a variety of systems with different Python versions to ensure good coverage:

Python   Ubuntu 18.04   Ubuntu 20.04   macOS + osx9   Windows-2019
2.7      ✓              ✓              ✓              N/A
3.7      ✓              ✓              ✓              ✓
3.8      ✓              ✓              ✓              ✓
TensorFlow I/O has integrations with many systems and cloud vendors such as Prometheus, Apache Kafka, Apache Ignite, Google Cloud PubSub, AWS Kinesis, Microsoft Azure Storage, Alibaba Cloud OSS etc.

We try our best to test against those systems in our continuous integration whenever possible. Some tests, such as those for Prometheus, Kafka, and Ignite, are done with live systems, meaning we install Prometheus/Kafka/Ignite on the CI machine before the test is run. Some tests, such as those for Kinesis, PubSub, and Azure Storage, are done through official or non-official emulators. Offline tests are also performed whenever possible, though systems covered through offline tests may not have the same level of coverage as live systems or emulators.
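A common way to keep live-system tests runnable on machines where the service is absent is to probe the service first and skip the test otherwise. The sketch below shows that pattern with the standard library only. It is not taken from the project's test suite. The helper service_available is hypothetical, and localhost:9092 is assumed here only because 9092 is Kafka's default broker port.

```python
import socket
import unittest


def service_available(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed location of a locally installed broker (hypothetical for this sketch).
KAFKA_HOST, KAFKA_PORT = "localhost", 9092


@unittest.skipUnless(
    service_available(KAFKA_HOST, KAFKA_PORT), "Kafka broker not reachable"
)
class KafkaLiveTest(unittest.TestCase):
    """Runs only when a live broker answers on the assumed address."""

    def test_broker_reachable(self):
        self.assertTrue(service_available(KAFKA_HOST, KAFKA_PORT))
```

Saved in a test module and run with python -m unittest, the test is reported as skipped rather than failed on machines without a broker.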

Live System Emulator CI Integration Offline
Apache Kafka :heavy_check_mark: :heavy_check_mark:
Apache Ignite :heavy_check_mark: :heavy_check_mark:
Prometheus :heavy_check_mark: :heavy_check_mark:
Google PubSub :heavy_check_mark: :heavy_check_mark:
Azure Storage :heavy_check_mark: :heavy_check_mark:
AWS Kinesis :heavy_check_mark: :heavy_check_mark:
Alibaba Cloud OSS :heavy_check_mark:
Google BigTable/BigQuery to be added
Elasticsearch (experimental) :heavy_check_mark: :heavy_check_mark:
MongoDB (experimental) :heavy_check_mark: :heavy_check_mark:

References for emulators:

Community

Additional Information

License

Apache License 2.0



