PyTorch extension for handling deeply nested sequences of variable length

FoldedTensor: PyTorch extension for handling deeply nested sequences of variable length

foldedtensor is a PyTorch extension that provides efficient handling of tensors containing deeply nested sequences of variable sizes. It enables the flattening/unflattening (or unfolding/folding) of data dimensions based on an inner structure of sequence lengths. This library is particularly useful when working with data that can be split in different ways, since it lets you avoid committing to a single fixed representation.
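
To make the idea of folding and unfolding from stored lengths more concrete, here is a minimal pure-Python sketch. It is not the library's implementation (which works on tensors and is written in C++); the helper names `flatten` and `fold` are hypothetical and exist only for this illustration. It uses the same toy data as the Example section: samples contain lines, and lines contain words.

```python
# Nested input: samples -> lines -> words (same toy data as the Example section)
nested = [
    [[1], [], [], [], [2, 3]],
    [[4, 3]],
]


def flatten(nested):
    """Unfold everything: keep only the words, in order."""
    return [word for sample in nested for line in sample for word in line]


def fold(flat, group_sizes, pad=0):
    """Re-pad a flat list into one row per group, using the given group sizes."""
    width = max(group_sizes, default=0)
    rows, start = [], 0
    for size in group_sizes:
        rows.append(flat[start:start + size] + [pad] * (width - size))
        start += size
    return rows


# The "inner structure of sequence lengths": one list of lengths per nesting level
lines_per_sample = [len(sample) for sample in nested]
words_per_line = [len(line) for sample in nested for line in sample]

# Words per sample: sum the word counts of the lines belonging to each sample
words_per_sample, start = [], 0
for n_lines in lines_per_sample:
    words_per_sample.append(sum(words_per_line[start:start + n_lines]))
    start += n_lines

flat = flatten(nested)
by_line = fold(flat, words_per_line)      # one row per line
by_sample = fold(flat, words_per_sample)  # one row per sample

print(flat)
print(by_line)
print(by_sample)
```

The flat values never change; only the group sizes used for re-padding do, which is why the same data can be viewed per line or per sample without picking one layout up front.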

Installation

The library can be installed with pip:

pip install foldedtensor

Features

  • Support for arbitrary numbers of nested dimensions
  • No computational overhead when dealing with already padded tensors
  • Dynamic re-padding (or refolding) of data based on stored inner lengths
  • Automatic mask generation and updating whenever the tensor is refolded
  • C++ optimized code for fast data loading from Python lists and refolding
  • Flexibility in data representation, making it easy to switch between different layouts when needed
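
As a rough illustration of how a padding mask can be derived from stored lengths and then used to re-pad flat data, here is a sketch in plain PyTorch. It does not use foldedtensor and does not claim to mirror its internals; the function name `padding_mask` and the convention that `True` marks a real (non-padded) cell are assumptions made for this example.

```python
import torch


def padding_mask(lengths: torch.Tensor) -> torch.Tensor:
    """Boolean mask of shape (n_rows, max_len); True marks real (non-padded) cells."""
    max_len = int(lengths.max()) if lengths.numel() > 0 else 0
    positions = torch.arange(max_len)
    # Broadcasting: (1, max_len) < (n_rows, 1) -> (n_rows, max_len)
    return positions[None, :] < lengths[:, None]


flat = torch.tensor([1, 2, 3, 4, 3])               # all words, no padding
words_per_sample = torch.tensor([3, 2])            # lengths for a (samples, words) layout
words_per_line = torch.tensor([1, 0, 0, 0, 2, 2])  # lengths for a (lines, words) layout

mask_by_sample = padding_mask(words_per_sample)
mask_by_line = padding_mask(words_per_line)

# Re-padding: scatter the flat values into the True cells of a zero-filled tensor
padded_by_sample = torch.zeros(mask_by_sample.shape, dtype=flat.dtype)
padded_by_sample[mask_by_sample] = flat

# Flattening again: select the True cells
recovered = padded_by_sample[mask_by_sample]

print(mask_by_sample)
print(padded_by_sample)
```

Changing the layout only requires a different lengths vector, so the mask can be recomputed each time the data is re-padded.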

Example

import torch
from foldedtensor import as_folded_tensor

# Creating a folded tensor from a nested list
# There are 2 samples, the first with 5 lines, the second with 1 line.
# Each line contains between 0 and 2 words.
ft = as_folded_tensor(
    [
        [[1], [], [], [], [2, 3]],
        [[4, 3]],
    ],
    data_dims=("samples", "words"),
    full_names=("samples", "lines", "words"),
    dtype=torch.long,
)
print(ft)
# FoldedTensor([[1, 2, 3],
#               [4, 3, 0]])

# Refold on the lines and words dims (flatten the samples dim)
print(ft.refold(("lines", "words")))
# FoldedTensor([[1, 0],
#               [0, 0],
#               [0, 0],
#               [0, 0],
#               [2, 3],
#               [4, 3]])

# Refold on the words dim only: flatten everything
print(ft.refold(("words",)))
# FoldedTensor([1, 2, 3, 4, 3])

# Working with PyTorch operations
embedder = torch.nn.Embedding(10, 16)
embedding = embedder(ft.refold(("words",)))
print(embedding.shape)
# torch.Size([5, 16]) # 5 words total, 16 dims

refolded_embedding = embedding.refold(("samples", "words"))
print(refolded_embedding.shape)
# torch.Size([2, 3, 16]) # 2 samples, 3 words max, 16 dims

Comparison with alternatives

Unlike other ragged or nested tensor implementations, a FoldedTensor does not enforce a specific structure on the nested data, and does not require padding all dimensions. This gives the user greater flexibility when working with data that can be arranged in multiple ways depending on the data transformation. Moreover, data loading from Python lists and refolding are implemented in C++, which keeps these operations fast when handling deeply nested tensors.
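
To give a sense of what "not padding all dimensions" saves, the following pure-Python sketch counts cells for the toy data from the Example section. It only counts cells and is not a benchmark; the variable names are illustrative.

```python
nested = [
    [[1], [], [], [], [2, 3]],
    [[4, 3]],
]

n_samples = len(nested)
max_lines = max(len(sample) for sample in nested)
max_words_per_line = max(len(line) for sample in nested for line in sample)

# Number of real values stored in the nested data
real_values = sum(len(line) for sample in nested for line in sample)

# Padding every dimension: a dense (samples, lines, words) array
fully_padded_cells = n_samples * max_lines * max_words_per_line

# Padding only the dims you need: a (samples, words) layout with lines flattened
words_per_sample = [sum(len(line) for line in sample) for sample in nested]
samples_by_words_cells = n_samples * max(words_per_sample)

print(real_values, fully_padded_cells, samples_by_words_cells)
```

For this data, the dense three-dimensional layout needs 20 cells for 5 real values, while the (samples, words) layout needs 6.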

Here is a comparison with other common implementations for handling nested sequences of variable length:

Feature                     NestedTensor   MaskedTensor   FoldedTensor
Inner data structure        Flat           Padded         Arbitrary
Max nesting level           1              1              Arbitrary
From nested Python lists    No             No             Yes
Layout conversion           To padded      No             Any
Reduction ops w/o padding   Yes            No             No

