PyTorch extension for handling deeply nested sequences of variable length


FoldedTensor: PyTorch extension for handling deeply nested sequences of variable length

foldedtensor is a PyTorch extension that provides efficient handling of tensors containing deeply nested sequences of variable sizes. It enables the flattening/unflattening (or unfolding/folding) of data dimensions based on an inner structure of sequence lengths. This library is particularly useful when working with data that can be split in different ways, since it lets you avoid committing to a single fixed representation.
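To make the idea of an "inner structure of sequence lengths" concrete, here is a minimal pure-Python sketch. It is not the library's implementation or API: the helper names `flatten_with_lengths` and `unflatten` are hypothetical and only illustrate that nested data can be stored as flat values plus per-level lengths, and rebuilt from them.

```python
def flatten_with_lengths(samples):
    """Flatten samples -> lines -> words into flat words plus stored lengths."""
    lines_per_sample = [len(sample) for sample in samples]
    words_per_line = [len(line) for sample in samples for line in sample]
    flat_words = [w for sample in samples for line in sample for w in line]
    return flat_words, lines_per_sample, words_per_line


def unflatten(flat_words, lines_per_sample, words_per_line):
    """Rebuild the nested lists from the flat words and the stored lengths."""
    # First regroup words into lines...
    lines, pos = [], 0
    for n in words_per_line:
        lines.append(flat_words[pos:pos + n])
        pos += n
    # ...then regroup lines into samples.
    samples, pos = [], 0
    for n in lines_per_sample:
        samples.append(lines[pos:pos + n])
        pos += n
    return samples


nested = [[[1], [], [], [], [2, 3]], [[4, 3]]]
flat, lines_per_sample, words_per_line = flatten_with_lengths(nested)
print(flat)              # [1, 2, 3, 4, 3]
print(lines_per_sample)  # [5, 1]
print(words_per_line)    # [1, 0, 0, 0, 2, 2]
```

Because the lengths are kept alongside the flat values, no information is lost when flattening, which is what makes switching between layouts possible later on.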

Installation

The library can be installed with pip:

pip install foldedtensor

Features

  • Support for arbitrary numbers of nested dimensions
  • No computational overhead when dealing with already padded tensors
  • Dynamic re-padding (or refolding) of data based on stored inner lengths
  • Automatic mask generation and updating whenever the tensor is refolded
  • C++ optimized code for fast data loading from Python lists and refolding
  • Flexibility in data representation, making it easy to switch between different layouts when needed
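The re-padding and mask features can be pictured with a small pure-Python sketch. This is an illustration under assumptions, not the library's code: the `pad` helper is hypothetical, and the real package does this work in C++ on tensors. The sketch pads the same nested data in two different layouts and derives a boolean mask marking real (non-padding) positions.

```python
def pad(groups, pad_value=0):
    """Pad variable-length lists to a common width and build a validity mask."""
    width = max((len(g) for g in groups), default=0)
    padded = [g + [pad_value] * (width - len(g)) for g in groups]
    mask = [[True] * len(g) + [False] * (width - len(g)) for g in groups]
    return padded, mask


nested = [[[1], [], [], [], [2, 3]], [[4, 3]]]

# Layout 1: one row per sample, lines merged together.
by_sample = [[w for line in sample for w in line] for sample in nested]
padded_samples, mask_samples = pad(by_sample)
print(padded_samples)  # [[1, 2, 3], [4, 3, 0]]
print(mask_samples)    # [[True, True, True], [True, True, False]]

# Layout 2: one row per line, samples merged together.
by_line = [line for sample in nested for line in sample]
padded_lines, mask_lines = pad(by_line)
print(padded_lines)    # [[1, 0], [0, 0], [0, 0], [0, 0], [2, 3], [4, 3]]
```

The mask has to be recomputed whenever the layout changes, since the padded positions differ between layouts.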

Example

import torch
from foldedtensor import as_folded_tensor

# Creating a folded tensor from a nested list
# There are 2 samples, the first with 5 lines, the second with 1 line.
# Each line contains between 0 and 2 words.
ft = as_folded_tensor(
    [
        [[1], [], [], [], [2, 3]],
        [[4, 3]],
    ],
    data_dims=("samples", "words"),
    full_names=("samples", "lines", "words"),
    dtype=torch.long,
)
print(ft)
# FoldedTensor([[1, 2, 3],
#               [4, 3, 0]])

# Refold on the lines and words dims (flatten the samples dim)
print(ft.refold(("lines", "words")))
# FoldedTensor([[1, 0],
#               [0, 0],
#               [0, 0],
#               [0, 0],
#               [2, 3],
#               [4, 3]])

# Refold on the words dim only: flatten everything
print(ft.refold(("words",)))
# FoldedTensor([1, 2, 3, 4, 3])

# Working with PyTorch operations
embedder = torch.nn.Embedding(10, 16)
embedding = embedder(ft.refold(("words",)))
print(embedding.shape)
# torch.Size([5, 16]) # 5 words total, 16 dims

refolded_embedding = embedding.refold(("samples", "words"))
print(refolded_embedding.shape)
# torch.Size([2, 3, 16]) # 2 samples, 3 words max, 16 dims

Comparison with alternatives

Unlike other ragged or nested tensor implementations, a FoldedTensor does not enforce a specific structure on the nested data, and does not require padding all dimensions. This gives the user greater flexibility when working with data that can be arranged in multiple ways depending on the data transformation. Moreover, data loading from Python lists and refolding are implemented in C++, which keeps these conversions fast even for deeply nested tensors.

Here is a comparison with other common implementations for handling nested sequences of variable length:

Feature                     NestedTensor   MaskedTensor   FoldedTensor
Inner data structure        Flat           Padded         Arbitrary
Max nesting level           1              1              Unlimited
From nested Python lists    No             No             Yes
Layout conversion           To padded      No             Any
Reduction ops w/o padding   Yes            No             No
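Since the table lists reductions on a FoldedTensor as operating on padded data, a reduction over a padded layout generally has to skip the padded positions. The following pure-Python sketch shows a masked mean over the words of each sample. The `masked_mean` helper is hypothetical and is not part of the library; in practice the same idea would be applied to tensors with the mask that accompanies the padded layout.

```python
def masked_mean(padded, mask):
    """Average each row over the positions where the mask is True."""
    means = []
    for row, row_mask in zip(padded, mask):
        kept = [value for value, keep in zip(row, row_mask) if keep]
        # Rows with no valid position get 0.0 rather than dividing by zero.
        means.append(sum(kept) / len(kept) if kept else 0.0)
    return means


# Padded ("samples", "words") layout from the example, with its mask.
padded = [[1, 2, 3], [4, 3, 0]]
mask = [[True, True, True], [True, True, False]]

means = masked_mean(padded, mask)
print(means)  # [2.0, 3.5]
```

Without the mask, the padding value in the second row would be counted and the mean would come out as 7 / 3 instead of 3.5.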
