
Fast and Customizable Tokenizers



Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

These are Python bindings over the Rust implementation. If you are interested in the high-level design, you can check it there.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignment tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
  • Does all the pre-processing: truncate, pad, and add the special tokens your model needs.
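
The alignment tracking and pre-processing features listed above can be sketched with a tiny in-memory tokenizer. The toy vocabulary, the sample text, and the `[PAD]` id below are assumptions chosen for illustration, not values taken from the project.

```python
from tokenizers import Tokenizer, models, normalizers, pre_tokenizers

# Hypothetical toy vocabulary, just large enough for this example.
vocab = {"[UNK]": 0, "[PAD]": 1, "hello": 2, "world": 3}
tokenizer = Tokenizer(models.WordLevel(vocab, unk_token="[UNK]"))
tokenizer.normalizer = normalizers.Lowercase()
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

# Alignment tracking: offsets index into the original, un-normalized text.
text = "Hello WORLD"
encoded = tokenizer.encode(text)
spans = [text[start:end] for start, end in encoded.offsets]
print(encoded.tokens)  # tokens after lowercasing
print(spans)           # matching slices of the original text

# Pre-processing: truncate long inputs and pad short ones to 4 tokens.
tokenizer.enable_truncation(max_length=4)
tokenizer.enable_padding(pad_id=1, pad_token="[PAD]", length=4)
short_enc = tokenizer.encode("hello")
long_enc = tokenizer.encode("hello world hello world hello world")
print(short_enc.tokens, short_enc.attention_mask)
print(long_enc.tokens)
```

Padding positions are marked with 0 in `attention_mask`, so a model can ignore them.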

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile with the following commands:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use yours as well)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install setuptools_rust
python setup.py install

Load a pretrained tokenizer from the Hub

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("bert-base-cased")

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using some vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)
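
To try this loading path without pre-existing files, the sketch below first trains a tiny tokenizer on an in-memory corpus, writes `vocab.json` and `merges.txt` with `save_model`, and reads them back with `CharBPETokenizer.from_file`. The corpus, the vocabulary size, and the temporary directory are assumptions for illustration only.

```python
import os
import tempfile

from tokenizers import CharBPETokenizer

# Hypothetical toy corpus; a real vocabulary needs far more text.
corpus = [
    "I can feel the magic, can you?",
    "You can feel the magic too.",
]

trained = CharBPETokenizer()
trained.train_from_iterator(
    corpus, vocab_size=200, min_frequency=1, show_progress=False
)

text = "I can feel the magic"
with tempfile.TemporaryDirectory() as tmp:
    # Writes vocab.json and merges.txt into the directory.
    trained.save_model(tmp)
    vocab = os.path.join(tmp, "vocab.json")
    merges = os.path.join(tmp, "merges.txt")
    # Read both files back into a fresh tokenizer.
    loaded = CharBPETokenizer.from_file(vocab, merges)
    same_ids = loaded.encode(text).ids == trained.encode(text).ids

print(same_ids)
```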

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train([ "./path/to/files/1.txt", "./path/to/files/2.txt" ])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")
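
When the training text is already in memory, `train_from_iterator` can stand in for `train`. The following is a minimal sketch with a made-up corpus and a temporary save path (both assumptions); it also checks that the saved JSON file reloads through `Tokenizer.from_file` into a tokenizer that produces the same ids.

```python
import os
import tempfile

from tokenizers import CharBPETokenizer, Tokenizer

# Hypothetical in-memory corpus used instead of text files.
corpus = [
    "I can feel the magic, can you?",
    "The magic is in the tokens.",
    "Can you feel it too?",
]

tokenizer = CharBPETokenizer()
tokenizer.train_from_iterator(
    corpus, vocab_size=200, min_frequency=1, show_progress=False
)
encoded = tokenizer.encode("I can feel the magic")

with tempfile.TemporaryDirectory() as tmp:
    # Save the full pipeline (normalizer, model, decoder) to one JSON file.
    path = os.path.join(tmp, "my-bpe.tokenizer.json")
    tokenizer.save(path)
    reloaded = Tokenizer.from_file(path)

roundtrip = reloaded.encode("I can feel the magic")
print(encoded.tokens)
print(roundtrip.ids == encoded.ids)
```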

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
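
As a rough illustration of that shared interface, the sketch below trains each of the four classes on the same toy corpus and prints how each one splits a sample sentence. The corpus, the sentence, and `vocab_size=300` are assumptions picked so the example runs in memory.

```python
from tokenizers import (
    BertWordPieceTokenizer,
    ByteLevelBPETokenizer,
    CharBPETokenizer,
    SentencePieceBPETokenizer,
)

# Hypothetical toy corpus; far too small for a useful vocabulary.
corpus = [
    "I can feel the magic, can you?",
    "The magic of tokens is in the merges.",
]

results = {}
for cls in (
    CharBPETokenizer,
    ByteLevelBPETokenizer,
    SentencePieceBPETokenizer,
    BertWordPieceTokenizer,
):
    tok = cls()
    # These keyword arguments are accepted by all four implementations.
    tok.train_from_iterator(
        corpus, vocab_size=300, min_frequency=1, show_progress=False
    )
    results[cls.__name__] = tok.encode("the magic of tokens").tokens

for name, tokens in results.items():
    print(name, len(tokens))
```

The token counts differ because each implementation uses its own normalization, pre-tokenization, and subword markers.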

Build your own

When the provided tokenizers don't give you enough freedom, you can build your own by putting together the parts you need. You can check how we implemented the provided tokenizers and adapt them to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(
    vocab_size=20000,
    min_frequency=2,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet()
)
tokenizer.train([
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt"
], trainer=trainer)

# And save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)

Now, when you want to use this tokenizer, this is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")
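
One useful property of this setup is that the initial alphabet covers every byte value, so text that never appeared in the training data can still be encoded and decoded back. The sketch below checks that with the same components, trained on a made-up in-memory corpus with a small `vocab_size` (both assumptions).

```python
from tokenizers import (
    Tokenizer,
    decoders,
    models,
    pre_tokenizers,
    processors,
    trainers,
)

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

trainer = trainers.BpeTrainer(
    vocab_size=300,  # 256 byte symbols plus a few merges
    min_frequency=1,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet(),
    show_progress=False,
)
# Hypothetical toy corpus, ASCII only.
corpus = ["I can feel the magic, can you?"] * 10
tokenizer.train_from_iterator(corpus, trainer=trainer)

# Accented letter and emoji that are absent from the corpus.
text = "h\u00e9llo \U0001F642 magic"
encoded = tokenizer.encode(text)
decoded = tokenizer.decode(encoded.ids)

print(len(encoded.ids))
# strip() drops any leading space introduced by add_prefix_space=True.
print(decoded.strip() == text)
```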



