Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

These are Python bindings over the Rust implementation. If you are interested in the high-level design, you can find it in the Rust implementation's documentation.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignments tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
  • Does all the pre-processing: Truncate, Pad, add the special tokens your model needs.
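As a rough illustration of the alignment tracking and pre-processing features listed above, here is a minimal sketch. The toy vocabulary, the example sentence, and the chosen lengths are assumptions made up for this example; a real model would come with its own vocabulary and special tokens.

```python
from tokenizers import Tokenizer, models, normalizers, pre_tokenizers

# Hypothetical toy vocabulary, chosen only for this illustration.
vocab = {"[UNK]": 0, "[PAD]": 1, "i": 2, "can": 3, "feel": 4, "the": 5, "magic": 6}
tokenizer = Tokenizer(models.WordLevel(vocab, unk_token="[UNK]"))
tokenizer.normalizer = normalizers.Lowercase()
tokenizer.pre_tokenizer = pre_tokenizers.Whitespace()

text = "I can FEEL the magic"
encoded = tokenizer.encode(text)

# Offsets index into the original text, not the lowercased one,
# so each token can be mapped back to the exact characters it came from.
spans = [text[start:end] for start, end in encoded.offsets]
print(encoded.tokens)
print(spans)

# Truncate to at most 4 tokens, then pad every encoding to length 6.
tokenizer.enable_truncation(max_length=4)
tokenizer.enable_padding(pad_id=1, pad_token="[PAD]", length=6)
padded = tokenizer.encode(text)
print(padded.ids)
print(padded.attention_mask)
```

The tokens are lowercased by the normalizer, while the recovered spans keep the original casing. The attention mask marks the padded positions with 0.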

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile the bindings as follows:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use yours as well)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install -e .
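
Whichever installation method you used, a quick sanity check can confirm that the package imports correctly. This is only a sketch: it prints the installed version and builds an empty, untrained BPE tokenizer to exercise the compiled extension.

```python
import tokenizers
from tokenizers import Tokenizer, models

# The version string of the installed package.
version = tokenizers.__version__
print(version)

# Building an untrained tokenizer exercises the Rust extension module.
empty = Tokenizer(models.BPE())
vocab_size = empty.get_vocab_size()
print(vocab_size)
```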

Load a pretrained tokenizer from the Hub

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("bert-base-cased")

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using some vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train([ "./path/to/files/1.txt", "./path/to/files/2.txt" ])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")
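
If you do not have text files at hand, the same flow can be sketched with an in-memory corpus. The corpus, the small vocab_size, and the temporary file location below are assumptions chosen only to keep the example self-contained; train_from_iterator is used here in place of train.

```python
import os
import tempfile

from tokenizers import CharBPETokenizer, Tokenizer

# Hypothetical tiny corpus, repeated so that pairs are frequent enough to merge.
corpus = ["I can feel the magic, can you?", "Can you feel the magic?"] * 10

tokenizer = CharBPETokenizer()
tokenizer.train_from_iterator(corpus, vocab_size=200, show_progress=False)

text = "I can feel the magic, can you?"
encoded = tokenizer.encode(text)
print(encoded.tokens)

# Save to a single JSON file, then reload it as a generic Tokenizer.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "my-bpe.tokenizer.json")
    tokenizer.save(path)
    reloaded = Tokenizer.from_file(path)

# The reloaded tokenizer should produce the same ids as the trained one.
reloaded_ids = reloaded.encode(text).ids
print(reloaded_ids == encoded.ids)
```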

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
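
Because the four classes share the same training and encoding interface, they can be compared side by side. The following is a sketch under assumptions: the corpus and vocab_size are made up, and training happens in memory with train_from_iterator rather than from files.

```python
from tokenizers import (
    BertWordPieceTokenizer,
    ByteLevelBPETokenizer,
    CharBPETokenizer,
    SentencePieceBPETokenizer,
)

# Hypothetical tiny corpus, repeated so that merges can be learned.
corpus = ["I can feel the magic, can you?", "Can you feel the magic?"] * 10
text = "I can feel the magic, can you?"

results = {}
for cls in (
    CharBPETokenizer,
    ByteLevelBPETokenizer,
    SentencePieceBPETokenizer,
    BertWordPieceTokenizer,
):
    tok = cls()
    tok.train_from_iterator(corpus, vocab_size=300, show_progress=False)
    results[cls.__name__] = tok.encode(text).tokens
    print(cls.__name__, results[cls.__name__])
```

Each tokenizer splits the same sentence differently, which makes the effect of the underlying model and pre-tokenization easy to inspect.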

Build your own

Whenever these provided tokenizers don't give you enough freedom, you can build your own tokenizer, by putting all the different parts you need together. You can check how we implemented the provided tokenizers and adapt them easily to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(
    vocab_size=20000,
    min_frequency=2,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet()
)
tokenizer.train([
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt"
], trainer=trainer)

# And save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)

Now, when you want to use this tokenizer, this is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")
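
For experimentation without any files on disk, the same byte-level BPE pipeline can be sketched entirely in memory. The corpus and the small vocab_size are assumptions for illustration; the tokenizer is serialized to a JSON string with to_str and restored with from_str instead of going through a file.

```python
from tokenizers import Tokenizer, decoders, models, pre_tokenizers, processors, trainers

# Same pipeline as the file-based example.
tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

trainer = trainers.BpeTrainer(
    vocab_size=300,
    show_progress=False,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet(),
)

# Hypothetical tiny corpus, repeated so that merges can be learned.
corpus = ["I can feel the magic, can you?", "Can you feel the magic?"] * 10
tokenizer.train_from_iterator(corpus, trainer=trainer)

# Round-trip the whole tokenizer through a JSON string.
serialized = tokenizer.to_str()
reloaded = Tokenizer.from_str(serialized)

text = "I can feel the magic, can you?"
ids = reloaded.encode(text).ids
decoded = reloaded.decode(ids)
print(ids)
print(decoded)
```

Since add_prefix_space=True prepends a space before encoding, the decoded string may start with a space; stripping it gives back the input text.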

Download files

Source Distributions

No source distribution files available for this release.

Built Distributions

tokenizers-0.20.0rc1-cp312-none-win_amd64.whl (2.3 MB view details)

Uploaded CPython 3.12 Windows x86-64

tokenizers-0.20.0rc1-cp312-none-win32.whl (2.1 MB view details)

Uploaded CPython 3.12 Windows x86

tokenizers-0.20.0rc1-cp312-cp312-musllinux_1_1_x86_64.whl (9.3 MB view details)

Uploaded CPython 3.12 musllinux: musl 1.1+ x86-64

tokenizers-0.20.0rc1-cp312-cp312-musllinux_1_1_aarch64.whl (9.0 MB view details)

Uploaded CPython 3.12 musllinux: musl 1.1+ ARM64

tokenizers-0.20.0rc1-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.9 MB view details)

Uploaded CPython 3.12 manylinux: glibc 2.17+ x86-64

tokenizers-0.20.0rc1-cp312-cp312-manylinux_2_17_s390x.manylinux2014_s390x.whl (3.3 MB view details)

Uploaded CPython 3.12 manylinux: glibc 2.17+ s390x

tokenizers-0.20.0rc1-cp312-cp312-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (3.0 MB view details)

Uploaded CPython 3.12 manylinux: glibc 2.17+ ppc64le

tokenizers-0.20.0rc1-cp312-cp312-manylinux_2_17_i686.manylinux2014_i686.whl (3.0 MB view details)

Uploaded CPython 3.12 manylinux: glibc 2.17+ i686

tokenizers-0.20.0rc1-cp312-cp312-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (2.8 MB view details)

Uploaded CPython 3.12 manylinux: glibc 2.17+ ARMv7l

tokenizers-0.20.0rc1-cp312-cp312-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.9 MB view details)

Uploaded CPython 3.12 manylinux: glibc 2.17+ ARM64

tokenizers-0.20.0rc1-cp312-cp312-macosx_11_0_arm64.whl (2.5 MB view details)

Uploaded CPython 3.12 macOS 11.0+ ARM64

tokenizers-0.20.0rc1-cp312-cp312-macosx_10_12_x86_64.whl (2.6 MB view details)

Uploaded CPython 3.12 macOS 10.12+ x86-64

tokenizers-0.20.0rc1-cp311-none-win_amd64.whl (2.3 MB view details)

Uploaded CPython 3.11 Windows x86-64

tokenizers-0.20.0rc1-cp311-none-win32.whl (2.1 MB view details)

Uploaded CPython 3.11 Windows x86

tokenizers-0.20.0rc1-cp311-cp311-musllinux_1_1_x86_64.whl (9.3 MB view details)

Uploaded CPython 3.11 musllinux: musl 1.1+ x86-64

tokenizers-0.20.0rc1-cp311-cp311-musllinux_1_1_aarch64.whl (9.0 MB view details)

Uploaded CPython 3.11 musllinux: musl 1.1+ ARM64

tokenizers-0.20.0rc1-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.9 MB view details)

Uploaded CPython 3.11 manylinux: glibc 2.17+ x86-64

tokenizers-0.20.0rc1-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl (3.3 MB view details)

Uploaded CPython 3.11 manylinux: glibc 2.17+ s390x

tokenizers-0.20.0rc1-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (3.0 MB view details)

Uploaded CPython 3.11 manylinux: glibc 2.17+ ppc64le

tokenizers-0.20.0rc1-cp311-cp311-manylinux_2_17_i686.manylinux2014_i686.whl (3.0 MB view details)

Uploaded CPython 3.11 manylinux: glibc 2.17+ i686

tokenizers-0.20.0rc1-cp311-cp311-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (2.8 MB view details)

Uploaded CPython 3.11 manylinux: glibc 2.17+ ARMv7l

tokenizers-0.20.0rc1-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.9 MB view details)

Uploaded CPython 3.11 manylinux: glibc 2.17+ ARM64

tokenizers-0.20.0rc1-cp311-cp311-macosx_11_0_arm64.whl (2.5 MB view details)

Uploaded CPython 3.11 macOS 11.0+ ARM64

tokenizers-0.20.0rc1-cp311-cp311-macosx_10_12_x86_64.whl (2.6 MB view details)

Uploaded CPython 3.11 macOS 10.12+ x86-64

tokenizers-0.20.0rc1-cp310-none-win_amd64.whl (2.3 MB view details)

Uploaded CPython 3.10 Windows x86-64

tokenizers-0.20.0rc1-cp310-none-win32.whl (2.1 MB view details)

Uploaded CPython 3.10 Windows x86

tokenizers-0.20.0rc1-cp310-cp310-musllinux_1_1_x86_64.whl (9.3 MB view details)

Uploaded CPython 3.10 musllinux: musl 1.1+ x86-64

tokenizers-0.20.0rc1-cp310-cp310-musllinux_1_1_aarch64.whl (9.0 MB view details)

Uploaded CPython 3.10 musllinux: musl 1.1+ ARM64

tokenizers-0.20.0rc1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.9 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ x86-64

tokenizers-0.20.0rc1-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl (3.3 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ s390x

tokenizers-0.20.0rc1-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (3.0 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ ppc64le

tokenizers-0.20.0rc1-cp310-cp310-manylinux_2_17_i686.manylinux2014_i686.whl (3.0 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ i686

tokenizers-0.20.0rc1-cp310-cp310-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (2.8 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ ARMv7l

tokenizers-0.20.0rc1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.9 MB view details)

Uploaded CPython 3.10 manylinux: glibc 2.17+ ARM64

tokenizers-0.20.0rc1-cp310-cp310-macosx_11_0_arm64.whl (2.5 MB view details)

Uploaded CPython 3.10 macOS 11.0+ ARM64

tokenizers-0.20.0rc1-cp310-cp310-macosx_10_12_x86_64.whl (2.6 MB view details)

Uploaded CPython 3.10 macOS 10.12+ x86-64

tokenizers-0.20.0rc1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.9 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ x86-64

tokenizers-0.20.0rc1-cp37-cp37m-manylinux_2_17_s390x.manylinux2014_s390x.whl (3.3 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ s390x

tokenizers-0.20.0rc1-cp37-cp37m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (3.0 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ ppc64le

tokenizers-0.20.0rc1-cp37-cp37m-manylinux_2_17_i686.manylinux2014_i686.whl (3.0 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ i686

tokenizers-0.20.0rc1-cp37-cp37m-manylinux_2_17_armv7l.manylinux2014_armv7l.whl (2.8 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ ARMv7l

tokenizers-0.20.0rc1-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (2.9 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ ARM64

tokenizers-0.20.0rc1-cp37-cp37m-macosx_11_0_arm64.whl (2.5 MB view details)

Uploaded CPython 3.7m macOS 11.0+ ARM64

tokenizers-0.20.0rc1-cp37-cp37m-macosx_10_12_x86_64.whl (2.6 MB view details)

Uploaded CPython 3.7m macOS 10.12+ x86-64

