
Fast and Customizable Tokenizers

Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

These are Python bindings over the Rust implementation. If you are interested in the high-level design, you can check it there.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignment tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
  • Does all the pre-processing: truncate, pad, and add the special tokens your model needs.
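
To make the alignment-tracking feature more concrete, here is a minimal pure-Python sketch of the idea. It is not the library's implementation: the function names `normalize_with_alignment` and `tokens_with_offsets` are hypothetical, and the normalization (lowercasing and accent stripping) and whitespace splitting are assumptions chosen for illustration. Each normalized character remembers the index of the original character it came from, so every token can be mapped back to a span of the original text.

```python
import unicodedata


def normalize_with_alignment(text):
    """Lowercase and strip accents, recording for each output character
    the index of the original character it came from."""
    out_chars, origin = [], []
    for i, ch in enumerate(text):
        for piece in unicodedata.normalize("NFD", ch):
            if unicodedata.combining(piece):
                continue  # drop accent marks
            for low in piece.lower():
                out_chars.append(low)
                origin.append(i)
    return "".join(out_chars), origin


def tokens_with_offsets(text):
    """Split the normalized text on whitespace and return
    (token, (start, end)) pairs, with offsets into the ORIGINAL text."""
    normalized, origin = normalize_with_alignment(text)
    results, start = [], None
    # A trailing space acts as a sentinel that closes the last token.
    for pos, ch in enumerate(normalized + " "):
        if ch.isspace():
            if start is not None:
                span = (origin[start], origin[pos - 1] + 1)
                results.append((normalized[start:pos], span))
                start = None
        elif start is None:
            start = pos
    return results


text = "H\u00e9llo  W\u00f6rld"
for token, (begin, end) in tokens_with_offsets(text):
    # The token is normalized, but the offsets recover the original slice.
    print(token, (begin, end), text[begin:end])
```

In the library itself, the comparable information is exposed on the encoding object returned by `encode` (for example through its `offsets` attribute).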

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile the bindings with the following:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use yours as well)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install setuptools_rust
python setup.py install

Load a pretrained tokenizer from the Hub

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("bert-base-cased")

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using some vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train([ "./path/to/files/1.txt", "./path/to/files/2.txt" ])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
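
For readers unfamiliar with what BPE training produces, the following is a minimal pure-Python sketch of the classic merge-learning loop. It is only an illustration under simplifying assumptions (words are pre-split, ties are broken lexicographically, no end-of-word marker is used) and is not the library's Rust implementation; the function name `learn_bpe_merges` is hypothetical.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of words."""
    # Each word starts as a tuple of single-character symbols.
    corpus = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # nothing left to merge
        # Most frequent pair; sorting first makes tie-breaking deterministic.
        best = max(sorted(pair_counts), key=pair_counts.get)
        merges.append(best)
        # Rewrite the corpus with the chosen pair fused into one symbol.
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_corpus[tuple(merged)] += freq
        corpus = new_corpus
    return merges


print(learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=2))
```

The ordered list of merges plays the same role as the merges.txt file mentioned earlier: applying the merges in order to a new word yields its subword tokens.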

Build your own

Whenever these provided tokenizers don't give you enough freedom, you can build your own tokenizer by putting together all the different parts you need. You can check how we implemented the provided tokenizers and adapt them easily to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(
    vocab_size=20000,
    min_frequency=2,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet()
)
tokenizer.train([
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt"
], trainer=trainer)

# And save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)

Now, when you want to use this tokenizer, it is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")
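
As background on why a byte-level tokenizer starts from a fixed initial alphabet, here is a small pure-Python sketch. It only illustrates the general idea that any text can be expressed with 256 byte symbols, so no input character is ever out of vocabulary. The helper names are hypothetical, and this does not reproduce how the library represents bytes internally.

```python
def to_byte_symbols(text):
    """Represent text as a list of byte values (0-255) via UTF-8."""
    return list(text.encode("utf-8"))


def from_byte_symbols(symbols):
    """Invert to_byte_symbols: rebuild the text from its byte values."""
    return bytes(symbols).decode("utf-8")


symbols = to_byte_symbols("caf\u00e9")
print(symbols)  # [99, 97, 102, 195, 169]: the accented letter takes two bytes
print(from_byte_symbols(symbols) == "caf\u00e9")  # True
```

A BPE model trained on top of such symbols then merges frequent byte sequences into larger tokens, as in the training example above.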

Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

tokenizers-0.11.0.tar.gz (216.3 kB view details)

Uploaded Source

Built Distributions

tokenizers-0.11.0-cp39-cp39-win_amd64.whl (3.3 MB view details)

Uploaded CPython 3.9 Windows x86-64

tokenizers-0.11.0-cp39-cp39-win32.whl (3.0 MB view details)

Uploaded CPython 3.9 Windows x86

tokenizers-0.11.0-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl (7.7 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.17+ s390x

tokenizers-0.11.0-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (8.0 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.17+ ppc64le

tokenizers-0.11.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (7.0 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.17+ ARM64

tokenizers-0.11.0-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.8 MB view details)

Uploaded CPython 3.9 manylinux: glibc 2.12+ x86-64

tokenizers-0.11.0-cp39-cp39-macosx_10_11_x86_64.whl (3.7 MB view details)

Uploaded CPython 3.9 macOS 10.11+ x86-64

tokenizers-0.11.0-cp38-cp38-win_amd64.whl (3.3 MB view details)

Uploaded CPython 3.8 Windows x86-64

tokenizers-0.11.0-cp38-cp38-win32.whl (3.0 MB view details)

Uploaded CPython 3.8 Windows x86

tokenizers-0.11.0-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl (7.7 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.17+ s390x

tokenizers-0.11.0-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (8.0 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.17+ ppc64le

tokenizers-0.11.0-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (7.0 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.17+ ARM64

tokenizers-0.11.0-cp38-cp38-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.8 MB view details)

Uploaded CPython 3.8 manylinux: glibc 2.12+ x86-64

tokenizers-0.11.0-cp38-cp38-macosx_10_11_x86_64.whl (3.7 MB view details)

Uploaded CPython 3.8 macOS 10.11+ x86-64

tokenizers-0.11.0-cp37-cp37m-win_amd64.whl (3.3 MB view details)

Uploaded CPython 3.7m Windows x86-64

tokenizers-0.11.0-cp37-cp37m-win32.whl (3.0 MB view details)

Uploaded CPython 3.7m Windows x86

tokenizers-0.11.0-cp37-cp37m-manylinux_2_17_s390x.manylinux2014_s390x.whl (7.7 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ s390x

tokenizers-0.11.0-cp37-cp37m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (8.0 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ ppc64le

tokenizers-0.11.0-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (7.0 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.17+ ARM64

tokenizers-0.11.0-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.8 MB view details)

Uploaded CPython 3.7m manylinux: glibc 2.12+ x86-64

tokenizers-0.11.0-cp37-cp37m-macosx_10_11_x86_64.whl (3.7 MB view details)

Uploaded CPython 3.7m macOS 10.11+ x86-64

tokenizers-0.11.0-cp36-cp36m-manylinux_2_17_s390x.manylinux2014_s390x.whl (7.7 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.17+ s390x

tokenizers-0.11.0-cp36-cp36m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (8.0 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.17+ ppc64le

tokenizers-0.11.0-cp36-cp36m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (7.0 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.17+ ARM64

tokenizers-0.11.0-cp36-cp36m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.8 MB view details)

Uploaded CPython 3.6m manylinux: glibc 2.12+ x86-64
