Fast and Customizable Tokenizers

Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

Python bindings over the Rust implementation. If you are interested in the high-level design, you can find it described there.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignment tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
  • Does all the pre-processing: Truncate, Pad, add the special tokens your model needs.
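
To make the alignment-tracking feature concrete, here is a minimal pure-Python sketch of the idea. It is not the library's implementation: the helper names `normalize_with_alignment` and `whitespace_tokens_with_offsets` are hypothetical, and the normalization (accent stripping plus lowercasing) is just one assumed example. Each normalized character remembers the index of the original character it came from, so every token can be mapped back to a span of the original text.

```python
import unicodedata


def normalize_with_alignment(text):
    """Lowercase and strip accents, keeping a map back to the original text.

    Returns (normalized, alignment), where alignment[i] is the index in
    `text` of the character that produced normalized[i].
    """
    out_chars = []
    alignment = []
    for i, ch in enumerate(text):
        # NFD splits an accented letter into base letter + combining marks.
        for piece in unicodedata.normalize("NFD", ch):
            if unicodedata.combining(piece):
                continue  # drop the accent itself
            for low in piece.lower():
                out_chars.append(low)
                alignment.append(i)
    return "".join(out_chars), alignment


def whitespace_tokens_with_offsets(text):
    """Split the normalized text on whitespace; report offsets into `text`."""
    normalized, alignment = normalize_with_alignment(text)
    tokens = []
    start = None
    # The trailing space flushes the last token.
    for j, ch in enumerate(normalized + " "):
        if ch.isspace():
            if start is not None:
                span = (alignment[start], alignment[j - 1] + 1)
                tokens.append((normalized[start:j], span))
                start = None
        elif start is None:
            start = j
    return tokens


text = "H\u00e9llo WORLD"
tokens = whitespace_tokens_with_offsets(text)
for token, (begin, end) in tokens:
    # Each normalized token points back at the untouched original slice.
    print(token, (begin, end), repr(text[begin:end]))
```

In the library itself, the corresponding information is exposed on the encoding object (for example `encoded.offsets`), so a hand-rolled map like this is only needed for understanding the concept.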

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile the bindings as follows:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use yours as well)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install setuptools_rust
python setup.py install

Load a pretrained tokenizer from the Hub

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("bert-base-cased")

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using some vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train([ "./path/to/files/1.txt", "./path/to/files/2.txt" ])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")
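
For intuition about what training a BPE tokenizer does, the following toy sketch learns merge rules from a tiny word list in pure Python. It is an illustration under stated assumptions, not the library's Rust trainer: the function name `learn_bpe_merges` is hypothetical, there is no pre-tokenization or end-of-word marker, and ties between equally frequent pairs are broken lexicographically just to keep the result deterministic.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` BPE merge rules from a list of words."""
    # Every word starts out as a sequence of single characters.
    corpus = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # nothing left to merge
        # Pick the most frequent pair (assumed tie-break: largest pair).
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        # Rewrite every word, fusing occurrences of the chosen pair.
        merged_corpus = Counter()
        for symbols, freq in corpus.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged_corpus[tuple(out)] += freq
        corpus = merged_corpus
        merges.append(best)
    return merges


words = ["low"] * 5 + ["lower"] * 2 + ["newest"] * 6 + ["widest"] * 3
merges = learn_bpe_merges(words, num_merges=3)
print(merges)
```

The learned merge list plays the same role as the merges.txt file mentioned above: applying the merges in order to new text reproduces the segmentation.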

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
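
WordPiece differs from BPE mainly in how a word is split at encoding time. A common description is greedy longest-match-first lookup against the vocabulary, with a continuation prefix on non-initial pieces. The sketch below illustrates that idea in pure Python. The function name, the toy vocabulary, and the `##` prefix and `[UNK]` token defaults are assumptions for the example and are not taken from the library's code.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", prefix="##"):
    """Greedy longest-match-first split of a single word."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = prefix + candidate  # mark a continuation piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


toy_vocab = {"un", "##aff", "##able", "##a", "[UNK]"}
known = wordpiece_tokenize("unaffable", toy_vocab)
unknown = wordpiece_tokenize("xyz", toy_vocab)
print(known, unknown)
```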

Build your own

Whenever these provided tokenizers don't give you enough freedom, you can build your own tokenizer, by putting all the different parts you need together. You can check how we implemented the provided tokenizers and adapt them easily to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(
    vocab_size=20000,
    min_frequency=2,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet()
)
tokenizer.train([
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt"
], trainer=trainer)

# And save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)

Now, when you want to use this tokenizer, this is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")
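
The training example above seeds the trainer with `initial_alphabet=pre_tokenizers.ByteLevel.alphabet()`. The idea behind a byte-level alphabet is that each of the 256 possible byte values gets its own printable character, so any UTF-8 input can be represented without an unknown token. The sketch below reproduces the byte-to-character mapping popularized by GPT-2's byte-level BPE in pure Python. Whether the library's alphabet consists of exactly these characters is an assumption here, and the helper names are hypothetical.

```python
def byte_level_alphabet():
    """Map each byte value (0-255) to a distinct printable character."""
    # Bytes that are already printable, non-space characters keep their
    # own code point: 0x21-0x7E, 0xA1-0xAC and 0xAE-0xFF.
    printable = (
        list(range(0x21, 0x7F))
        + list(range(0xA1, 0xAD))
        + list(range(0xAE, 0x100))
    )
    mapping = {b: chr(b) for b in printable}
    # Everything else (control bytes, space, ...) is shifted to code
    # points starting at 256 so that it becomes visible too.
    shift = 0
    for b in range(256):
        if b not in mapping:
            mapping[b] = chr(256 + shift)
            shift += 1
    return mapping


def to_byte_level(text):
    """Encode text as UTF-8 and render every byte with the alphabet."""
    table = byte_level_alphabet()
    return "".join(table[b] for b in text.encode("utf-8"))


alphabet = byte_level_alphabet()
visible = to_byte_level("hi there")
print(len(alphabet), visible)
```

Under this mapping the space byte becomes the character U+0120, which is why byte-level vocabularies typically show word-initial tokens with that marker in front.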
