
Fast and Customizable Tokenizers


Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

These are Python bindings over the Rust implementation. If you are interested in the high-level design, you can check it there.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignment tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
  • Does all the pre-processing: truncate, pad, and add the special tokens your model needs.
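
As a small illustration of the last two points, the sketch below builds a tokenizer from a tiny hand-written vocabulary (the vocabulary and the sample sentence are made up for this example, not part of the library) and shows how offsets map lowercased tokens back to the original text, and how truncation and padding can be enabled:

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.normalizers import Lowercase
from tokenizers.pre_tokenizers import Whitespace

# Tiny hand-written vocabulary, purely for illustration.
vocab = {"[PAD]": 0, "[UNK]": 1, "hello": 2, "world": 3, "!": 4}
tokenizer = Tokenizer(WordLevel(vocab, unk_token="[UNK]"))
tokenizer.normalizer = Lowercase()
tokenizer.pre_tokenizer = Whitespace()

text = "Hello   WORLD!"
encoded = tokenizer.encode(text)

# Offsets index into the original string, not the normalized one,
# so slicing `text` recovers the un-lowercased source of each token.
spans = [text[start:end] for start, end in encoded.offsets]
print(encoded.tokens)
print(spans)

# Truncate to 2 tokens, then pad every encoding to a fixed length of 4.
tokenizer.enable_truncation(max_length=2)
tokenizer.enable_padding(pad_id=0, pad_token="[PAD]", length=4)
padded = tokenizer.encode(text)
print(padded.ids, padded.attention_mask)
```

The attention mask marks padding positions with 0, so a model can ignore them.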

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile the bindings as follows:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use yours as well)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install setuptools_rust
python setup.py install
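
Whichever method you used, a quick way to check that the package is importable is to print its version from Python (the exact version string depends on what you installed):

```python
import tokenizers

# Prints the installed version string of the `tokenizers` package.
print(tokenizers.__version__)
```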

Load a pretrained tokenizer from the Hub

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("bert-base-cased")

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using some vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train([ "./path/to/files/1.txt", "./path/to/files/2.txt" ])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
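
For intuition about what the BPE variants learn during training, here is a toy pure-Python sketch of the merge-learning loop. It is not the library's Rust implementation: the function name `learn_bpe_merges` and the tie-breaking rule are assumptions made for illustration, and it ignores details such as end-of-word markers and byte-level alphabets.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy sketch)."""
    # Each distinct word becomes a tuple of symbols, weighted by its frequency.
    corpus = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pair_counts = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break
        # Most frequent pair wins; ties go to the lexicographically largest
        # pair (an arbitrary choice that keeps the result deterministic).
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        # Rewrite every word, fusing each occurrence of the chosen pair.
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_corpus[tuple(out)] += freq
        corpus = new_corpus
        merges.append(best)
    return merges


merges = learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=3)
print(merges)  # [('o', 'w'), ('l', 'ow'), ('low', 'e')]
```

Each learned rule fuses two existing symbols into a new vocabulary entry, which is why frequent words end up as single tokens while rare words are split into smaller pieces.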

Build your own

Whenever these provided tokenizers don't give you enough freedom, you can build your own tokenizer, by putting all the different parts you need together. You can check how we implemented the provided tokenizers and adapt them easily to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(
    vocab_size=20000,
    min_frequency=2,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet()
)
tokenizer.train([
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt"
], trainer=trainer)

# And save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)
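
If your training data already lives in memory rather than in files, a similar tokenizer can be trained with `train_from_iterator`. The sketch below is a variation on the example above, with a made-up three-sentence corpus and a small `vocab_size` chosen only to keep it fast; it also sets `add_prefix_space=False` so that decoding returns the input text unchanged.

```python
from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers

# A made-up in-memory corpus, just for illustration.
corpus = [
    "I can feel the magic, can you?",
    "The magic is in the tokens.",
    "Can you feel it too?",
]

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=False)
tokenizer.decoder = decoders.ByteLevel()

trainer = trainers.BpeTrainer(
    vocab_size=500,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet(),
    show_progress=False,
)
# Train directly from any iterator of strings instead of file paths.
tokenizer.train_from_iterator(corpus, trainer=trainer)

# Encode several inputs at once, then round-trip one sentence through decode.
batch = tokenizer.encode_batch(corpus)
text = "I can feel the magic, can you?"
decoded = tokenizer.decode(tokenizer.encode(text).ids)
print(len(batch), decoded)
```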

Now, using this tokenizer is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")

