Fast and Customizable Tokenizers

Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

Python bindings over the Rust implementation. If you are interested in the high-level design, you can find it described there.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignment tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
  • Handles all the pre-processing: truncation, padding, and adding the special tokens your model needs.
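
To make the alignment-tracking feature above concrete, here is a minimal pure-Python sketch of the idea. It is not the library's implementation: the function name `tokenize_with_offsets` and the regular expression are illustrative assumptions. Each token is lowercased (a simple normalization step) while its `(start, end)` span still points into the original, unnormalized sentence. In the library itself, the comparable information is exposed on the encoding as `encoded.offsets`.

```python
import re


def tokenize_with_offsets(text):
    """Split `text` into lowercased tokens, keeping (start, end) spans into `text`.

    Hypothetical helper for illustration only; it is not part of `tokenizers`.
    """
    results = []
    # Words are runs of word characters; any other non-space character is its own token.
    for match in re.finditer(r"\w+|[^\w\s]", text):
        normalized = match.group().lower()  # normalization changes the token text...
        results.append((normalized, match.span()))  # ...but the span refers to the original
    return results


sentence = "I can feel the MAGIC, can you?"
tokens = tokenize_with_offsets(sentence)

# Recover the original text behind the normalized token "magic".
token, (start, end) = tokens[4]
original_slice = sentence[start:end]
```

Here `token` is the normalized form while `original_slice` is the exact substring of the input, which is what makes it possible to map model outputs back onto the source text.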

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile the bindings as follows:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use yours as well)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install setuptools_rust
python setup.py install

Load a pretrained tokenizer from the Hub

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("bert-base-cased")

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using some vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train([ "./path/to/files/1.txt", "./path/to/files/2.txt" ])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
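
For intuition about what training a BPE vocabulary involves, the following is a toy pure-Python sketch of the merge-learning loop. It is a simplified illustration, not the library's Rust implementation: the function name `learn_bpe_merges` is hypothetical, and the `min_frequency` parameter only mirrors the general idea of stopping once the most frequent pair is too rare.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges, min_frequency=1):
    """Learn up to `num_merges` BPE merges from a list of words (toy version)."""
    # Start with every word split into single characters, counted by frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs across the corpus.
        pair_counts = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += freq
        if not pair_counts:
            break  # every word is already a single symbol
        # Pick the most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pair_counts, key=lambda p: (pair_counts[p], p))
        if pair_counts[best] < min_frequency:
            break
        merges.append(best)
        # Rewrite every word, fusing each occurrence of the chosen pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges


merges = learn_bpe_merges(["abab", "abab", "ab"], num_merges=5)
```

On this tiny corpus the loop first merges `a` and `b`, then merges the resulting `ab` with itself, and stops early because no pairs remain. The ordered list of merges is conceptually what a merges.txt file stores.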

Build your own

Whenever these provided tokenizers don't give you enough freedom, you can build your own tokenizer, by putting all the different parts you need together. You can check how we implemented the provided tokenizers and adapt them easily to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(
    vocab_size=20000,
    min_frequency=2,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet()
)
tokenizer.train([
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt"
], trainer=trainer)

# And save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)

Now, when you want to use this tokenizer, it is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")
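
The training example above seeds the trainer with `initial_alphabet=pre_tokenizers.ByteLevel.alphabet()`. As a rough sketch of why a byte-level alphabet has exactly 256 entries and can represent any input, the pure-Python code below builds a byte-to-character table modeled on the one popularized by GPT-2. Whether the library uses this exact table is an assumption here, and the names `byte_to_char_table` and `to_byte_level` are hypothetical.

```python
def byte_to_char_table():
    """Map each of the 256 byte values to a distinct printable character.

    Modeled on the GPT-2 style table; treat the exact mapping as an assumption.
    """
    # Bytes that already correspond to printable characters keep their own code point.
    printable = list(range(33, 127)) + list(range(161, 173)) + list(range(174, 256))
    byte_values = printable[:]
    code_points = printable[:]
    shift = 0
    for b in range(256):
        if b not in printable:
            # Remaining bytes (control characters, space, ...) are moved above U+00FF.
            byte_values.append(b)
            code_points.append(256 + shift)
            shift += 1
    return {b: chr(cp) for b, cp in zip(byte_values, code_points)}


def to_byte_level(text, table):
    """Encode `text` as UTF-8 and replace each byte with its table character."""
    return "".join(table[b] for b in text.encode("utf-8"))


table = byte_to_char_table()
alphabet_size = len(table)
example = to_byte_level(" magic", table)
```

Because every string becomes a sequence over these 256 symbols, a byte-level BPE never needs an unknown token. Under this table a leading space becomes `Ġ` (U+0120), which is why that character is so common at the start of byte-level tokens.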
