Fast and Customizable Tokenizers


Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

Python bindings over the Rust implementation. If you are interested in the high-level design, you can find it described with the Rust implementation.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignment tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
  • Does all the pre-processing: Truncate, Pad, add the special tokens your model needs.
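
As a rough illustration of the alignment tracking and pre-processing features listed above, here is a minimal sketch. The tiny in-memory vocabulary, the WordLevel model, and the `[UNK]`/`[PAD]` token names are assumptions chosen only so the example runs without external files; they are not one of the provided tokenizers.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.normalizers import Lowercase
from tokenizers.pre_tokenizers import Whitespace

# Hypothetical toy vocabulary, only for illustration
vocab = {"[UNK]": 0, "[PAD]": 1, "hello": 2, "world": 3}
tokenizer = Tokenizer(WordLevel(vocab=vocab, unk_token="[UNK]"))
tokenizer.normalizer = Lowercase()
tokenizer.pre_tokenizer = Whitespace()

text = "Hello World"
encoded = tokenizer.encode(text)

# Offsets point back into the original (un-normalized) text
for token, (start, end) in zip(encoded.tokens, encoded.offsets):
    print(token, "->", text[start:end])

# Truncate and pad every encoding to exactly 4 tokens
tokenizer.enable_truncation(max_length=4)
tokenizer.enable_padding(pad_id=1, pad_token="[PAD]", length=4)
padded = tokenizer.encode(text)
print(padded.tokens)
print(padded.attention_mask)
```

Although the tokens are lowercased by the normalizer, slicing the original text with each offset pair recovers the original casing.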

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile the bindings as follows:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use yours as well)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install setuptools_rust
python setup.py install

Load a pretrained tokenizer from the Hub

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("bert-base-cased")

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using some vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train([ "./path/to/files/1.txt", "./path/to/files/2.txt" ])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")
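
If your text is already in memory rather than in files, a similar flow should work with `train_from_iterator`. The sketch below is a minimal example under that assumption; the three-sentence corpus and the small `vocab_size` are made up for illustration and are far too small for real use.

```python
from tokenizers import CharBPETokenizer

# Hypothetical in-memory corpus, only for illustration
corpus = [
    "I can feel the magic, can you?",
    "The magic is in the tokens.",
    "Can you feel it too?",
]

tokenizer = CharBPETokenizer()
tokenizer.train_from_iterator(
    corpus,
    vocab_size=200,
    min_frequency=1,
    show_progress=False,
)

encoded = tokenizer.encode("I can feel the magic")
print(encoded.tokens)
print(encoded.ids)

# Map the ids back to text
decoded = tokenizer.decode(encoded.ids)
print(decoded)
```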

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
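
To make the shared interface concrete, here is a small sketch that trains each of the four provided tokenizers on the same toy corpus and compares how they split one sentence. The corpus and the training parameters are assumptions for illustration only; real vocabularies need much more data.

```python
from tokenizers import (
    BertWordPieceTokenizer,
    ByteLevelBPETokenizer,
    CharBPETokenizer,
    SentencePieceBPETokenizer,
)

# Hypothetical in-memory corpus, only for illustration
corpus = [
    "I can feel the magic, can you?",
    "The magic is in the tokens.",
    "Can you feel it too?",
]

results = {}
for cls in (
    CharBPETokenizer,
    ByteLevelBPETokenizer,
    SentencePieceBPETokenizer,
    BertWordPieceTokenizer,
):
    tokenizer = cls()
    # These keyword arguments are accepted by all four implementations
    tokenizer.train_from_iterator(
        corpus, vocab_size=300, min_frequency=1, show_progress=False
    )
    results[cls.__name__] = tokenizer.encode("I can feel the magic").tokens

for name, tokens in results.items():
    print(name, tokens)
```

The printed tokens differ between implementations (word suffixes, byte-level symbols, prefix markers), which is a quick way to see what each one does to the same input.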

Build your own

When the provided tokenizers don't give you enough freedom, you can build your own by putting together the parts you need. You can check how we implemented the provided tokenizers and adapt them to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(
    vocab_size=20000,
    min_frequency=2,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet()
)
tokenizer.train([
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt"
], trainer=trainer)

# And save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)

Now, when you want to use this tokenizer, it is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")
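
To check that a saved tokenizer behaves the same after reloading, a round trip along these lines can help. This is a sketch under assumptions: the toy corpus, the temporary directory, and the small `vocab_size` are illustrative, and training uses `train_from_iterator` so that no dataset files are needed.

```python
import os
import tempfile

from tokenizers import Tokenizer, decoders, models, pre_tokenizers, processors, trainers

# Hypothetical in-memory corpus, only for illustration
corpus = [
    "I can feel the magic, can you?",
    "The magic is in the tokens.",
]

tokenizer = Tokenizer(models.BPE())
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

trainer = trainers.BpeTrainer(
    vocab_size=300,
    min_frequency=1,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet(),
    show_progress=False,
)
tokenizer.train_from_iterator(corpus, trainer=trainer)

text = "I can feel the magic, can you?"
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "byte-level-bpe.tokenizer.json")
    tokenizer.save(path)
    reloaded = Tokenizer.from_file(path)

# The reloaded tokenizer should produce the same ids as the original
original_ids = tokenizer.encode(text).ids
reloaded_ids = reloaded.encode(text).ids
print(original_ids == reloaded_ids)

# Decoding maps the ids back to text
decoded = reloaded.decode(reloaded_ids)
print(decoded)
```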
