
Fast and Customizable Tokenizers

Project description





Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

These are Python bindings over the Rust implementation. If you are interested in the high-level design, check out the main tokenizers repository on GitHub.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignment tracking: it is always possible to recover the part of the original sentence that corresponds to a given token.
  • Does all the pre-processing: truncate, pad, and add the special tokens your model needs (see the short sketch after this list).
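
As a quick illustration of the last two points, here is a minimal sketch. It anticipates the Tokenizer.from_pretrained call shown further below, and the truncation/padding length of 12 is only an example:

from tokenizers import Tokenizer

# Load any tokenizer; "bert-base-cased" is just an example
tokenizer = Tokenizer.from_pretrained("bert-base-cased")

# Truncate long inputs to 12 tokens and pad shorter ones up to that length
tokenizer.enable_truncation(max_length=12)
tokenizer.enable_padding(pad_id=0, pad_token="[PAD]", length=12)  # pad id/token must match your vocabulary

sentence = "I can feel the magic, can you?"
encoded = tokenizer.encode(sentence)

# Every token keeps its (start, end) offsets into the original sentence
for token, (start, end) in zip(encoded.tokens, encoded.offsets):
    print(token, "->", repr(sentence[start:end]))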

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile the bindings as follows:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use an existing one instead)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install setuptools_rust
python setup.py install

Load a pretrained tokenizer from the Hub

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_pretrained("bert-base-cased")
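
Once loaded, the tokenizer is ready to use. A small usage sketch (the example sentence is arbitrary):

# Encode a sentence and inspect the result
encoded = tokenizer.encode("I can feel the magic, can you?")

print(encoded.tokens)   # the string tokens, including special tokens such as [CLS]/[SEP]
print(encoded.ids)      # the corresponding vocabulary ids

# Map ids back to text (special tokens are skipped by default)
print(tokenizer.decode(encoded.ids))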

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using its vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)
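
Since each pre-built tokenizer also ships with a matching decoder, the ids can be mapped back to text. A short follow-up sketch:

# Offsets of each token into the original string
print(encoded.offsets)

# Decode the ids back to a string
print(tokenizer.decode(encoded.ids))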

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train([ "./path/to/files/1.txt", "./path/to/files/2.txt" ])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")
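
The train call also accepts the usual training options as keyword arguments. A hedged sketch (parameter names follow the Python bindings; the values are only illustrative):

from tokenizers import CharBPETokenizer

tokenizer = CharBPETokenizer()

# Common knobs: target vocabulary size, minimum pair frequency,
# and which special tokens should be reserved in the vocabulary
tokenizer.train(
    ["./path/to/files/1.txt", "./path/to/files/2.txt"],
    vocab_size=30000,
    min_frequency=2,
    special_tokens=["<unk>"],
)

tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")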

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
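
For example, BertWordPieceTokenizer follows the same pattern, except that it loads a single vocab.txt file. A minimal sketch (the vocabulary path is a placeholder):

from tokenizers import BertWordPieceTokenizer

# Load from an existing BERT vocabulary file; for a cased model,
# lowercasing is disabled here
tokenizer = BertWordPieceTokenizer("./path/to/bert-vocab.txt", lowercase=False)

encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.tokens)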

Build your own

Whenever these provided tokenizers don't give you enough freedom, you can build your own tokenizer by putting together all the different parts you need. You can check how we implemented the provided tokenizers and easily adapt them to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(
    vocab_size=20000,
    min_frequency=2,
    initial_alphabet=pre_tokenizers.ByteLevel.alphabet()
)
tokenizer.train([
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt"
], trainer=trainer)

# And Save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)

Now, when you want to use this tokenizer, it is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")
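
The saved JSON file stores the whole pipeline (model, pre-tokenizer, decoder, post-processor), so nothing needs to be re-configured after loading. If you have many sentences to process, they can be encoded in one call; a short sketch:

# Encode a whole batch at once (handled by the Rust core)
encodings = tokenizer.encode_batch([
    "I can feel the magic, can you?",
    "The quick brown fox jumps over the lazy dog",
])

for encoding in encodings:
    print(encoding.tokens)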


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

  • tokenizers-0.13.1.tar.gz (358.7 kB): Source

Built Distributions

  • tokenizers-0.13.1-cp310-cp310-win_amd64.whl (3.3 MB): CPython 3.10, Windows x86-64
  • tokenizers-0.13.1-cp310-cp310-win32.whl (3.0 MB): CPython 3.10, Windows x86
  • tokenizers-0.13.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB): CPython 3.10, manylinux: glibc 2.17+ x86-64
  • tokenizers-0.13.1-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl (7.9 MB): CPython 3.10, manylinux: glibc 2.17+ s390x
  • tokenizers-0.13.1-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (8.3 MB): CPython 3.10, manylinux: glibc 2.17+ ppc64le
  • tokenizers-0.13.1-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (7.2 MB): CPython 3.10, manylinux: glibc 2.17+ ARM64
  • tokenizers-0.13.1-cp310-cp310-macosx_12_0_arm64.whl (3.6 MB): CPython 3.10, macOS 12.0+ ARM64
  • tokenizers-0.13.1-cp310-cp310-macosx_10_11_x86_64.whl (3.8 MB): CPython 3.10, macOS 10.11+ x86-64
  • tokenizers-0.13.1-cp39-cp39-win_amd64.whl (3.3 MB): CPython 3.9, Windows x86-64
  • tokenizers-0.13.1-cp39-cp39-win32.whl (3.0 MB): CPython 3.9, Windows x86
  • tokenizers-0.13.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB): CPython 3.9, manylinux: glibc 2.17+ x86-64
  • tokenizers-0.13.1-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl (7.9 MB): CPython 3.9, manylinux: glibc 2.17+ s390x
  • tokenizers-0.13.1-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (8.3 MB): CPython 3.9, manylinux: glibc 2.17+ ppc64le
  • tokenizers-0.13.1-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (7.2 MB): CPython 3.9, manylinux: glibc 2.17+ ARM64
  • tokenizers-0.13.1-cp39-cp39-macosx_12_0_arm64.whl (3.6 MB): CPython 3.9, macOS 12.0+ ARM64
  • tokenizers-0.13.1-cp39-cp39-macosx_10_11_x86_64.whl (3.8 MB): CPython 3.9, macOS 10.11+ x86-64
  • tokenizers-0.13.1-cp38-cp38-win_amd64.whl (3.3 MB): CPython 3.8, Windows x86-64
  • tokenizers-0.13.1-cp38-cp38-win32.whl (3.0 MB): CPython 3.8, Windows x86
  • tokenizers-0.13.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB): CPython 3.8, manylinux: glibc 2.17+ x86-64
  • tokenizers-0.13.1-cp38-cp38-manylinux_2_17_s390x.manylinux2014_s390x.whl (7.9 MB): CPython 3.8, manylinux: glibc 2.17+ s390x
  • tokenizers-0.13.1-cp38-cp38-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (8.3 MB): CPython 3.8, manylinux: glibc 2.17+ ppc64le
  • tokenizers-0.13.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (7.2 MB): CPython 3.8, manylinux: glibc 2.17+ ARM64
  • tokenizers-0.13.1-cp38-cp38-macosx_10_11_x86_64.whl (3.8 MB): CPython 3.8, macOS 10.11+ x86-64
  • tokenizers-0.13.1-cp37-cp37m-win_amd64.whl (3.3 MB): CPython 3.7m, Windows x86-64
  • tokenizers-0.13.1-cp37-cp37m-win32.whl (3.0 MB): CPython 3.7m, Windows x86
  • tokenizers-0.13.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB): CPython 3.7m, manylinux: glibc 2.17+ x86-64
  • tokenizers-0.13.1-cp37-cp37m-manylinux_2_17_s390x.manylinux2014_s390x.whl (7.9 MB): CPython 3.7m, manylinux: glibc 2.17+ s390x
  • tokenizers-0.13.1-cp37-cp37m-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl (8.3 MB): CPython 3.7m, manylinux: glibc 2.17+ ppc64le
  • tokenizers-0.13.1-cp37-cp37m-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (7.2 MB): CPython 3.7m, manylinux: glibc 2.17+ ARM64
  • tokenizers-0.13.1-cp37-cp37m-macosx_10_11_x86_64.whl (3.8 MB): CPython 3.7m, macOS 10.11+ x86-64

