Fast and Customizable Tokenizers


Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

These are Python bindings over the Rust implementation. If you are interested in the high-level design, you can find it described there.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignment tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
  • Handles all the pre-processing: truncation, padding, and adding the special tokens your model needs.
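
To make the alignment-tracking idea concrete, here is a minimal pure-Python sketch. It does not use the library: the function name `tokenize_with_offsets` and the whitespace splitting rule are assumptions chosen for illustration. Each token is normalized (lowercased) but keeps the span it came from in the original sentence.

```python
import re

def tokenize_with_offsets(text):
    """Lowercase each whitespace-separated word and keep its (start, end) span.

    Hypothetical helper for illustration only; it is not part of `tokenizers`.
    """
    return [
        (match.group().lower(), (match.start(), match.end()))
        for match in re.finditer(r"\S+", text)
    ]

sentence = "I can feel the MAGIC, can you?"
tokens = tokenize_with_offsets(sentence)

for token, (start, end) in tokens:
    # The span always indexes the original, un-normalized sentence.
    print(token, (start, end), repr(sentence[start:end]))
```

Because the spans refer to the original text, the normalized token "magic," can still be traced back to "MAGIC," in the input.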

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile the bindings as follows:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use yours as well)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install setuptools_rust
python setup.py install

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using some vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train([ "./path/to/files/1.txt", "./path/to/files/2.txt" ])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
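
For readers new to BPE, the following toy sketch shows the core idea behind training: repeatedly merge the most frequent adjacent pair of symbols. It is a simplified pure-Python illustration, not the library's Rust implementation. The function `learn_bpe_merges` and its tie-breaking rule are assumptions, and it omits details such as end-of-word markers and minimum frequency.

```python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merges from a list of words (toy character-level BPE)."""
    # Represent each distinct word as a tuple of symbols, with its frequency.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Pick the most frequent pair; ties are broken by the pair itself
        # so that the result is deterministic (an arbitrary choice here).
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        # Rewrite every word, replacing occurrences of the best pair.
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_vocab[tuple(merged)] += freq
        vocab = new_vocab
    return merges

merges = learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)
```

The learned merges play the role of the merges.txt file mentioned earlier: applying them in order to new text reproduces the same segmentation.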

Build your own

Whenever the provided tokenizers don't give you enough freedom, you can build your own tokenizer by putting together the different parts you need. You can check how we implemented the provided tokenizers and adapt them easily to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(vocab_size=20000, min_frequency=2)
tokenizer.train([
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt"
], trainer=trainer)

# And save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)

Now, using this tokenizer is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")
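
As background on what "byte-level" means, here is a small pure-Python sketch. It is an illustration under stated assumptions, not the library's code: the helper names `to_byte_symbols` and `from_byte_symbols` are hypothetical. A byte-level BPE starts from the UTF-8 bytes of the text, so its base alphabet has at most 256 symbols and any input string can be represented.

```python
def to_byte_symbols(text):
    """Split text into its UTF-8 byte values, the base alphabet of a byte-level BPE."""
    return list(text.encode("utf-8"))

def from_byte_symbols(byte_values):
    """Rebuild the original text from its byte values."""
    return bytes(byte_values).decode("utf-8")

text = "magic ✨"
symbols = to_byte_symbols(text)

print(symbols)
print(from_byte_symbols(symbols) == text)
```

The emoji is not a single symbol here: it becomes three byte values, which the BPE merges can later recombine if that sequence is frequent enough in the training data.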
