Fast and Customizable Tokenizers

Tokenizers

Provides an implementation of today's most used tokenizers, with a focus on performance and versatility.

These are Python bindings over the Rust implementation. If you are interested in the high-level design, you can check it out there.

Otherwise, let's dive in!

Main features:

  • Train new vocabularies and tokenize using 4 pre-made tokenizers (Bert WordPiece and the 3 most common BPE versions).
  • Extremely fast (both training and tokenization), thanks to the Rust implementation. Takes less than 20 seconds to tokenize a GB of text on a server's CPU.
  • Easy to use, but also extremely versatile.
  • Designed for research and production.
  • Normalization comes with alignment tracking. It's always possible to get the part of the original sentence that corresponds to a given token.
  • Handles all the pre-processing: truncation, padding, and adding the special tokens your model needs.
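
To make the alignment-tracking idea concrete, here is a minimal pure-Python sketch. It is not the library's implementation: the function name `tokenize_with_offsets` is hypothetical, the normalization is plain lowercasing (which preserves length for ASCII input), and splitting is done on whitespace only. It only illustrates how keeping `(start, end)` offsets lets you recover the original text behind each token.

```python
import re


def tokenize_with_offsets(text):
    """Lowercase `text`, split on whitespace, and keep (start, end) spans.

    Toy illustration only: lowercasing is length-preserving for ASCII,
    so the spans found in the normalized text are valid in the original.
    """
    normalized = text.lower()
    return [(m.group(), m.span()) for m in re.finditer(r"\S+", normalized)]


sentence = "I can feel the MAGIC, can you?"
pairs = tokenize_with_offsets(sentence)

for token, (start, end) in pairs:
    # The offsets point back into the original, un-normalized sentence.
    print(token, "->", sentence[start:end])
```

A real normalizer can change the length of the text (for example when stripping accents), which is why offsets have to be tracked through every normalization step rather than recomputed afterwards.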

Installation

With pip:

pip install tokenizers

From sources:

To use this method, you need to have Rust installed:

# Install with:
curl https://sh.rustup.rs -sSf | sh -s -- -y
export PATH="$HOME/.cargo/bin:$PATH"

Once Rust is installed, you can compile the bindings as follows:

git clone https://github.com/huggingface/tokenizers
cd tokenizers/bindings/python

# Create a virtual env (you can use yours as well)
python -m venv .env
source .env/bin/activate

# Install `tokenizers` in the current virtual env
pip install setuptools_rust
python setup.py install

Using the provided Tokenizers

We provide some pre-built tokenizers to cover the most common cases. You can easily load one of these using some vocab.json and merges.txt files:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
vocab = "./path/to/vocab.json"
merges = "./path/to/merges.txt"
tokenizer = CharBPETokenizer(vocab, merges)

# And then encode:
encoded = tokenizer.encode("I can feel the magic, can you?")
print(encoded.ids)
print(encoded.tokens)

And you can train them just as simply:

from tokenizers import CharBPETokenizer

# Initialize a tokenizer
tokenizer = CharBPETokenizer()

# Then train it!
tokenizer.train(["./path/to/files/1.txt", "./path/to/files/2.txt"])

# Now, let's use it:
encoded = tokenizer.encode("I can feel the magic, can you?")

# And finally save it somewhere
tokenizer.save("./path/to/directory/my-bpe.tokenizer.json")

Provided Tokenizers

  • CharBPETokenizer: The original BPE
  • ByteLevelBPETokenizer: The byte level version of the BPE
  • SentencePieceBPETokenizer: A BPE implementation compatible with the one used by SentencePiece
  • BertWordPieceTokenizer: The famous Bert tokenizer, using WordPiece

All of these can be used and trained as explained above!
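
For readers unfamiliar with what BPE training does, the following is a toy pure-Python sketch of how merge rules are learned. It is an illustration under simplifying assumptions, not the library's Rust implementation: the function name `learn_bpe_merges` is hypothetical, words are pre-split, there is no end-of-word marker, and ties between equally frequent pairs are broken by first occurrence.

```python
from collections import Counter


def learn_bpe_merges(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Each word starts as a tuple of single characters.
    corpus = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in corpus.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Pick the most frequent pair (first seen wins on ties).
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Rewrite the corpus with the chosen pair fused into one symbol.
        new_corpus = Counter()
        for symbols, freq in corpus.items():
            merged, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    merged.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    merged.append(symbols[i])
                    i += 1
            new_corpus[tuple(merged)] += freq
        corpus = new_corpus
    return merges


merges = learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=2)
print(merges)
```

The ordered list of merges plays the role of a merges.txt file, and the symbols that remain after merging form the vocabulary.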

Build your own

Whenever these provided tokenizers don't give you enough freedom, you can build your own tokenizer by putting together the different parts you need. You can check how we implemented the provided tokenizers and adapt them easily to your own needs.

Building a byte-level BPE

Here is an example showing how to build your own byte-level BPE by putting all the different pieces together, and then saving it to a single file:

from tokenizers import Tokenizer, models, pre_tokenizers, decoders, trainers, processors

# Initialize a tokenizer
tokenizer = Tokenizer(models.BPE())

# Customize pre-tokenization and decoding
tokenizer.pre_tokenizer = pre_tokenizers.ByteLevel(add_prefix_space=True)
tokenizer.decoder = decoders.ByteLevel()
tokenizer.post_processor = processors.ByteLevel(trim_offsets=True)

# And then train
trainer = trainers.BpeTrainer(vocab_size=20000, min_frequency=2)
tokenizer.train(trainer, [
    "./path/to/dataset/1.txt",
    "./path/to/dataset/2.txt",
    "./path/to/dataset/3.txt",
])

# And save it
tokenizer.save("byte-level-bpe.tokenizer.json", pretty=True)

Now, when you want to use this tokenizer, this is as simple as:

from tokenizers import Tokenizer

tokenizer = Tokenizer.from_file("byte-level-bpe.tokenizer.json")

encoded = tokenizer.encode("I can feel the magic, can you?")
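
As background on what "byte-level" means here, the short sketch below shows the idea in plain Python, without using the library. The helper name `to_byte_symbols` is hypothetical. It assumes the usual byte-level setup in which the base alphabet is the 256 possible byte values, so any UTF-8 string can be represented before any merge is applied.

```python
def to_byte_symbols(text):
    """Return the UTF-8 bytes of `text` as a list of integers in 0..255."""
    return list(text.encode("utf-8"))


text = "magic ✨"
symbols = to_byte_symbols(text)
print(symbols)

# The mapping is lossless: the original string can always be rebuilt.
restored = bytes(symbols).decode("utf-8")
print(restored == text)
```

Because every input reduces to these base symbols, a byte-level model does not need an unknown token for unseen characters. Non-ASCII characters simply expand to several byte symbols.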



