
Text utilities and datasets for PyTorch


torchtext

This repository consists of:

Installation

We recommend Anaconda as a Python package management system. Please refer to pytorch.org for the details of PyTorch installation. The following are the corresponding torchtext versions and supported Python versions.

Version Compatibility

PyTorch version   torchtext version   Supported Python version
nightly build     main                >=3.7, <=3.10
1.12.0            0.13.0              >=3.7, <=3.10
1.11.0            0.12.0              >=3.6, <=3.9
1.10.0            0.11.0              >=3.6, <=3.9
1.9.1             0.10.1              >=3.6, <=3.9
1.9               0.10                >=3.6, <=3.9
1.8.2 (LTS)       0.9.2 (LTS)         >=3.6, <=3.9
1.8.1             0.9.1               >=3.6, <=3.9
1.8               0.9                 >=3.6, <=3.9
1.7.1             0.8.1               >=3.6, <=3.9
1.7               0.8                 >=3.6, <=3.8
1.6               0.7                 >=3.6, <=3.8
1.5               0.6                 >=3.5, <=3.8
1.4               0.5                 2.7, >=3.5, <=3.8
0.4 and below     0.2.3               2.7, >=3.5, <=3.8
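The version pairing above can also be expressed as a small lookup, which may be handy when pinning dependencies in a script. The following is only a sketch: the helper name torchtext_for is hypothetical (it is not part of torchtext), the data is copied from the compatibility table, and the fallback from a full version such as 1.9.0 to the 1.9 row is an assumption about how the table is meant to be read.

```python
from typing import Optional, Tuple

# PyTorch version -> (torchtext version, supported Python versions),
# copied from the Version Compatibility table. The "nightly build" and
# "0.4 and below" rows are left out because they are not single versions.
COMPATIBILITY = {
    "1.12.0": ("0.13.0", ">=3.7, <=3.10"),
    "1.11.0": ("0.12.0", ">=3.6, <=3.9"),
    "1.10.0": ("0.11.0", ">=3.6, <=3.9"),
    "1.9.1": ("0.10.1", ">=3.6, <=3.9"),
    "1.9": ("0.10", ">=3.6, <=3.9"),
    "1.8.2": ("0.9.2", ">=3.6, <=3.9"),  # LTS
    "1.8.1": ("0.9.1", ">=3.6, <=3.9"),
    "1.8": ("0.9", ">=3.6, <=3.9"),
    "1.7.1": ("0.8.1", ">=3.6, <=3.9"),
    "1.7": ("0.8", ">=3.6, <=3.8"),
    "1.6": ("0.7", ">=3.6, <=3.8"),
    "1.5": ("0.6", ">=3.5, <=3.8"),
    "1.4": ("0.5", "2.7, >=3.5, <=3.8"),
}


def torchtext_for(pytorch_version: str) -> Optional[Tuple[str, str]]:
    """Hypothetical helper: return (torchtext version, Python range) or None."""
    # Drop a local build suffix such as "+cpu" or "+cu116".
    base = pytorch_version.split("+")[0]
    if base in COMPATIBILITY:
        return COMPATIBILITY[base]
    # Assumption: a row listed as "1.9" also covers "1.9.0".
    major_minor = ".".join(base.split(".")[:2])
    return COMPATIBILITY.get(major_minor)


print(torchtext_for("1.12.0"))
print(torchtext_for("1.9.0+cpu"))
```

A version that is not listed in the table returns None, so callers can decide whether to fall back to the nightly build or stop.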

Using conda:

conda install -c pytorch torchtext

Using pip:

pip install torchtext

Note: LTS versions are distributed through a different channel than the other versioned releases. Please refer to https://pytorch.org/get-started/locally/ for details.

Optional requirements

If you want to use the English tokenizer from SpaCy, you need to install SpaCy and download its English model:

pip install spacy
python -m spacy download en_core_web_sm

Alternatively, you can use the Moses tokenizer port in SacreMoses (split from NLTK). To do so, install SacreMoses:

pip install sacremoses
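Once one of these optional packages is installed, the tokenizer can be requested by name. The sketch below first checks which optional packages are importable and only then builds a tokenizer. The helper names available_backends and build_tokenizer are hypothetical; the only torchtext call used is torchtext.data.utils.get_tokenizer, and the SpaCy model name is the one downloaded above.

```python
import importlib.util

# Tokenizer name accepted by get_tokenizer -> package that must be importable.
OPTIONAL_BACKENDS = {"spacy": "spacy", "moses": "sacremoses"}


def available_backends():
    """Hypothetical helper: list the optional tokenizers whose package is installed."""
    return [
        name
        for name, package in OPTIONAL_BACKENDS.items()
        if importlib.util.find_spec(package) is not None
    ]


def build_tokenizer(name):
    """Hypothetical helper: build a tokenizer callable (requires torchtext)."""
    # Imported lazily so that available_backends() works without torchtext.
    from torchtext.data.utils import get_tokenizer

    if name == "spacy":
        # Uses the English model installed with "spacy download".
        return get_tokenizer("spacy", language="en_core_web_sm")
    return get_tokenizer(name)


print("Optional tokenizers available:", available_backends())
```

With SpaCy installed, build_tokenizer("spacy")("You can now install torchtext.") should return a list of token strings; build_tokenizer("moses") does the same through SacreMoses.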

For torchtext 0.5 and below, also install sentencepiece:

conda install -c powerai sentencepiece

Building from source

To build torchtext from source, you need git, CMake, and a C++11 compiler such as g++:

git clone https://github.com/pytorch/text torchtext
cd torchtext
git submodule update --init --recursive

# Linux
python setup.py clean install

# OSX
CC=clang CXX=clang++ python setup.py clean install

# Or run "python setup.py develop" if you are making modifications.

Note

When building from source, make sure that you use the same C++ compiler as the one used to build PyTorch. A simple way is to build PyTorch from source and use the same environment to build torchtext. If you are using the nightly build of PyTorch, check out the environment it was built with: conda (here) and pip (here).

Documentation

Find the documentation here.

Datasets

The datasets module currently contains:

  • Language modeling: WikiText2, WikiText103, PennTreebank, EnWik9

  • Machine translation: IWSLT2016, IWSLT2017, Multi30k

  • Sequence tagging (e.g. POS/NER): UDPOS, CoNLL2000Chunking

  • Question answering: SQuAD1, SQuAD2

  • Text classification: SST2, AG_NEWS, SogouNews, DBpedia, YelpReviewPolarity, YelpReviewFull, YahooAnswers, AmazonReviewPolarity, AmazonReviewFull, IMDB

  • Model pre-training: CC-100

Models

The library currently consists of the following pre-trained models:

Tokenizers

The transforms module currently supports the following scriptable tokenizers:

Tutorials

To get started with torchtext, users may refer to the following tutorial available on the PyTorch website.

Disclaimer on Datasets

This is a utility library that downloads and prepares public datasets. We do not host or distribute these datasets, vouch for their quality or fairness, or claim that you have license to use the dataset. It is your responsibility to determine whether you have permission to use the dataset under the dataset’s license.

If you’re a dataset owner and wish to update any part of it (description, citation, etc.), or do not want your dataset to be included in this library, please get in touch through a GitHub issue. Thanks for your contribution to the ML community!



Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distributions

No source distribution files available for this release. See tutorial on generating distribution archives.

Built Distributions

torchtext-0.14.0-cp310-cp310-win_amd64.whl (1.9 MB)

Uploaded CPython 3.10 Windows x86-64

torchtext-0.14.0-cp310-cp310-manylinux1_x86_64.whl (2.0 MB)

Uploaded CPython 3.10

torchtext-0.14.0-cp310-cp310-macosx_12_0_arm64.whl (2.1 MB)

Uploaded CPython 3.10 macOS 12.0+ ARM64

torchtext-0.14.0-cp310-cp310-macosx_10_9_x86_64.whl (1.8 MB)

Uploaded CPython 3.10 macOS 10.9+ x86-64

torchtext-0.14.0-cp39-cp39-win_amd64.whl (1.9 MB)

Uploaded CPython 3.9 Windows x86-64

torchtext-0.14.0-cp39-cp39-manylinux1_x86_64.whl (2.0 MB)

Uploaded CPython 3.9

torchtext-0.14.0-cp39-cp39-macosx_12_0_arm64.whl (2.1 MB)

Uploaded CPython 3.9 macOS 12.0+ ARM64

torchtext-0.14.0-cp39-cp39-macosx_10_9_x86_64.whl (1.8 MB)

Uploaded CPython 3.9 macOS 10.9+ x86-64

torchtext-0.14.0-cp38-cp38-win_amd64.whl (1.9 MB)

Uploaded CPython 3.8 Windows x86-64

torchtext-0.14.0-cp38-cp38-manylinux1_x86_64.whl (2.0 MB)

Uploaded CPython 3.8

torchtext-0.14.0-cp38-cp38-macosx_12_0_arm64.whl (2.1 MB)

Uploaded CPython 3.8 macOS 12.0+ ARM64

torchtext-0.14.0-cp38-cp38-macosx_10_9_x86_64.whl (1.8 MB)

Uploaded CPython 3.8 macOS 10.9+ x86-64

torchtext-0.14.0-cp37-cp37m-win_amd64.whl (1.9 MB)

Uploaded CPython 3.7m Windows x86-64

torchtext-0.14.0-cp37-cp37m-manylinux1_x86_64.whl (2.0 MB)

Uploaded CPython 3.7m

torchtext-0.14.0-cp37-cp37m-macosx_10_9_x86_64.whl (1.8 MB)

Uploaded CPython 3.7m macOS 10.9+ x86-64
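The wheel file names above follow the standard wheel naming scheme (distribution, version, optional build tag, Python tag, ABI tag, platform tag), which is how pip decides which file fits a given interpreter and platform. As an illustration, the sketch below splits one of the listed names into those parts. The function and class names are hypothetical, and the parser is deliberately minimal: it does not handle compressed tag sets or validate the individual fields.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Hypothetical helper: split a wheel file name into its naming fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # A wheel name has five fields, or six when an optional build tag
    # sits between the version and the Python tag.
    if len(parts) == 6:
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelTags(*parts)


tags = parse_wheel_filename("torchtext-0.14.0-cp37-cp37m-macosx_10_9_x86_64.whl")
print(tags.python_tag, tags.abi_tag, tags.platform_tag)
```

For real dependency resolution it is safer to let pip pick the file; a parser like this is mainly useful for inspecting or filtering a directory of downloaded wheels.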

File details

Details for the file torchtext-0.14.0-cp310-cp310-win_amd64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp310-cp310-win_amd64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.10, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp310-cp310-win_amd64.whl
Algorithm Hash digest
SHA256 32021ddd8e0ce3cfda0ae3aa47882f9d620b41702c631ca83e9de2706b66e991
MD5 9dc947ef077528ec237e16a295c940ad
BLAKE2b-256 238f6114d5db0c489171d99190d27dbe9cc3d8aa0cc54f85faa333621d8a4b60

See more details on using hashes here.
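A downloaded wheel can be checked against the SHA256 digest published for it. The following sketch uses only the Python standard library; the function names are hypothetical, and the local file path in the usage comment is an assumption about where the wheel was saved.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected_hex):
    """Compare a file's SHA256 digest with a published hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Usage (assumes the wheel was downloaded into the current directory):
# ok = matches_published_hash(
#     "torchtext-0.14.0-cp310-cp310-win_amd64.whl",
#     "32021ddd8e0ce3cfda0ae3aa47882f9d620b41702c631ca83e9de2706b66e991",
# )
```

pip can enforce the same check at install time when a requirements file lists hashes, which is usually preferable to a manual comparison.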


File details

Details for the file torchtext-0.14.0-cp310-cp310-manylinux2014_aarch64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp310-cp310-manylinux2014_aarch64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.10
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp310-cp310-manylinux2014_aarch64.whl
Algorithm Hash digest
SHA256 9bc5e610bd16e6bf882434b84827750dbb12f029ab358f7ccdd6208ae6af4e8d
MD5 7acf4c66b344b6be76008c129e278f65
BLAKE2b-256 37dbd4f4c9c0cedf6c3d4bf855aeab5487f92a1ffc67c4d2972bf106bc07b3f6


File details

Details for the file torchtext-0.14.0-cp310-cp310-manylinux1_x86_64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp310-cp310-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.0 MB
  • Tags: CPython 3.10
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp310-cp310-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 940a899d39650270091cc099e43458c908e0f5e97908486d38f37b2c732d5efe
MD5 345a86492edffc783309a464b92237fe
BLAKE2b-256 c88bf6fae8d0b525bb9530cf8e59c7a30c6b06a0f5cfd95330ba35c04d434f0d


File details

Details for the file torchtext-0.14.0-cp310-cp310-macosx_12_0_arm64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp310-cp310-macosx_12_0_arm64.whl
  • Upload date:
  • Size: 2.1 MB
  • Tags: CPython 3.10, macOS 12.0+ ARM64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp310-cp310-macosx_12_0_arm64.whl
Algorithm Hash digest
SHA256 2d4720fe5bac3202537fd4ba1b43f3c7fde9dcd3018d57fc449d11b4f73ae93a
MD5 1398a1b795015be49a946965ae525a39
BLAKE2b-256 b57a4d4b832b7915fc346fece4f434224e1b7a99627bd6d00fe89b07cb47b5ec


File details

Details for the file torchtext-0.14.0-cp310-cp310-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp310-cp310-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.10, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp310-cp310-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 34da232d2180ac7366ddc64468eb1f0e98f38cbda5afb2570f99061a276af6cf
MD5 e102ab3c40c9431b4fab54a45d94a7ac
BLAKE2b-256 1c3a4cf24bb06f9b9270b7f4b8fb6d5eb01254213302fe0e8008e9691a7927e5


File details

Details for the file torchtext-0.14.0-cp39-cp39-win_amd64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp39-cp39-win_amd64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.9, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp39-cp39-win_amd64.whl
Algorithm Hash digest
SHA256 50cb3313e39c03c3dec00686a0379df1dbac525135d3761af96baf9c40d67606
MD5 cc8dcf71d773b950b96004895f871f30
BLAKE2b-256 b9e1a46bdac486b51bb64eeaea5c3df3709feaead6b02657d2f423c048543c2b


File details

Details for the file torchtext-0.14.0-cp39-cp39-manylinux1_x86_64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp39-cp39-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.0 MB
  • Tags: CPython 3.9
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp39-cp39-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 2abb873d1466f9548e5b790f1f720dddd67cc67d785e9709a7d1cc975699f7fa
MD5 62594dfa2d5655f2c5de6d7ac258c22b
BLAKE2b-256 46f94216d87f127c50da8acd3442da77825c634ca5ac0e8133030ed4b0f6fc70


File details

Details for the file torchtext-0.14.0-cp39-cp39-macosx_12_0_arm64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp39-cp39-macosx_12_0_arm64.whl
  • Upload date:
  • Size: 2.1 MB
  • Tags: CPython 3.9, macOS 12.0+ ARM64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp39-cp39-macosx_12_0_arm64.whl
Algorithm Hash digest
SHA256 f5d17e57b7652db98a7c0c786e29579e5344ed7abbcbbb84c69e47aa56c03dc3
MD5 2cb02a501785c1a3d144db0aae0ed2c2
BLAKE2b-256 21ce362b99c5a26d0ce94eef86b38c4d980f9790a17021c19fa66df61c1c3490


File details

Details for the file torchtext-0.14.0-cp39-cp39-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp39-cp39-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.9, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp39-cp39-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 4c41eb544c21c2abe0ef979256e4eb166a763213966e434d24841bfc6c265a4a
MD5 b38565b3ad3fef6397c64f6611325adc
BLAKE2b-256 8b9a2641bf25d7557445b4c10575b004d57ba7e38a979e161cc1545791357657


File details

Details for the file torchtext-0.14.0-cp38-cp38-win_amd64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp38-cp38-win_amd64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.8, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp38-cp38-win_amd64.whl
Algorithm Hash digest
SHA256 b3ac9720087abd47f449f4de7054cda3837bd72b1b9dd659662d778ab1b77f93
MD5 3bb6a000524c63cf79f3552e1734f6bd
BLAKE2b-256 09b1c817fd1fbfa3e30047bb592bd146f5b817907c97e86997ce91613f44e0f4


File details

Details for the file torchtext-0.14.0-cp38-cp38-manylinux1_x86_64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp38-cp38-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.0 MB
  • Tags: CPython 3.8
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp38-cp38-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 0a9dc1c0b9a4f19244313cef4bc83f462bd2d67065dbe0637face9670338826f
MD5 1e1632fdc6a7ae378460a8db6e761d87
BLAKE2b-256 256e786fc633a8454debd7d5fd3069bee5bc65b690ac3db4962293b01fae73fa


File details

Details for the file torchtext-0.14.0-cp38-cp38-macosx_12_0_arm64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp38-cp38-macosx_12_0_arm64.whl
  • Upload date:
  • Size: 2.1 MB
  • Tags: CPython 3.8, macOS 12.0+ ARM64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp38-cp38-macosx_12_0_arm64.whl
Algorithm Hash digest
SHA256 9f671e1e406b76faa16ce5cd5cb7a78a1a8802b705a96f6dc1b4d80406a15f2d
MD5 0c49846751e8c3d21c039901d7d9cf34
BLAKE2b-256 0c19f114c4fec31de6a7f2278907e4f08d4df8e0c1ac8a8c969d0da9121012c5


File details

Details for the file torchtext-0.14.0-cp38-cp38-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp38-cp38-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.8, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp38-cp38-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 8a700b824fdcc9db96a37d7e54bc9621353835ed5436b3f848c12657edeecf3f
MD5 789483a1a79b89fb2780ea2efab068b0
BLAKE2b-256 204899c13e791862c0fac62f3360f858cc01cc7e7577421cf32597c177c518f9


File details

Details for the file torchtext-0.14.0-cp37-cp37m-win_amd64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp37-cp37m-win_amd64.whl
  • Upload date:
  • Size: 1.9 MB
  • Tags: CPython 3.7m, Windows x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp37-cp37m-win_amd64.whl
Algorithm Hash digest
SHA256 10a01f90e7c0d02b995bc06499c4dbcd581d721f74c0ffb7a735cc1df66d92d5
MD5 f0d076cc14b0e4c0da53f5be94110ef7
BLAKE2b-256 4ae45a78de4448fdbb614956f304b2f62a894d8b77ac3e8c6cd3c800b611e67c


File details

Details for the file torchtext-0.14.0-cp37-cp37m-manylinux1_x86_64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp37-cp37m-manylinux1_x86_64.whl
  • Upload date:
  • Size: 2.0 MB
  • Tags: CPython 3.7m
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp37-cp37m-manylinux1_x86_64.whl
Algorithm Hash digest
SHA256 08a6f390a4d6c93069db8be677a222c75658cd6caccc6c3a246c410938dcf871
MD5 1e7f8090182dd048170ebf28d8db5af7
BLAKE2b-256 557c8bbb26bad7214aa040495880053a46b79fb72551ee4c1a71fc402c1e87c9


File details

Details for the file torchtext-0.14.0-cp37-cp37m-macosx_10_9_x86_64.whl.

File metadata

  • Download URL: torchtext-0.14.0-cp37-cp37m-macosx_10_9_x86_64.whl
  • Upload date:
  • Size: 1.8 MB
  • Tags: CPython 3.7m, macOS 10.9+ x86-64
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/3.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.8.12

File hashes

Hashes for torchtext-0.14.0-cp37-cp37m-macosx_10_9_x86_64.whl
Algorithm Hash digest
SHA256 ffece1bf41757f9432146eba4b5dea9fd5e9ef54f88c9c25a90a0150056475bb
MD5 11b6c33b89361a46c779b3263259f822
BLAKE2b-256 99d1dc38a707d2bb82eb04a67dc4357141d29d9d87f424ab58908274a7f0f9af
