Hugging Face library to process and filter large amounts of web data
datatrove
Installation
From the root of a local clone, install the package in editable mode with the dev extras:
pip install -e ".[dev]"
Install pre-commit code style hooks:
pre-commit install
Run the tests:
pytest -n 4 --max-worker-restart=0 --dist=loadfile tests
Download files
Source Distribution
datatrove-0.0.1.dev0.tar.gz (2.6 kB)
Built Distribution
datatrove-0.0.1.dev0-py3-none-any.whl (2.3 kB)
File details
Details for the file datatrove-0.0.1.dev0.tar.gz.
File metadata
- Download URL: datatrove-0.0.1.dev0.tar.gz
- Upload date:
- Size: 2.6 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 3ce517818ebd3abaf0f2440fd8b47fe256127e51d8453e517db51036923cb5b6
MD5 | 7653ff33aa5c7d67372580b972a1e29b
BLAKE2b-256 | 3a02e3fc0ace94caca751b7c89ffc78307a2ceee26de89f0db018076e8b9ca49
File details
Details for the file datatrove-0.0.1.dev0-py3-none-any.whl.
File metadata
- Download URL: datatrove-0.0.1.dev0-py3-none-any.whl
- Upload date:
- Size: 2.3 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.10.12
File hashes
Algorithm | Hash digest
---|---
SHA256 | 6ee4652f936197d3ad3e6a831827dfb788fd3b1d2cad31434caec39743f5cf63
MD5 | cd4c7e52de3e2b8159c09ffedd84e499
BLAKE2b-256 | d39c9edcdf7a95fdbe4de6db47282159edcfd70e40a3e7b5a200fbd534e0e79e
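
The SHA256 digests listed above can be used to check a downloaded file before installing it. The following is a minimal sketch using only the Python standard library. The helper names (sha256_of, matches_published_hash) and the assumption that the wheel sits in the current directory are illustrative and not part of datatrove.

```python
import hashlib
from pathlib import Path

# SHA256 digest listed above for datatrove-0.0.1.dev0-py3-none-any.whl.
EXPECTED_SHA256 = "6ee4652f936197d3ad3e6a831827dfb788fd3b1d2cad31434caec39743f5cf63"


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path, expected=EXPECTED_SHA256):
    """Hypothetical helper: compare a file's digest with a published one."""
    return sha256_of(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: the wheel was downloaded into the current directory.
    wheel = Path("datatrove-0.0.1.dev0-py3-none-any.whl")
    if wheel.exists():
        print("hash OK" if matches_published_hash(wheel) else "hash MISMATCH")
    else:
        print(f"{wheel} not found; download it first")
```

To check the source distribution instead, pass the path of the .tar.gz file together with its own SHA256 digest as the expected argument.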