Threaded directory iteration via os.scandir() with progress indicator and resume function.
IterFilesystem
Multiprocess directory iteration via os.scandir():
“stats” processes:
- count all directories and files
- accumulate the sizes of all files
“worker” process:
- walks the filesystem and performs the real action on each directory/file
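To illustrate the two statistics passes described above, here is a minimal, single-process sketch built only on os.scandir(). It is not the package's own code: the function names (iter_entries, count_items, total_file_size) and the demo tree are hypothetical, and the real project runs such passes in separate processes.

```python
import os
import tempfile
from pathlib import Path


def iter_entries(top):
    """Yield every os.DirEntry below `top` without following symlinks."""
    stack = [str(top)]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                yield entry
                if entry.is_dir(follow_symlinks=False):
                    # Descend into sub directories later
                    stack.append(entry.path)


def count_items(top):
    """Hypothetical counting pass: number of directories and files."""
    return sum(1 for _ in iter_entries(top))


def total_file_size(top):
    """Hypothetical size pass: summed size of all regular files in bytes."""
    return sum(
        entry.stat(follow_symlinks=False).st_size
        for entry in iter_entries(top)
        if entry.is_file(follow_symlinks=False)
    )


# Small demo tree (assumption, for illustration only):
#   <tmp>/a.txt      5 bytes
#   <tmp>/sub/b.txt  3 bytes
demo_root = Path(tempfile.mkdtemp())
(demo_root / "sub").mkdir()
(demo_root / "a.txt").write_bytes(b"12345")
(demo_root / "sub" / "b.txt").write_bytes(b"123")

ITEM_COUNT = count_items(demo_root)      # one directory and two files
TOTAL_SIZE = total_file_size(demo_root)  # 5 + 3 bytes
print(ITEM_COUNT, TOTAL_SIZE)
```

Knowing the total item count and total size up front is what makes a percentage progress display possible while the actual work is still running.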
Among other things, these packages are used:
- tqdm (progress bar)
Requirements:
- Python 3.6 or newer
- Pipenv (package and virtual environment manager)
Please: try, fork and contribute! ;)
Example
Use the example CLI, e.g.:
~$ git clone https://github.com/jedie/IterFilesystem.git
~$ cd IterFilesystem
~/IterFilesystem$ pipenv install
~/IterFilesystem$ pipenv shell
(IterFilesystem) ~/IterFilesystem$ print_fs_stats --help
(IterFilesystem) ~/IterFilesystem$ pip install -e .
...
Successfully installed iterfilesystem
(IterFilesystem) ~/IterFilesystem$ print_fs_stats --help
usage: print_fs_stats.py [-h] [-v] [--debug] [--path PATH]
                         [--skip_dir_patterns [SKIP_DIR_PATTERNS [SKIP_DIR_PATTERNS ...]]]
                         [--skip_file_patterns [SKIP_FILE_PATTERNS [SKIP_FILE_PATTERNS ...]]]

Scan filesystem and print some information

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --debug               enable DEBUG
  --path PATH           The file path that should be scanned e.g.: "~/foobar/"
                        default is "~"
  --skip_dir_patterns [SKIP_DIR_PATTERNS [SKIP_DIR_PATTERNS ...]]
                        Directory names to exclude from scan.
  --skip_file_patterns [SKIP_FILE_PATTERNS [SKIP_FILE_PATTERNS ...]]
                        File names to ignore.
Example output looks like this:
(IterFilesystem) ~/IterFilesystem$ print_fs_stats --path ~/IterFilesystem --skip_dir_patterns ".*" "*.egg-info" --skip_file_patterns ".*"
Read/process: '~/IterFilesystem'...
Skip directory patterns:
    * .*
    * *.egg-info
Skip file patterns:
    * .*
Filesystem items..:Read/process: '~/IterFilesystem'...
...
Filesystem items..: 100%|█████████████████████████████████████████|135/135 13737.14entries/s [00:00<00:00, 13737.14entries/s]
File sizes........: 100%|██████████████████████████████████████████████████████████████|843k/843k [00:00<00:00, 88.5MBytes/s]
Average progress..: 100%|███████████████████████████████████████████████████████████████████████████████████████|00:00<00:00
Current File......:, /home/jens/repos/IterFilesystem/Pipfile

Processed 135 filesystem items in 0.02 sec
SHA512 hash calculated over all file content:
10f9475b21977f5aea1d4657a0e09ad153a594ab30abc2383bf107dbc60c430928596e368ebefab3e78ede61dcc101cb638a845348fe908786cb8754393439ef
File count: 109
Total file size: 843.5 KB
6 directories skipped.
6 files skipped.
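The skip patterns and the combined content hash from the example run can be sketched roughly as follows. This is only an illustration under assumptions, not the actual print_fs_stats implementation: the helper names (is_skipped, hash_tree) are hypothetical, the entries are sorted here just to make the result reproducible, and the real tool may visit files in a different order. Pattern matching uses fnmatch from the standard library, which is what the dir/file filter has been based on since v1.1.0.

```python
import fnmatch
import hashlib
import os
import tempfile
from pathlib import Path


def is_skipped(name, patterns):
    """Return True if `name` matches any shell-style pattern, e.g. ".*"."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def hash_tree(top, skip_dirs=(), skip_files=()):
    """Feed the content of all non-skipped files below `top` into one SHA-512 hash.

    Returns a tuple: (number of hashed files, hex digest).
    """
    digest = hashlib.sha512()
    file_count = 0
    stack = [str(top)]
    while stack:
        with os.scandir(stack.pop()) as it:
            # Sorting is an assumption to get a deterministic digest
            entries = sorted(it, key=lambda entry: entry.name)
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                if not is_skipped(entry.name, skip_dirs):
                    stack.append(entry.path)
            elif entry.is_file(follow_symlinks=False):
                if not is_skipped(entry.name, skip_files):
                    with open(entry.path, "rb") as f:
                        # Read in chunks so large files do not fill the memory
                        for chunk in iter(lambda: f.read(65536), b""):
                            digest.update(chunk)
                    file_count += 1
    return file_count, digest.hexdigest()


# Demo tree (assumption, for illustration only): only "README" survives the filters
demo_root = Path(tempfile.mkdtemp())
(demo_root / ".git").mkdir()
(demo_root / ".git" / "config").write_bytes(b"x")
(demo_root / "pkg.egg-info").mkdir()
(demo_root / "pkg.egg-info" / "PKG-INFO").write_bytes(b"y")
(demo_root / ".hidden").write_bytes(b"z")
(demo_root / "README").write_bytes(b"hello")

FILE_COUNT, HEXDIGEST = hash_tree(
    demo_root,
    skip_dirs=(".*", "*.egg-info"),
    skip_files=(".*",),
)
print(FILE_COUNT, HEXDIGEST)
```

Skipping a directory also skips everything below it, so a pattern like ".*" keeps the walk out of ".git" entirely.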
History
dev - compare v1.1.0…master
- TBC
12.10.2019 - compare v1.0.0…v1.1.0
- don’t create a separate process for the worker: just do the work in the main process
- dir/file filter now uses fnmatch
12.10.2019 - compare v0.2.0…v1.0.0
- refactoring:
  - don’t use persist-queue
  - switch from threading to multiprocessing
  - enhance progress display with multiple tqdm progress bars
15.09.2019 - compare v0.1.0…v0.2.0
- store persist queue in temp directory
- don’t catch process_path_item errors; this should be done in the child class
15.09.2019 - compare v0.0.1…v0.1.0
- add some project meta files and tests
- set up CI
- fix tests
15.09.2019 - v0.0.1
- first release on PyPI
Download files
File details
Details for the file iterfilesystem-1.1.0.tar.gz.
File metadata
- Download URL: iterfilesystem-1.1.0.tar.gz
- Upload date:
- Size: 14.3 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | b7e551c72b4a14fc553727c6fae2eead02852a364d1427f461ae9fca4fcfbbfa
MD5 | a5221923463c4b8debe01459ae23c0cb
BLAKE2b-256 | 332836288de7bd6abfa77c1405fe3ba6fd7b6c83be3cb896ce8d1eb4d3c1c9ae
File details
Details for the file iterfilesystem-1.1.0-py3.6.egg.
File metadata
- Download URL: iterfilesystem-1.1.0-py3.6.egg
- Upload date:
- Size: 16.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | b4e17455d5447c12c0a1590d5072578f130a1994b17ac625ee2f577e08b00ece
MD5 | a145afe0b659944422aa49bbd9ad2734
BLAKE2b-256 | 9aeef0c00736d0822931dcacd9377e59eb9da7d9ca08b38370dda9c86d305ffd
File details
Details for the file iterfilesystem-1.1.0-py2.py3-none-any.whl.
File metadata
- Download URL: iterfilesystem-1.1.0-py2.py3-none-any.whl
- Upload date:
- Size: 18.8 kB
- Tags: Python 2, Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8
File hashes
Algorithm | Hash digest
---|---
SHA256 | 270df8edfe5347a6c0e9f202fa3de0223b6d9af933deaa7a8d6be77b4b40dbdb
MD5 | 8027ffc9a252f311b698fdfd2e5fb538
BLAKE2b-256 | 769bc25ad900f23d8139e178f2ed335c34379584bcb5ef152bafa39d4baa2576