Threaded directory iteration via os.scandir() with progress indicator and resume function.

Project description

IterFilesystem

Multiprocess directory iteration via os.scandir():

  • “stats” processes:

    • only count the directories and files.

    • accumulate the sizes of all files.

  • “worker” process:

    • walks the filesystem and performs the actual work on each directory/file
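The split described above can be sketched roughly as follows. This is a minimal illustration and not the package's actual code: the names iter_entries, count_items, total_file_size and scan are hypothetical, and skipping unreadable directories is an assumption. The two "stats" jobs run in separate processes while the walk that does the real work runs in the calling process.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def iter_entries(path):
    """Yield every os.DirEntry below path without following symlinks."""
    stack = [path]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    yield entry
        except OSError:
            # Assumption: unreadable directories are simply skipped.
            continue


def count_items(path):
    """Stats job: count directories and other entries (treated as files)."""
    dirs = files = 0
    for entry in iter_entries(path):
        if entry.is_dir(follow_symlinks=False):
            dirs += 1
        else:
            files += 1
    return dirs, files


def total_file_size(path):
    """Stats job: accumulate the sizes of all regular files."""
    total = 0
    for entry in iter_entries(path):
        if entry.is_file(follow_symlinks=False):
            total += entry.stat(follow_symlinks=False).st_size
    return total


def scan(path, process_entry):
    """Run both stats jobs in child processes and do the work here."""
    with ProcessPoolExecutor(max_workers=2) as pool:
        count_future = pool.submit(count_items, path)
        size_future = pool.submit(total_file_size, path)
        for entry in iter_entries(path):
            process_entry(entry)  # the "real action" on each dir/file
        return count_future.result(), size_future.result()
```

Calling scan(".", print) from a script guarded by `if __name__ == "__main__":` would print every entry and return the counts and the total size. The stats results are what a progress display would need in order to show totals while the slower work is still running.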

among other things these packages are used:

Requirements:

  • Python 3.6 or newer.

  • Pipenv, a package and virtual environment manager.

Please: try, fork and contribute! ;)

  • Build status on travis-ci.org: travis-ci.org/jedie/IterFilesystem

  • Build status on appveyor.com: ci.appveyor.com/project/jedie/IterFilesystem

  • Coverage status on codecov.io: codecov.io/gh/jedie/IterFilesystem

  • Coverage status on coveralls.io: coveralls.io/r/jedie/IterFilesystem

  • Requirements status on requires.io: requires.io/github/jedie/IterFilesystem/requirements/

Example

Use the example CLI, e.g.:

~$ git clone https://github.com/jedie/IterFilesystem.git
~$ cd IterFilesystem
~/IterFilesystem$ pipenv install
~/IterFilesystem$ pipenv shell
(IterFilesystem) ~/IterFilesystem$ pip install -e .
...
Successfully installed iterfilesystem

(IterFilesystem) ~/IterFilesystem$ print_fs_stats --help
usage: print_fs_stats.py [-h] [-v] [--debug] [--path PATH]
                         [--skip_dir_patterns [SKIP_DIR_PATTERNS [SKIP_DIR_PATTERNS ...]]]
                         [--skip_file_patterns [SKIP_FILE_PATTERNS [SKIP_FILE_PATTERNS ...]]]

Scan filesystem and print some information

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --debug               enable DEBUG
  --path PATH           The file path that should be scanned e.g.: "~/foobar/"
                        default is "~"
  --skip_dir_patterns [SKIP_DIR_PATTERNS [SKIP_DIR_PATTERNS ...]]
                        Directory names to exclude from scan.
  --skip_file_patterns [SKIP_FILE_PATTERNS [SKIP_FILE_PATTERNS ...]]
                        File names to ignore.
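
For readers who want to build a similar CLI, the options listed in the help text above could be declared with argparse roughly like this. It is a sketch reconstructed from the help output only, not the package's source: build_parser is a hypothetical name and the version string is a placeholder.

```python
import argparse


def build_parser(version="unknown"):
    """Build a parser that mirrors the options shown in the help text."""
    parser = argparse.ArgumentParser(
        prog="print_fs_stats.py",
        description="Scan filesystem and print some information",
    )
    # The version value is a placeholder, not the real package version.
    parser.add_argument("-v", "--version", action="version", version=version)
    parser.add_argument("--debug", action="store_true", help="enable DEBUG")
    parser.add_argument(
        "--path",
        default="~",
        help='The file path that should be scanned e.g.: "~/foobar/" default is "~"',
    )
    # nargs="*" accepts zero or more patterns after each option.
    parser.add_argument(
        "--skip_dir_patterns",
        nargs="*",
        default=[],
        help="Directory names to exclude from scan.",
    )
    parser.add_argument(
        "--skip_file_patterns",
        nargs="*",
        default=[],
        help="File names to ignore.",
    )
    return parser
```

Patterns such as ".*" should be quoted on the command line so that the shell does not expand them before the program sees them.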

Example output looks like this:

(IterFilesystem) ~/IterFilesystem$ print_fs_stats --path ~/IterFilesystem --skip_dir_patterns ".*" "*.egg-info" --skip_file_patterns ".*"
Read/process: '~/IterFilesystem'...
Skip directory patterns:
    * .*
    * *.egg-info

Skip file patterns:
    * .*

Filesystem items..:Read/process: '~/IterFilesystem'...

...

Filesystem items..: 100%|█████████████████████████████████████████|135/135 13737.14entries/s [00:00<00:00, 13737.14entries/s]
File sizes........: 100%|██████████████████████████████████████████████████████████████|843k/843k [00:00<00:00, 88.5MBytes/s]
Average progress..: 100%|███████████████████████████████████████████████████████████████████████████████████████|00:00<00:00
Current File......:, /home/jens/repos/IterFilesystem/Pipfile


Processed 135 filesystem items in 0.02 sec
SHA512 hash calculated over all file content: 10f9475b21977f5aea1d4657a0e09ad153a594ab30abc2383bf107dbc60c430928596e368ebefab3e78ede61dcc101cb638a845348fe908786cb8754393439ef
File count: 109
Total file size: 843.5 KB
6 directories skipped.
6 files skipped.
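
The kind of run shown above can be approximated with the standard library alone. The following is a minimal sketch under stated assumptions, not the tool's implementation: matches_any and hash_tree are hypothetical names, patterns are matched against the bare entry name with fnmatch, and entries are visited in sorted order so that the combined hash is reproducible (the order the real tool uses is not documented here).

```python
import fnmatch
import hashlib
import os


def matches_any(name, patterns):
    """Return True if name matches at least one shell-style pattern."""
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)


def hash_tree(path, skip_dirs=(), skip_files=()):
    """Hash the content of all non-skipped files below path with SHA512."""
    digest = hashlib.sha512()
    stats = {"files": 0, "size": 0, "skipped_dirs": 0, "skipped_files": 0}
    stack = [path]
    while stack:
        with os.scandir(stack.pop()) as entries:
            # Assumption: sort by name so the hash does not depend on OS order.
            for entry in sorted(entries, key=lambda e: e.name):
                if entry.is_dir(follow_symlinks=False):
                    if matches_any(entry.name, skip_dirs):
                        stats["skipped_dirs"] += 1
                    else:
                        stack.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    if matches_any(entry.name, skip_files):
                        stats["skipped_files"] += 1
                        continue
                    with open(entry.path, "rb") as f:
                        # Read in chunks so large files do not fill memory.
                        for chunk in iter(lambda: f.read(65536), b""):
                            digest.update(chunk)
                            stats["size"] += len(chunk)
                    stats["files"] += 1
    return digest.hexdigest(), stats
```

A call such as hash_tree(".", [".*", "*.egg-info"], [".*"]) mirrors the skip patterns from the example command. Its digest will generally differ from the one printed above unless the traversal order happens to match.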

History

  • dev - compare v1.1.0…master

    • TBC

  • 12.10.2019 - compare v1.0.0…v1.1.0

    • don’t create a separate process for the worker: just do the work in the main process

    • dir/file filter now uses fnmatch

  • 12.10.2019 - compare v0.2.0…v1.0.0

    • refactoring:

      • don’t use persist-queue

      • switch from threading to multiprocessing

      • enhance progress display with multiple tqdm progress bars

  • 15.09.2019 - compare v0.1.0…v0.2.0

    • store persist queue in temp directory

    • don’t catch process_path_item errors; this should be done in the child class

  • 15.09.2019 - compare v0.0.1…v0.1.0

    • add some project meta files and tests

    • setup CI

    • fix tests

  • 15.09.2019 - v0.0.1

    • first release on PyPI

Donating

Download files

Source Distribution

iterfilesystem-1.1.0.tar.gz (14.3 kB view details)

Uploaded Source

Built Distributions

iterfilesystem-1.1.0-py3.6.egg (16.5 kB view details)

Uploaded Source

iterfilesystem-1.1.0-py2.py3-none-any.whl (18.8 kB view details)

Uploaded Python 2 Python 3

File details

Details for the file iterfilesystem-1.1.0.tar.gz.

File metadata

  • Download URL: iterfilesystem-1.1.0.tar.gz
  • Upload date:
  • Size: 14.3 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8

File hashes

Hashes for iterfilesystem-1.1.0.tar.gz
Algorithm Hash digest
SHA256 b7e551c72b4a14fc553727c6fae2eead02852a364d1427f461ae9fca4fcfbbfa
MD5 a5221923463c4b8debe01459ae23c0cb
BLAKE2b-256 332836288de7bd6abfa77c1405fe3ba6fd7b6c83be3cb896ce8d1eb4d3c1c9ae

File details

Details for the file iterfilesystem-1.1.0-py3.6.egg.

File metadata

  • Download URL: iterfilesystem-1.1.0-py3.6.egg
  • Upload date:
  • Size: 16.5 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8

File hashes

Hashes for iterfilesystem-1.1.0-py3.6.egg
Algorithm Hash digest
SHA256 b4e17455d5447c12c0a1590d5072578f130a1994b17ac625ee2f577e08b00ece
MD5 a145afe0b659944422aa49bbd9ad2734
BLAKE2b-256 9aeef0c00736d0822931dcacd9377e59eb9da7d9ca08b38370dda9c86d305ffd

File details

Details for the file iterfilesystem-1.1.0-py2.py3-none-any.whl.

File metadata

  • Download URL: iterfilesystem-1.1.0-py2.py3-none-any.whl
  • Upload date:
  • Size: 18.8 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8

File hashes

Hashes for iterfilesystem-1.1.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 270df8edfe5347a6c0e9f202fa3de0223b6d9af933deaa7a8d6be77b4b40dbdb
MD5 8027ffc9a252f311b698fdfd2e5fb538
BLAKE2b-256 769bc25ad900f23d8139e178f2ed335c34379584bcb5ef152bafa39d4baa2576
