Multiprocess directory iteration via os.scandir() with progress indicator and resume function.

Project description

IterFilesystem

Multiprocess directory iteration via os.scandir():

  • “stats” processes:

    • count all directories and files.

    • accumulate the sizes of all files.

  • “worker” process:

    • walks the filesystem and performs the actual work on each directory and file.
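
The two “stats” passes described above can be sketched with a small recursive os.scandir() generator. This is only an illustration and not the project's actual code: the names iter_entries, count_items and sum_file_sizes are hypothetical, and the sketch runs in a single process, whereas IterFilesystem runs such passes in separate processes.

```python
import os
from typing import Iterator


def iter_entries(top: str) -> Iterator[os.DirEntry]:
    """Yield every directory entry below `top` (hypothetical helper)."""
    with os.scandir(top) as it:
        for entry in it:
            yield entry
            # Symlinks are not followed, so link cycles cannot cause
            # endless recursion.
            if entry.is_dir(follow_symlinks=False):
                yield from iter_entries(entry.path)


def count_items(top: str) -> int:
    """First stats pass: count all directories and files."""
    return sum(1 for _ in iter_entries(top))


def sum_file_sizes(top: str) -> int:
    """Second stats pass: accumulate the sizes of all regular files."""
    return sum(
        entry.stat(follow_symlinks=False).st_size
        for entry in iter_entries(top)
        if entry.is_file(follow_symlinks=False)
    )
```

Totals like these are what a progress display needs up front: the worker can only report a percentage once the item count and the overall size are known.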

Among other things, these packages are used:

Requirements:

  • Python 3.6 or newer.

  • Pipenv, a package and virtual environment manager.

Please: try, fork and contribute! ;)

Build Status on travis-ci.org

travis-ci.org/jedie/IterFilesystem

Build Status on appveyor.com

ci.appveyor.com/project/jedie/IterFilesystem

Coverage Status on codecov.io

codecov.io/gh/jedie/IterFilesystem

Coverage Status on coveralls.io

coveralls.io/r/jedie/IterFilesystem

Requirements Status on requires.io

requires.io/github/jedie/IterFilesystem/requirements/

Example

Use example CLI, e.g.:

~$ git clone https://github.com/jedie/IterFilesystem.git
~$ cd IterFilesystem
~/IterFilesystem$ pipenv install
~/IterFilesystem$ pipenv shell
(IterFilesystem) ~/IterFilesystem$ pip install -e .
...
Successfully installed iterfilesystem

(IterFilesystem) ~/IterFilesystem$ print_fs_stats --help
usage: print_fs_stats.py [-h] [-v] [--path PATH]
                         [--skip_dirs [SKIP_DIRS [SKIP_DIRS ...]]]
                         [--skip_filenames [SKIP_FILENAMES [SKIP_FILENAMES ...]]]

Scan filesystem and print some information

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  --path PATH           The file path that should be scanned e.g.: "~/foobar/"
                        default is "~"
  --skip_dirs [SKIP_DIRS [SKIP_DIRS ...]]
                        Directory names to exclude from scan.
  --skip_filenames [SKIP_FILENAMES [SKIP_FILENAMES ...]]
                        File names to ignore.
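
For readers who want to reproduce this interface, the help text above maps onto a plain argparse parser roughly as follows. This is a reconstruction from the printed usage only, not the project's source: the function name build_parser, the empty-list defaults and the version string are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Rebuild the CLI shown in the help output (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        prog="print_fs_stats.py",
        description="Scan filesystem and print some information",
    )
    # The real version string is not shown in the help text; placeholder only.
    parser.add_argument(
        "-v", "--version", action="version", version="version placeholder"
    )
    parser.add_argument(
        "--path",
        default="~",
        help='The file path that should be scanned e.g.: "~/foobar/" default is "~"',
    )
    # nargs="*" accepts zero or more names after the flag.
    # The empty-list defaults are an assumption.
    parser.add_argument(
        "--skip_dirs", nargs="*", default=[],
        help="Directory names to exclude from scan.",
    )
    parser.add_argument(
        "--skip_filenames", nargs="*", default=[],
        help="File names to ignore.",
    )
    return parser


args = build_parser().parse_args(
    ["--path", "~/repos/IterFilesystem", "--skip_dirs", ".tox", ".pytest_cache"]
)
print(args.path, args.skip_dirs, args.skip_filenames)
```

Because both skip options take a variable number of values, several directory names can follow a single --skip_dirs flag.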

Example output looks like this:

(IterFilesystem) ~/IterFilesystem$ print_fs_stats --path ~/repos/IterFilesystem --skip_dirs .tox .pytest_cache
Read/process: '~/repos/IterFilesystem'...
Skip directories:
    * .tox
    * .pytest_cache

No files will be skipped.

...

Filesystem items..: 100%|██████████████████████████████████████████|633/633 13185.18entries/s [00:00<00:00, 13185.18entries/s]
File sizes........: 100%|█████████████████████████████████████████████████████████████|2.22M/2.22M [00:00<00:00, 48.6MBytes/s]
Average progress..: 100%|████████████████████████████████████████████████████████████████████████████████████████|00:00<00:00
Current File......:, ~/repos/IterFilesystem/Pipfile

Processed 633 filesystem items in 0.06 sec
SHA512 hash calculated over all file content: 79f2b0587e147b1c7d8581ea3597039a9e6d0c79ff10ea3bfd499cc60bc48892507437dd00da3c311280b4305c75459dbb122ebbec6b3b0445ce595b47c9f4a8
File count: 428
Total file size: 2.2 MB
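
The three summary values at the end of this output (hash over all file content, file count, total size) could be produced along the following lines. This is a minimal sketch under stated assumptions, not the tool's implementation: hash_tree and _sorted_files are hypothetical names, and entries are visited in sorted name order so the digest is reproducible. The order the real tool uses is not documented here, so this sketch will generally not reproduce the digest shown above.

```python
import hashlib
import os
from typing import Iterable, Iterator, Set, Tuple


def _sorted_files(top: str, skip: Set[str]) -> Iterator[str]:
    """Yield file paths below `top` in sorted name order (assumed order)."""
    with os.scandir(top) as it:
        entries = sorted(it, key=lambda entry: entry.name)
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            # Skipped directories are pruned together with their content.
            if entry.name not in skip:
                yield from _sorted_files(entry.path, skip)
        elif entry.is_file(follow_symlinks=False):
            yield entry.path


def hash_tree(
    top: str, skip_dirs: Iterable[str] = (), chunk_size: int = 1 << 20
) -> Tuple[str, int, int]:
    """Return (sha512 hex digest, file count, total size in bytes)."""
    digest = hashlib.sha512()
    file_count = 0
    total_size = 0
    for path in _sorted_files(top, set(skip_dirs)):
        file_count += 1
        with open(path, "rb") as f:
            # Read in chunks so large files do not have to fit in memory.
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
                total_size += len(chunk)
    return digest.hexdigest(), file_count, total_size
```

Feeding every file into one running hashlib.sha512() object yields a single 128-character hex digest for the whole tree, which matches the length of the value printed above.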

History

  • dev - compare v1.0.0…master

    • TBC

  • 12.10.2019 - compare v0.2.0…v1.0.0

    • refactoring:

      • don’t use persist-queue

      • switch from threading to multiprocessing

      • enhance progress display with multiple tqdm progress bars

  • 15.09.2019 - compare v0.1.0…v0.2.0

    • store persist queue in temp directory

    • Don’t catch process_path_item errors; this should be done in the child class

  • 15.09.2019 - compare v0.0.1…v0.1.0

    • add some project meta files and tests

    • setup CI

    • fix tests

  • 15.09.2019 - v0.0.1

    • first release on PyPI

Donating


Download files

Source Distribution

iterfilesystem-1.0.0.tar.gz (15.6 kB)

Uploaded Source

Built Distributions

iterfilesystem-1.0.0-py3.6.egg (17.7 kB)

Uploaded Source

iterfilesystem-1.0.0-py2.py3-none-any.whl (20.0 kB)

Uploaded Python 2 Python 3

File details

Details for the file iterfilesystem-1.0.0.tar.gz.

File metadata

  • Download URL: iterfilesystem-1.0.0.tar.gz
  • Size: 15.6 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8

File hashes

Hashes for iterfilesystem-1.0.0.tar.gz
Algorithm Hash digest
SHA256 dd4707b99498e8fe04fcccdefa0d4a564dabbcf47320e91b5e1bc8efa2efdfea
MD5 ac896b4ae23b034090773b2976b48fea
BLAKE2b-256 257e0fadeedf117d3be620e3c264baa037f9b590e7a6958c7ccd685c00999632

File details

Details for the file iterfilesystem-1.0.0-py3.6.egg.

File metadata

  • Download URL: iterfilesystem-1.0.0-py3.6.egg
  • Size: 17.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8

File hashes

Hashes for iterfilesystem-1.0.0-py3.6.egg
Algorithm Hash digest
SHA256 7fc5611064d49c761ca92a6ad5b1688ce5d8766c538560021b9ce25c2b614d77
MD5 2fb2fff645677953699cd1bd57985194
BLAKE2b-256 da49ac413f0947aec9324a61aeb762d9517748340c3c410e308bd499f27ab077

File details

Details for the file iterfilesystem-1.0.0-py2.py3-none-any.whl.

File metadata

  • Download URL: iterfilesystem-1.0.0-py2.py3-none-any.whl
  • Size: 20.0 kB
  • Tags: Python 2, Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/2.0.0 pkginfo/1.5.0.1 requests/2.22.0 setuptools/41.2.0 requests-toolbelt/0.9.1 tqdm/4.36.1 CPython/3.6.8

File hashes

Hashes for iterfilesystem-1.0.0-py2.py3-none-any.whl
Algorithm Hash digest
SHA256 2bcd0f8439ece8ac8f740c32e967ba9ee852604c6d880ed6c5f95aaa3ef43d4b
MD5 3d2f6caa8b7ac12497ca3475b84a220f
BLAKE2b-256 17898ce08dfad5a4f05d3bf1c74bc8bd0985ae503d1bc340ce78907faf2e0d07
