from hansel import Crumb to find your file path.

hansel

Flexible parametric file paths to make queries, build folder trees and smart folder structure access.

Usage

Quick Intro

Imagine this folder tree:

data
└── raw
    ├── 0040000
    │   └── session_1
    │       ├── anat_1
    │       └── rest_1
    ├── 0040001
    │   └── session_1
    │       ├── anat_1
    │       └── rest_1
    ├── 0040002
    │   └── session_1
    │       ├── anat_1
    │       └── rest_1
    ├── 0040003
    │   └── session_1
    │       ├── anat_1
    │       └── rest_1
    ├── 0040004
    │   └── session_1
    │       ├── anat_1
    │       └── rest_1
from hansel import Crumb

# create the crumb
crumb = Crumb("{base_dir}/data/raw/{subject_id}/{session_id}/{image_type}")

# set the base_dir path
crumb = crumb.replace('base_dir', '/home/hansel')

assert str(crumb) == "/home/hansel/data/raw/{subject_id}/{session_id}/{image_type}"

# get the ids of the subjects
subj_ids = crumb['subject_id']

# subj_ids is now ['0040000', '0040001', '0040002', '0040003', '0040004', ...]

# get the paths to the subject folders; make_crumbs selects whether the output is strings or crumbs
subj_paths = crumb.ls('subject_id', make_crumbs=True)

# set the image_type
anat_crumb = crumb.replace(image_type='anat_1')

# get the paths to the anat_1 folders
anat_paths = anat_crumb.ls('image')
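
To try the calls above, the example tree has to exist on disk. The following is a minimal sketch that creates it with the standard library only. The helper name make_sample_tree is hypothetical and not part of hansel, and a temporary directory stands in for /home/hansel.

```python
import tempfile
from pathlib import Path


def make_sample_tree(base_dir):
    """Create data/raw/<subject>/session_1/{anat_1,rest_1} under base_dir."""
    subject_ids = ["0040000", "0040001", "0040002", "0040003", "0040004"]
    raw_dir = Path(base_dir) / "data" / "raw"
    for subject_id in subject_ids:
        for image_type in ("anat_1", "rest_1"):
            folder = raw_dir / subject_id / "session_1" / image_type
            folder.mkdir(parents=True, exist_ok=True)
    return raw_dir


# A temporary directory stands in for /home/hansel here.
base_dir = tempfile.mkdtemp()
raw_dir = make_sample_tree(base_dir)
created_subjects = sorted(p.name for p in raw_dir.iterdir())
print(created_subjects)
```

The value of base_dir can then be passed as the base_dir argument of the crumb instead of '/home/hansel'.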

Long Intro

I often find myself working with structured folder paths, such as the one shown above.

I have tried many ways of solving these situations: loops, dictionaries, configuration files, etc. I always end up doing a different thing for the same problem over and over again.

This week I grew tired of it and decided to represent a structured folder tree as a string and access it in the easiest possible way.

If you look at the folder structure above I have:

  • the root directory from where it is hanging: ...data/raw,

  • many identifiers (in this case a subject identification), e.g., 0040000,

  • session identification, session_1 and

  • a data type (in this case an image type), anat_1 and rest_1.

With hansel I can represent this folder structure like this:

from hansel import Crumb

crumb = Crumb("{base_dir}/data/raw/{subject_id}/{session_id}/{image_type}")

Let’s say we have the structure above hanging from a base directory like /home/hansel/.

I can use the replace function to set the base_dir parameter:

crumb = crumb.replace('base_dir', '/home/hansel')

assert str(crumb) == "/home/hansel/data/raw/{subject_id}/{session_id}/{image_type}"
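
The idea behind replace, filling in one argument while leaving the others as placeholders, can be illustrated with the standard library alone. This is only a sketch of the concept, not hansel's implementation. The function name partial_replace is hypothetical.

```python
from string import Formatter


def partial_replace(template, **values):
    """Fill in only the given fields of a str.format-style template."""
    parts = []
    for literal, field, _spec, _conversion in Formatter().parse(template):
        parts.append(literal)
        if field is None:
            continue  # trailing literal text, no placeholder follows it
        if field in values:
            parts.append(str(values[field]))
        else:
            parts.append("{" + field + "}")  # keep unresolved placeholders
    return "".join(parts)


template = "{base_dir}/data/raw/{subject_id}/{session_id}/{image_type}"
resolved = partial_replace(template, base_dir="/home/hansel")
print(resolved)
```

A plain str.format call would raise KeyError for the arguments that are still missing, which is why the sketch walks the parsed fields one by one.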

Now that the root path of my dataset is set, I can start querying my crumb path.

If I want to know the path to the existing subject_ids folders:

subject_paths = crumb.ls('subject_id')

The output of ls can be str, Crumb or pathlib.Path objects. They will be Path objects if there are no crumb arguments left in the crumb path. You can choose between strings and crumbs with the make_crumbs argument:

subject_paths = crumb.ls('subject_id', make_crumbs=True)

If I want to know which subject_ids exist:

subject_ids = crumb.ls('subject_id', fullpath=False)

or

subject_ids = crumb['subject_id']
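
As a rough illustration of what such a query involves, the sketch below lists the values found on disk for one argument by turning the template into a glob pattern. It is not how hansel is implemented. The function name list_values is hypothetical, and the sketch assumes that every placeholder fills a whole path component.

```python
import glob
import os
import re
import tempfile


def list_values(template, arg_name):
    """Return the sorted names found on disk at the position of {arg_name}."""
    marker = "{" + arg_name + "}"
    if marker not in template:
        raise KeyError(arg_name)
    # Keep the template only up to and including the requested placeholder.
    prefix = template[: template.index(marker) + len(marker)]
    # Every placeholder left in the prefix matches any single name.
    pattern = re.sub(r"\{[^{}]+\}", "*", prefix)
    return sorted({os.path.basename(path) for path in glob.glob(pattern)})


# Build a small tree in a temporary directory to query.
base_dir = tempfile.mkdtemp()
for subject_id in ("0040000", "0040001"):
    for image_type in ("anat_1", "rest_1"):
        os.makedirs(
            os.path.join(base_dir, "data", "raw", subject_id, "session_1", image_type)
        )

template = base_dir + "/data/raw/{subject_id}/{session_id}/{image_type}"
subject_ids = list_values(template, "subject_id")
image_types = list_values(template, "image_type")
print(subject_ids)
print(image_types)
```

The set removes duplicates, so image_types holds each folder name once even though it exists under every subject.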

Now, if I wanted to get the path to all the anat_1 images, I could do this:

anat_crumb = crumb.replace(image_type='anat_1')

anat_paths = anat_crumb.ls('image')

or

crumb['image_type'] = 'anat_1'

anat_paths = crumb.ls('image')

More functionalities, ideas and comments are welcome.

Dependencies

Please see the requirements.txt file. Before installing this package, install its dependencies with:

pip install -r requirements.txt

Install

This package uses setuptools. You can install it by running:

python setup.py install

If you already have the dependencies listed in requirements.txt installed, to install in your home directory, use:

python setup.py install --user

To install for all users on Unix/Linux:

python setup.py build
sudo python setup.py install

You can also install it in development mode with:

python setup.py develop

Development

Code

Github

You can check the latest sources with the command:

git clone https://www.github.com/alexsavio/hansel.git

or if you have write privileges:

git clone git@github.com:alexsavio/hansel.git

If you are going to create patches for this project, create a branch for it from the master branch.

We tag stable releases in the repository with the version number.

Testing

We are using py.test to help us with the testing.

If you don’t have pytest installed you can run the tests using:

./runtests.py

Otherwise you can run the tests executing:

python setup.py test

or

py.test
