A tool for running programs on many inputs
Each is a small batch processing utility designed to run some command on each file in a directory and produce some output in another directory, with the ability to resume processing if interrupted. Think of it as a slightly idiosyncratic implementation of the map part of map/reduce, or a more robust version of the following bash script.
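for f in "$source"/* ; do
  DEST="$destination/$(basename "$f")"
  mkdir -p "$DEST"
  # $command is left unquoted on purpose so it splits into a command and its arguments
  $command < "$f" > "$DEST/out" 2> "$DEST/err"
  echo $? > "$DEST/status"
done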
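For readers who prefer Python to bash, the same loop can be sketched roughly as below. This is only an illustration of the behaviour the script describes, not how each is actually implemented; the function name run_on_each and the out/err/status file names are assumptions taken from the bash version above.

```python
import subprocess
from pathlib import Path


def run_on_each(source: Path, destination: Path, command: str) -> None:
    """Run `command` once per file in `source`, mirroring the bash loop.

    Hypothetical helper for illustration; not part of the each package.
    """
    for input_file in sorted(source.iterdir()):
        # Unlike the bash glob, this sketch skips subdirectories.
        if not input_file.is_file():
            continue
        dest = destination / input_file.name
        dest.mkdir(parents=True, exist_ok=True)
        # Feed the file to the command on stdin and capture both output streams.
        with input_file.open("rb") as stdin, \
                (dest / "out").open("wb") as stdout, \
                (dest / "err").open("wb") as stderr:
            result = subprocess.run(
                command, shell=True, stdin=stdin, stdout=stdout, stderr=stderr
            )
        # Record the exit code, as `echo $? > $DEST/status` does.
        (dest / "status").write_text(f"{result.returncode}\n")
```

Like the bash loop, this runs everything serially and starts again from the first file if it is interrupted.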
Usage
Usage is:
each some-input-directory 'some command to run' --destination="output directory"
Run each --help to see the more advanced options.
Frequently Anticipated Questions
Why?
I have a bunch of experiments that are basically “run this long running task on each of these files” with the tasks having varying degrees of flakiness, and I kept finding myself writing bad versions of this, so I thought I would solve the problem once and for all.
Main features over the bash loop version:
You don’t risk learning how to write more bash than you want to.
It resumes from where it left off if you kill it.
It runs tasks in parallel automatically.
You get a cool progress bar.
When I get around to writing better retry features you’ll get those for free.
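To make the resume and parallelism points concrete, here is a minimal sketch of one way they could work. It is an assumption for illustration, not a description of each's internals: it treats the presence of a status file as the marker that a task finished, and it uses a thread pool for parallelism. The names run_one and run_pending and the workers parameter are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_one(input_file: Path, destination: Path, command: str) -> int:
    """Run `command` on one file and record out, err and status."""
    dest = destination / input_file.name
    dest.mkdir(parents=True, exist_ok=True)
    with input_file.open("rb") as stdin, \
            (dest / "out").open("wb") as stdout, \
            (dest / "err").open("wb") as stderr:
        code = subprocess.run(
            command, shell=True, stdin=stdin, stdout=stdout, stderr=stderr
        ).returncode
    # Written last, so its presence marks the task as finished.
    (dest / "status").write_text(f"{code}\n")
    return code


def run_pending(source: Path, destination: Path, command: str,
                workers: int = 4) -> dict:
    """Run `command` on every input that has no status file yet."""
    pending = [
        f for f in sorted(source.iterdir())
        if f.is_file() and not (destination / f.name / "status").exists()
    ]
    # Threads are enough here because the real work happens in subprocesses.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = pool.map(lambda f: run_one(f, destination, command), pending)
        return {f.name: code for f, code in zip(pending, codes)}
```

Calling run_pending a second time on the same directories does nothing for inputs that already have a status file, which is the resume behaviour in miniature. A task killed partway through leaves no status file, so it is simply run again.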
How do I install it?
pip install each
Why isn’t this tested with Hypothesis?
Look, buddy, you’re lucky it’s tested at all.
(It probably will be at some point.)
Should I use this?
Eh, maybe. I’m finding it pretty helpful but it may be very idiosyncratic to my usage.
If you try it and it doesn’t work for you, file an issue or make a PR. I’m happy for it to be generally useful but I don’t plan to sink a huge amount of time into supporting it.
Will you make it work on Python 2?
No.
Will you release it under a more permissive license?
Also no.
I don’t like these answers. What should I use instead?
I dunno. Maybe bashreduce?
Source Distribution
File details
Details for the file each-0.0.5.tar.gz.
File metadata
- Download URL: each-0.0.5.tar.gz
- Upload date:
- Size: 5.1 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: Python-urllib/3.6
File hashes
Algorithm | Hash digest |
---|---|
SHA256 | afb3433b3fe8df7220a7487e70c04c68926b9670d5475cfabd9e339228a0ee6e |
MD5 | 3afdd0a8111ff9a931c93d02dd24a75b |
BLAKE2b-256 | bcc3aa9efcda952b6f39e73d3ba834ef0534e8f058971ef00f9d9be63a1cc0a2 |