A tool for running programs on many inputs

Each is a small batch processing utility designed to run some command on each file in a directory and produce some output in another directory, with the ability to resume processing if interrupted. Think of it as a slightly idiosyncratic implementation of the map part of map/reduce, or a more robust version of the following bash script.

for f in "$source"/* ; do
    DEST="$destination/$(basename "$f")"
    mkdir -p "$DEST"
    # $command is left unquoted on purpose so it splits into words
    $command < "$f" > "$DEST/out" 2> "$DEST/err"
    echo $? > "$DEST/status"
done
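Once a loop like this has run, you usually want to know which inputs failed. Here is a minimal sketch of that in Python, assuming the layout the bash loop writes (one directory per input containing out, err and status). The function name summarize is made up for illustration, and the layout each itself produces may differ.

```python
from pathlib import Path


def summarize(destination):
    """Group task names by the exit code recorded in each `status` file.

    Assumes the layout written by the bash loop:
    <destination>/<input name>/{out,err,status}.
    This is an illustration, not part of each.
    """
    results = {}
    for task_dir in sorted(Path(destination).iterdir()):
        status_file = task_dir / "status"
        text = status_file.read_text().strip() if status_file.is_file() else ""
        # A missing or empty status means the task never ran or was interrupted.
        code = int(text) if text else None
        results.setdefault(code, []).append(task_dir.name)
    return results
```

Anything listed under a non-zero code failed, and anything under None never finished.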

Usage

Usage is:

each some-input-directory 'some command to run' --destination="output directory"

More advanced usage options are available from each --help.

Frequently Anticipated Questions

Why?

I have a bunch of experiments that are basically “run this long-running task on each of these files”, with the tasks having varying degrees of flakiness. I kept finding myself writing bad versions of this, so I thought I would solve the problem once and for all.

Main features over the bash loop version:

  1. You don’t risk learning how to write more bash than you want to.

  2. It resumes from where it left off if you kill it.

  3. It runs tasks in parallel automatically.

  4. You get a cool progress bar.

  5. When I get around to writing better retry features, you’ll get those for free.
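To make the resume and parallelism points concrete, here is a rough sketch of how they could be layered on top of the bash loop. It is not how each is actually implemented. The names run_one and run_all, the workers parameter and the out/err/status layout are all assumptions borrowed from the loop, not taken from each's source.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_one(command, source_file, destination):
    """Run `command` on one input unless an earlier run already finished it."""
    dest = Path(destination) / source_file.name
    status_file = dest / "status"
    if status_file.exists():
        # Resume: status is written last, so its presence means "done".
        return int(status_file.read_text().strip())
    dest.mkdir(parents=True, exist_ok=True)
    with open(source_file, "rb") as stdin, \
            open(dest / "out", "wb") as out, \
            open(dest / "err", "wb") as err:
        code = subprocess.run(
            command, shell=True, stdin=stdin, stdout=out, stderr=err
        ).returncode
    # Write under a temporary name, then rename, so that being killed
    # mid-write never leaves a half-written status file behind.
    tmp = dest / "status.tmp"
    tmp.write_text(f"{code}\n")
    tmp.replace(status_file)
    return code


def run_all(command, source, destination, workers=4):
    """Map `command` over every file in `source`, several at a time."""
    files = sorted(p for p in Path(source).iterdir() if p.is_file())
    # Threads are enough here: the real work happens in child processes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = pool.map(lambda f: run_one(command, f, destination), files)
        return dict(zip((f.name for f in files), codes))
```

Calling run_all a second time with the same arguments skips every input that already has a status file, which is the whole trick behind resuming.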

How do I install it?

pip install each

Why isn’t this tested with Hypothesis?

Look, buddy, you’re lucky it’s tested at all.

(It probably will be at some point.)

Should I use this?

Eh, maybe. I’m finding it pretty helpful but it may be very idiosyncratic to my usage.

If you try it and it doesn’t work for you, file an issue or make a PR. I’m happy for it to be generally useful but I don’t plan to sink a huge amount of time into supporting it.

Will you make it work on Python 2?

No.

Will you release it under a more permissive license?

Also no.

I don’t like these answers. What should I use instead?

I dunno. Maybe bashreduce?

Source Distribution

each-0.0.5.tar.gz (5.1 kB)

Uploaded Source

File details

Details for the file each-0.0.5.tar.gz.

File metadata

  • Download URL: each-0.0.5.tar.gz
  • Upload date:
  • Size: 5.1 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: Python-urllib/3.6

File hashes

Hashes for each-0.0.5.tar.gz
Algorithm Hash digest
SHA256 afb3433b3fe8df7220a7487e70c04c68926b9670d5475cfabd9e339228a0ee6e
MD5 3afdd0a8111ff9a931c93d02dd24a75b
BLAKE2b-256 bcc3aa9efcda952b6f39e73d3ba834ef0534e8f058971ef00f9d9be63a1cc0a2
