WDL launcher for Amazon Omics
miniwdl-omics-run
This command-line tool makes it easier to launch WDL workflow runs on Amazon Omics. It uses miniwdl locally to register WDL workflows with the service, validate command-line inputs, and start a run.
pip3 install miniwdl-omics-run
miniwdl-omics-run \
    --role-arn {SERVICE_ROLE_ARN} \
    --output-uri s3://{BUCKET_NAME}/{PREFIX} \
    {MAIN_WDL_FILE} input1=value1 input2=value2 ...
Quick start
Prerequisites: an up-to-date AWS CLI, installed locally and configured with full AdministratorAccess to your AWS account, and Docker for pushing the test image to ECR.
S3 bucket
Create an S3 bucket with a test input file.
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
AWS_DEFAULT_REGION=$(aws configure get region)
aws s3 mb --region "$AWS_DEFAULT_REGION" "s3://${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-omics"
echo test | aws s3 cp - "s3://${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-omics/test/test.txt"
Service role
Create an IAM service role for your Omics workflow runs to use (to access S3, ECR, etc.).
aws iam create-role --role-name poweromics --assume-role-policy-document '{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": "sts:AssumeRole",
    "Principal": {"Service": "omics.amazonaws.com"}
  }]
}'
aws iam attach-role-policy --role-name poweromics \
    --policy-arn arn:aws:iam::aws:policy/PowerUserAccess
WARNING: PowerUserAccess, suggested here only for brevity, is far more powerful than needed. See Omics docs on service roles for the least privileges necessary, especially if you plan to use third-party WDL and/or Docker images.
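As a rough starting point for a narrower policy, the sketch below writes a policy document scoped to the quick-start bucket, the omics ECR repository, and the Omics workflow log group. The action list is recalled from the Omics service-role documentation and is an assumption to verify against the current docs. The file name omics-role-policy.json, the policy name omics-quickstart, and the placeholder account ID and region are hypothetical.

```shell
# Placeholders so the snippet runs standalone; in the quick start these
# variables are already set from your AWS CLI configuration.
AWS_ACCOUNT_ID="${AWS_ACCOUNT_ID:-123456789012}"
AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION:-us-east-1}"
BUCKET="${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-omics"
LOG_GROUP="arn:aws:logs:${AWS_DEFAULT_REGION}:${AWS_ACCOUNT_ID}:log-group:/aws/omics/WorkflowLog"
ECR_REPO="arn:aws:ecr:${AWS_DEFAULT_REGION}:${AWS_ACCOUNT_ID}:repository/omics"

# Write the policy document (assumed action list; check the Omics docs).
cat > omics-role-policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": ["arn:aws:s3:::${BUCKET}/*"]
    },
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::${BUCKET}"]
    },
    {
      "Effect": "Allow",
      "Action": ["logs:DescribeLogStreams", "logs:CreateLogStream", "logs:PutLogEvents"],
      "Resource": ["${LOG_GROUP}:log-stream:*"]
    },
    {
      "Effect": "Allow",
      "Action": ["logs:CreateLogGroup"],
      "Resource": ["${LOG_GROUP}:*"]
    },
    {
      "Effect": "Allow",
      "Action": ["ecr:BatchGetImage", "ecr:GetDownloadUrlForLayer", "ecr:BatchCheckLayerAvailability"],
      "Resource": ["${ECR_REPO}"]
    }
  ]
}
EOF

# To use it in place of PowerUserAccess (not executed here):
#   aws iam put-role-policy --role-name poweromics \
#       --policy-name omics-quickstart \
#       --policy-document file://omics-role-policy.json
```

If you attach this inline policy, skip the attach-role-policy step above, or detach PowerUserAccess afterwards.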
ECR repository
Create an ECR repository suitable for Omics to pull Docker images from.
aws ecr create-repository --repository-name omics
aws ecr set-repository-policy --repository-name omics --policy-text '{
  "Version": "2012-10-17",
  "Statement": [{
    "Sid": "omics workflow",
    "Effect": "Allow",
    "Principal": {"Service": "omics.amazonaws.com"},
    "Action": [
      "ecr:GetDownloadUrlForLayer",
      "ecr:BatchGetImage",
      "ecr:BatchCheckLayerAvailability"
    ]
  }]
}'
Push a plain Ubuntu image to the repository.
ECR_ENDPT="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com"
aws ecr get-login-password | docker login --username AWS --password-stdin "$ECR_ENDPT"
docker pull ubuntu:22.04
docker tag ubuntu:22.04 "${ECR_ENDPT}/omics:ubuntu-22.04"
docker push "${ECR_ENDPT}/omics:ubuntu-22.04"
Run test workflow
pip3 install miniwdl-omics-run
miniwdl-omics-run \
    --role-arn arn:aws:iam::${AWS_ACCOUNT_ID}:role/poweromics \
    --output-uri "s3://${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-omics/test/out" \
    https://raw.githubusercontent.com/miniwdl-ext/miniwdl-omics-run/main/test/TestFlow.wdl \
    input_txt_file="s3://${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-omics/test/test.txt" \
    docker="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com/omics:ubuntu-22.04"
This zips up the specified WDL, registers it as an Omics workflow, validates the given inputs, and starts the workflow run.
The WDL source code may be set to a local filename or a public HTTP(S) URL. The tool automatically bundles any WDL files imported by the main one. On subsequent invocations, it'll reuse the previously-registered workflow if the source code hasn't changed.
The command-line interface accepts WDL inputs using the input_key=value syntax exactly like miniwdl run, including the option of a JSON file with --input FILE.json. Each input File must be set to an existing S3 URI accessible by the service role.
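For illustration, the test workflow's two inputs could be supplied from a JSON file instead of as key=value pairs. This is a sketch: the file name inputs.json and the placeholder account ID and region are hypothetical, and it assumes the JSON keys may omit the workflow-name prefix, as miniwdl run generally allows.

```shell
# Placeholders so the snippet runs standalone; in the quick start these
# variables are already set from your AWS CLI configuration.
AWS_ACCOUNT_ID="${AWS_ACCOUNT_ID:-123456789012}"
AWS_DEFAULT_REGION="${AWS_DEFAULT_REGION:-us-east-1}"
BUCKET="${AWS_ACCOUNT_ID}-${AWS_DEFAULT_REGION}-omics"
ECR_ENDPT="${AWS_ACCOUNT_ID}.dkr.ecr.${AWS_DEFAULT_REGION}.amazonaws.com"

# Same inputs as the quick-start command, written as a JSON object.
cat > inputs.json <<EOF
{
  "input_txt_file": "s3://${BUCKET}/test/test.txt",
  "docker": "${ECR_ENDPT}/omics:ubuntu-22.04"
}
EOF

# Then start the run with the file (not executed here):
#   miniwdl-omics-run \
#       --role-arn "arn:aws:iam::${AWS_ACCOUNT_ID}:role/poweromics" \
#       --output-uri "s3://${BUCKET}/test/out" \
#       https://raw.githubusercontent.com/miniwdl-ext/miniwdl-omics-run/main/test/TestFlow.wdl \
#       --input inputs.json
```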
Advice
- Omics can use Docker images only from your ECR in the same account & region.
  - This often means pulling, re-tagging, and pushing images as illustrated above with ubuntu:22.04.
  - And editing any WDL tasks that hard-code public registries in their runtime.docker.
- Each ECR repository must have the Omics-specific repository policy set as shown above.
  - We therefore tend to use a single ECR repository for multiple Docker images, disambiguating them using lengthier tags.
  - If you prefer to use per-image repositories, just remember to set the repository policy on each one.
- To quickly list a workflow's inputs, try miniwdl run workflow.wdl ?
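The pull, re-tag, and push routine for a single shared repository can be wrapped in a small helper. This is a sketch, not part of the tool: the function names ecr_tag_for and mirror_to_ecr are hypothetical, and the tag scheme (replacing / and : with -) is just one convention that reproduces the ubuntu-22.04 tag used above. It assumes docker is already logged in to ECR and that ECR_ENDPT is set as in the quick start. Image references pinned by digest (containing @) are not handled.

```shell
# Turn a public image reference into one tag string by replacing the
# characters that are not allowed in a tag ('/' and ':') with '-'.
ecr_tag_for() {
  printf '%s\n' "$1" | sed 's|[/:]|-|g'
}

# Pull a public image, re-tag it into the shared "omics" repository, push it,
# and print the resulting ECR reference for use in WDL runtime.docker.
mirror_to_ecr() {
  local src="$1"
  local dst="${ECR_ENDPT}/omics:$(ecr_tag_for "$src")"
  docker pull "$src" &&
    docker tag "$src" "$dst" &&
    docker push "$dst" &&
    printf '%s\n' "$dst"
}
```

For example, mirror_to_ecr ubuntu:22.04 would push the same omics:ubuntu-22.04 tag as the quick start, and the printed reference can be passed as the docker input.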
Download files
File details
Details for the file miniwdl-omics-run-0.2.0.tar.gz.
File metadata
- Download URL: miniwdl-omics-run-0.2.0.tar.gz
- Upload date:
- Size: 9.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 49727a26403025b0632edb238cdc500cacf593785ee454241e58df6351341809
MD5 | 15f3fd6d946ada36d0344203517c2753
BLAKE2b-256 | ee52b20c1a2df8c02cea434a69eb836474b01a170b16856f487162867ab18678
File details
Details for the file miniwdl_omics_run-0.2.0-py3-none-any.whl.
File metadata
- Download URL: miniwdl_omics_run-0.2.0-py3-none-any.whl
- Upload date:
- Size: 8.5 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.11.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | d9c0657c453d23c1ed38cf2f92606c663d31b44b801e25bceb5dc90c901482ec
MD5 | 4c43263526c151170a88366eb2b5d12c
BLAKE2b-256 | ca4176a27ef3ab7627a2ce81bca97434fd5c81981ade41fd363d13eff70c2219