
pyiron_workflow - Graph-and-node based workflow tools.

Project description

pyiron_workflow


Overview

pyiron_workflow is a framework for constructing workflows as computational graphs from simple python functions. Its objective is to make it as easy as possible to create reliable, reusable, and sharable workflows, with a special focus on research workflows for HPC environments.

Nodes are formed from python functions with simple decorators, and the resulting nodes can have their data inputs and outputs connected.

By allowing (but not demanding, in the case of data DAGs) users to specify the execution flow, both cyclic and acyclic graphs are supported.
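For data DAGs, an execution order can be inferred from the data connections alone, while a cycle forces the user to specify the flow explicitly. A minimal sketch of such inference in plain Python (this is Kahn's algorithm over hypothetical node labels, not the library's actual scheduler):

```python
from collections import deque

def execution_order(edges, nodes):
    """Derive an execution order from data dependencies (Kahn's algorithm).

    edges: list of (upstream, downstream) node-label pairs.
    Raises ValueError if the graph contains a cycle, in which case the
    execution flow would have to be specified explicitly.
    """
    indegree = {n: 0 for n in nodes}
    downstream = {n: [] for n in nodes}
    for upstream, node in edges:
        indegree[node] += 1
        downstream[upstream].append(node)
    ready = deque(n for n, d in indegree.items() if d == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in downstream[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(nodes):
        raise ValueError("cycle detected: specify the execution flow explicitly")
    return order

# A simple chain a -> b -> c resolves automatically:
print(execution_order([("a", "b"), ("b", "c")], ["a", "b", "c"]))
```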

By scraping type hints from decorated functions, both new data values and new graph connections are (optionally) required to conform to hints, making workflows strongly typed.
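The hint-scraping idea can be pictured in plain Python using the standard library (an illustrative sketch only, not pyiron_workflow's actual validation code; `node_inputs_valid` is a hypothetical helper):

```python
from typing import get_type_hints

def node_inputs_valid(fn, **kwargs) -> bool:
    """Check supplied values against the function's scraped type hints."""
    hints = get_type_hints(fn)
    return all(
        isinstance(value, hints[name])
        for name, value in kwargs.items()
        if name in hints
    )

def add_one(x: int) -> int:
    return x + 1

print(node_inputs_valid(add_one, x=3))       # int matches the hint
print(node_inputs_valid(add_one, x="oops"))  # str does not
```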

Individual node computations can be shipped off to parallel processes for scalability. (This is a beta feature at time of writing; the PyMPIExecutor executor from pympipool is supported and tested; automated execution flows do not yet fully leverage the efficiency possible in parallel execution, and pympipool's more powerful flux- and slurm-based executors have not been tested and may fail.)
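The executor pattern follows the standard concurrent.futures interface; in this minimal, hypothetical sketch a ThreadPoolExecutor stands in for pympipool's PyMPIExecutor:

```python
from concurrent.futures import ThreadPoolExecutor

def add_one(x):
    return x + 1

# A node ships its computation to any executor exposing the
# concurrent.futures interface, then waits on the returned future.
with ThreadPoolExecutor(max_workers=2) as executor:
    future = executor.submit(add_one, 41)
    result = future.result()

print(result)  # 42
```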

Once you're happy with a workflow, it can easily be turned into a macro for use in other workflows. This allows the clean construction of increasingly complex computation graphs by composing simpler graphs.

Nodes (including macros) can be stored in plain text as python code, and registered by future workflows for easy access. This encourages and supports an ecosystem of useful nodes, so you don't need to re-invent the wheel. (This is a beta-feature, with full support of FAIR principles for node packages planned.)

Executed or partially-executed graphs can be stored to file, either by explicit call or automatically after running. When creating a new node (or macro or workflow), the working directory is automatically inspected for a save-file, and the node will try to reload itself if one is found. (This is an alpha feature with several caveats: it is currently only possible to save entire graphs at once, not individual nodes within a graph; all the child nodes in a saved graph must have been instantiated by Workflow.create (or equivalent, i.e. their code lives in a .py file that has been registered); and there are no safety rails to protect you from changing the node source code between saving and loading, which may cause errors or inconsistencies depending on the nature of the changes.)
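The save-file behavior can be sketched with the standard library (illustrative only; pyiron_workflow's actual storage format and file names will differ, and `save_state`/`load_or_init` are hypothetical helpers):

```python
import json
import tempfile
from pathlib import Path

def save_state(label: str, state: dict, directory: str) -> None:
    """Write a node's state to a save-file named after its label."""
    (Path(directory) / f"{label}.json").write_text(json.dumps(state))

def load_or_init(label: str, directory: str, default: dict) -> dict:
    """On instantiation, reload from a save-file if one is found."""
    save_file = Path(directory) / f"{label}.json"
    if save_file.exists():
        return json.loads(save_file.read_text())
    return default

with tempfile.TemporaryDirectory() as wd:
    first = load_or_init("my_node", wd, default={"x": 0})    # no save-file yet
    save_state("my_node", {"x": 3}, wd)                      # e.g. after running
    second = load_or_init("my_node", wd, default={"x": 0})   # reloads saved state
```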

Example

pyiron_workflow offers a single-point-of-entry in the form of the Workflow object, and uses decorators to make it easy to turn regular python functions into "nodes" that can be put in a computation graph.

Nodes can be used by themselves and -- other than being "delayed" in that their computation needs to be requested after they're instantiated -- they feel an awful lot like the regular python functions they wrap:

>>> from pyiron_workflow import Workflow
>>>
>>> @Workflow.wrap.as_function_node()
... def add_one(x):
...     return x + 1
>>>
>>> add_one(add_one(add_one(x=0)))()
3
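This delayed behavior can be mimicked in a few lines of plain Python (a conceptual sketch, not the real node classes; `DelayedNode` and this `as_function_node` are hypothetical stand-ins):

```python
class DelayedNode:
    """Wraps a function call so it runs only when explicitly requested."""

    def __init__(self, fn, *args, **kwargs):
        self.fn, self.args, self.kwargs = fn, args, kwargs

    def __call__(self):
        # "Pull" execution: resolve upstream nodes first, then run this one.
        resolve = lambda v: v() if isinstance(v, DelayedNode) else v
        return self.fn(
            *(resolve(a) for a in self.args),
            **{k: resolve(v) for k, v in self.kwargs.items()},
        )

def as_function_node(fn):
    """Decorator turning a function into a factory for delayed nodes."""
    def factory(*args, **kwargs):
        return DelayedNode(fn, *args, **kwargs)
    return factory

@as_function_node
def add_one(x):
    return x + 1

print(add_one(add_one(add_one(x=0)))())  # 3
```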

But the intent is to collect them together into a workflow and leverage existing nodes. Many (but not quite all) python operations can be performed directly on output channels, data graph topology can be built up by simply assigning values (to attributes or at instantiation), and everything can be packaged together into reusable macros with customizable IO interfaces:

>>> from pyiron_workflow import Workflow
>>> Workflow.register("pyiron_workflow.node_library.plotting", "plotting")
>>>
>>> @Workflow.wrap.as_function_node()
... def Arange(n: int):
...     import numpy as np
...     return np.arange(n)
>>>
>>> @Workflow.wrap.as_macro_node("fig")
... def PlotShiftedSquare(self, n: int, shift: int = 0):
...     self.arange = Arange(n)
...     self.plot = self.create.plotting.Scatter(
...         x=self.arange + shift,
...         y=self.arange**2
...     )
...     return self.plot
>>> 
>>> wf = Workflow("plot_with_and_without_shift")
>>> wf.n = wf.create.standard.UserInput()
>>> wf.no_shift = PlotShiftedSquare(shift=0, n=wf.n)
>>> wf.shift = PlotShiftedSquare(shift=2, n=wf.n)
>>> 
>>> diagram = wf.draw()
>>> 
>>> out = wf(shift__shift=3, n__user_input=10)

Which gives the workflow diagram, and the resulting figure (when axes are not cleared).

Installation

conda install -c conda-forge pyiron_workflow

To unlock the associated node packages and ensure that the demo notebooks run, also make sure your conda environment has the packages listed in our notebooks dependencies.

Learning more

Check out the demo notebooks, read through the docstrings, and don't be scared to raise an issue on this GitHub repo!
