pympipool - scale functions over multiple compute nodes using mpi4py
Project description
pympipool
Scale functions over multiple compute nodes using mpi4py
Functionality
Serial subtasks
Write a python test file like pool.py
:
import numpy as np
from pympipool import Pool
def calc(i):
return np.array(i ** 2)
with Pool(cores=2) as p:
print(p.map(fn=calc, iterables=[1, 2, 3, 4]))
You can execute the python file pool.py
in a serial python process:
python pool.py
>>> [array(1), array(4), array(9), array(16)]
Internally pympipool
uses mpi4py
to distribute the four calculation to two processors cores=2
.
MPI parallel subtasks
In addition, the individual python functions can also use multiple MPI ranks. Example ranks.py
:
from pympipool import Pool
def calc(i, comm):
return i, comm.Get_size(), comm.Get_rank()
with Pool(cores=4, cores_per_task=2) as p:
print(p.map(fn=calc, iterables=[1, 2, 3, 4]))
Here the user-defined function calc()
receives an additional input parameter comm
which represents the
MPI communicator. It can be used just like any other mpi4py.COMM
object. Here just the size Get_size()
and the rank Get_rank()
are returned.
Futures Interface
In additions to the map()
function pympipool
also implements the concurrent.futures
interface. As the
tasks are executed in a parallel subprocess using mpi4py, an additional call to the update function update()
is required. Example submit.py
:
import numpy as np
from time import sleep
from pympipool import Pool
def calc(i):
return np.array(i ** 2)
with Pool(cores=2) as p:
futures = [p.submit(calc, i=i) for i in [1, 2, 3, 4]]
print([f.done() for f in futures])
sleep(1)
p.update()
print([f.result() for f in futures if f.done()])
After the submission using the submit function submit()
the futures objects are not completed done()
. Following,
a short call of the sleep function the update function update()
synchronizes the local futures objects. Consequently,
the future objects are afterward completed done()
and the results can be received using the results function
result()
. The code above results in the following output:
python submit.py
>>> [False, False, False, False]
>>> [array(1), array(4), array(9), array(16)]
Installation
As pympipool
requires mpi
and mpi4py
it is highly recommended to install it via conda:
conda install -c conda-forge pympipool
Alternatively, it is also possible to pympipool
via pip
:
pip install pympipool
Changelog
0.4.0
- Update test coverage calculation.
- Add
flux-framework
integration. - Change interface to be compatible to
concurrent.futures.Executor
- not backwards compatible.
0.3.0
- Support subtasks with multiple MPI ranks.
- Close communication socket when closing the
pympipool.Pool
. tqdm
compatibility is updated from4.64.1
to4.65.0
pyzmq
compatibility is updated from25.0.0
to25.0.2
0.2.0
- Communicate via zmq rather than
stdin
andstdout
, this enables support formpich
andopenmpi
. - Add error handling to propagate the
Exception
, when it is raised by mapping the function to the arguments.
0.1.0
- Major switch of the communication interface between the serial python process and the mpi parallel python process.
Previously, functions were converted to source code using
inspect.getsource()
anddill
was used to convert the sourcecode to an binary blob which could then be transferred between the processes. In the new version, the function is directly pickled usingcloudpickle
ascloudpickle
supports both pickle by reference and pickle by value. Here the pickle by value functionality is used to pickle the functions which is be communicated. - The documentation is updated to reflect the changes in the updated version.
0.0.2
- output of the function which is mapped to the arguments is suppressed, as
stdout
interferes with the communication ofpympipool
. Consequently, the output ofprint
statements is no longer visible. - support for python 3.11 is added
mpi4py
compatibility is updated from3.1.3
to3.1.4
dill
compatibility is updated from0.3.5.1
to0.3.6
tqdm
compatibility is updated from4.64.0
to4.64.1
0.0.1
- initial release
Project details
Release history Release notifications | RSS feed
Download files
Download the file for your platform. If you're not sure which to choose, learn more about installing packages.
Source Distribution
Built Distribution
File details
Details for the file pympipool-0.4.1.tar.gz
.
File metadata
- Download URL: pympipool-0.4.1.tar.gz
- Upload date:
- Size: 30.0 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.17
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | d031d0067183b6c297e59180b794581afb226c230d4ce66288150b6c24f0cf09 |
|
MD5 | 2f07daa78bfd43c6ca78f96adcf9570d |
|
BLAKE2b-256 | d7da8e82df23156a4d4ebf6eeb6d39bede4893f1d28c5ce7538f90030d9a6bc4 |
File details
Details for the file pympipool-0.4.1-py3-none-any.whl
.
File metadata
- Download URL: pympipool-0.4.1-py3-none-any.whl
- Upload date:
- Size: 16.6 kB
- Tags: Python 3
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/4.0.2 CPython/3.9.17
File hashes
Algorithm | Hash digest | |
---|---|---|
SHA256 | 5d4022dd068bb10b86fca03253dc18b8a04c79534f821cc93909f55ceab0696e |
|
MD5 | fd3880c2dbb0699654ce1089427846d6 |
|
BLAKE2b-256 | 81a5a5547d079f2d9bcbf1fb1026a7c215120123642ca9db3b741f6bd4168968 |