A robust implementation of concurrent.futures.ProcessPoolExecutor
Reusable Process Pool Executor
Goal
The aim of this project is to provide a robust, cross-platform and cross-version implementation of the ProcessPoolExecutor class of concurrent.futures. It notably features:
- Consistent and robust spawn behavior: all processes are started using fork + exec on POSIX systems. This ensures safer interactions with third-party libraries. In contrast, multiprocessing.Pool uses fork without exec by default, which can cause third-party runtimes to crash (e.g. OpenMP, macOS Accelerate...).
- Reusable executor: a strategy to avoid re-spawning a complete executor every time. A singleton executor instance can be reused (and dynamically resized if necessary) across consecutive calls to limit spawning and shutdown overhead. The worker processes can be shut down automatically after a configurable idling timeout to free system resources.
- Transparent cloudpickle integration: to call interactively defined functions and lambda expressions in parallel. It is also possible to register a custom pickler implementation to handle inter-process communication.
- No need for if __name__ == "__main__": in scripts: thanks to the use of cloudpickle to call functions defined in the __main__ module, it is not required to protect the code calling parallel functions under Windows.
- Deadlock-free implementation: one of the major concerns with the standard multiprocessing and concurrent.futures modules is the ability of the Pool/Executor to handle crashes of worker processes. This library intends to fix those possible deadlocks and send back meaningful errors. Note that the implementation of concurrent.futures.ProcessPoolExecutor that comes with Python 3.7+ is as robust as the executor from loky, but the latter also works for older versions of Python.
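The cloudpickle point above can be motivated with the standard library alone. The sketch below is only an illustration and is not part of loky: the helper name can_pickle is hypothetical. It shows that the standard pickle module serializes functions by reference (module plus name), so a lambda, which has no importable name, is rejected.

```python
import pickle


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj.

    Hypothetical helper for illustration; it is not part of loky.
    """
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        # Functions are pickled by reference (module + qualified name).
        # A lambda has no importable name, so the lookup fails.
        return False
    return True


# Stands in for an interactively defined function.
square = lambda x: x * x

print("builtin len picklable:", can_pickle(len))
print("lambda picklable:", can_pickle(square))
```

A by-value serializer such as cloudpickle is what allows callables like this to be sent to worker processes.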
Installation
The recommended way to install loky is with pip:
    pip install loky
loky can also be installed from source using:
    python setup.py install
Note that loky has an optional dependency on psutil to allow early detection of memory leaks.
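To check from a script whether the optional dependency is available, something like the following may help. It is a minimal sketch using only the standard library; the helper name is_installed is hypothetical, and this is not necessarily how loky itself detects psutil.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


for name in ("loky", "psutil"):
    status = "installed" if is_installed(name) else "not installed"
    print("{}: {}".format(name, status))
```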
Usage
import os
from time import sleep
from loky import get_reusable_executor
def say_hello(k):
    pid = os.getpid()
    print("Hello from {} with arg {}".format(pid, k))
    sleep(.01)
    return pid
# Create an executor with 4 worker processes, that will
# automatically shutdown after idling for 2s
executor = get_reusable_executor(max_workers=4, timeout=2)
res = executor.submit(say_hello, 1)
print("Got results:", res.result())
results = executor.map(say_hello, range(50))
n_workers = len(set(results))
print("Number of used processes:", n_workers)
assert n_workers == 4
For more advanced usage, see our documentation.
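The submit and map calls used above follow the Executor interface of concurrent.futures. As a rough sketch of that interface, the example below uses the standard library's ThreadPoolExecutor purely as a lightweight stand-in (an assumption made so that the snippet runs without spawning processes); the function cube is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def cube(k):
    return k ** 3


# ThreadPoolExecutor is only a stand-in here: it shares the
# submit / map / shutdown interface of process-based executors.
with ThreadPoolExecutor(max_workers=4) as executor:
    # submit() returns a Future; result() blocks until it is done.
    single = executor.submit(cube, 3).result()

    # map() yields results in the order of the inputs.
    ordered = list(executor.map(cube, range(5)))

    # as_completed() yields futures as they finish, in no fixed order,
    # so the results are sorted before being compared or displayed.
    futures = [executor.submit(cube, k) for k in range(5)]
    completed = sorted(f.result() for f in as_completed(futures))

print(single, ordered, completed)
```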
Workflow to contribute
To contribute to loky, first create an account on GitHub. Once this is done, fork the loky repository to have your own copy, and clone it using 'git clone' on the computers where you want to work. Make your changes in your clone, push them to your GitHub account, and test them on several computers. When you are happy with them, send a pull request to the main repository.
Running the test suite
To run the test suite, you need the pytest (version >= 3) and psutil modules. Run the test suite from the root of the project using:
    pip install -e .
    pytest .
Acknowledgement
This work is supported by the Center for Data Science, funded by the IDEX Paris-Saclay, ANR-11-IDEX-0003-02.