Welcome to Software Development on Codidact!
Will you help us build our independent community of developers helping developers? We're small and trying to grow. We welcome questions about all aspects of software development, from design to code to QA and more. Got questions? Got answers? Got code you'd like someone to review? Please join us.
Readable syntax for executing many callables with useful side effects
In Python, multiprocessing is easy to do if you follow a "list projection" paradigm. Say you want to take a list of inputs X
and apply some function f
to every x_i
, such that y_i = f(x_i)
and the y_i
comprise the output list Y
:
y = multiprocessing.Pool().map(f, x)
Is there a good API to "take a list of functions F
, and for each f_i
run f_i()
"?
List comprehension also runs into this issue, and you end up having to do stuff like:
# Bare statement, but still worth executing for the side effects
[f_i() for f_i in F]
Multiprocessing pools look similarly silly:
# Must be defined in outer scope, otherwise you'll get pickling errors
def execute(f):
f()
multiprocessing.Pool().map(execute, F)
While these aren't terribly long or complex, they are somewhat confusing at first glance. I worry that I would need 1-3 lines of comments just to ensure that they'll be readable in a few years when I've forgotten it. Often, whenever you need a lot of comments to explain some code, it's a sign that there's a much better way to do it. Is there such a better way in Python?
Context
I want to have a loop which updates various Git repositories. There are differences in how various repos must be updated. Therefore, I want to have a loop that creates a function (no input, no output) tailored to the corner cases of each repo. Then use multiprocessing to actually execute those. This seems to make the actual business logic much simpler than trying to write a generic function that can handle all corner cases of every type of repo.However, I am interested in the general idea more than the specific case of handling repos.
1 answer
The map
operation is a typical concept from the functional programming paradigm.
However, side-effects are a typical example of something that does not fit functional programming well.
As a result, map
is probably not what you want to use.
As an alternative, you can just use the apply
method of the Pool
class as follows:
import multiprocessing
pool = multiprocessing.Pool
for f_i in f:
pool.apply(f_i)
Given that your context seems to be IO-bound, it might also be useful to consider thread pools instead of process pools. Threads are a bit more lightweight, but due to the GIL they are only useful for tasks where the CPU has to wait for IO anyway.
from concurrent.futures import ThreadPoolExecutor
with ProcessPoolExecutor() as pool:
for f_i in f:
pool.submit(f_i)
FYI: the concurrent.futures
package also has a ProcessPoolExecutor with the same interface.
0 comment threads