Python Docs
Multiprocessing
The multiprocessing module starts separate Python processes to achieve true parallelism for CPU-bound tasks, bypassing the Global Interpreter Lock (GIL). Each process has its own memory space.
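The separate-memory-space point can be demonstrated directly: a global mutated in a child process is unchanged in the parent. A minimal sketch (the `counter` variable and `increment` function are illustrative, not part of the module's API):

```python
from multiprocessing import Process

counter = 0

def increment():
    global counter
    counter += 1  # Mutates the child's own copy of the global.

if __name__ == "__main__":
    p = Process(target=increment)
    p.start()
    p.join()
    # The parent's counter is untouched: each process has its own memory.
    print(counter)
```

Sharing state across that boundary requires explicit mechanisms such as queues or managers, covered below.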
Parallel map with Pool.map()
Use a Pool when you want to apply a function to many items in parallel, like a CPU-heavy transformation.
```python
from multiprocessing import Pool

def f(x):
    return x * x

if __name__ == "__main__":  # Required on Windows
    with Pool(4) as p:
        print(p.map(f, [1, 2, 3, 4]))  # [1, 4, 9, 16]
```

- The `if __name__ == "__main__"` guard is required on Windows (and anywhere the spawn start method is used): child processes re-import the main module, and without the guard each re-import would spawn more children indefinitely.
- `Pool(4)` creates 4 worker processes; use the CPU count as a guideline.
Manual Process and Queue
For more control, you can create individual processes and share work via a queue.
```python
from multiprocessing import Process, Queue

def worker(q):
    while not q.empty():
        item = q.get()
        print("processing", item)

if __name__ == "__main__":
    q = Queue()
    for i in range(5):
        q.put(i)
    ps = [Process(target=worker, args=(q,)) for _ in range(2)]
    for p in ps:
        p.start()
    for p in ps:
        p.join()
```

Note: `q.empty()` is not fully reliable in highly concurrent cases (another process can drain the queue between the `empty()` check and the `get()` call), but it is fine for simple examples and small scripts.
Shared State with Manager
A Manager creates proxy objects (like list/dict) that multiple processes can safely share and modify.
```python
from multiprocessing import Manager, Process

def add(shared_list):
    shared_list.append(1)

if __name__ == "__main__":
    with Manager() as m:
        lst = m.list()
        ps = [Process(target=add, args=(lst,)) for _ in range(3)]
        for p in ps:
            p.start()
        for p in ps:
            p.join()
        print(list(lst))  # [1, 1, 1]
```

- Manager objects are convenient but a bit slower, since every access goes through a server process.
- Use them when simplicity matters more than raw speed.
ProcessPoolExecutor (high-level API)
The concurrent.futures API provides a nicer interface for pools, especially when mixing threads and processes.
```python
from concurrent.futures import ProcessPoolExecutor

def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

if __name__ == "__main__":
    with ProcessPoolExecutor() as ex:
        results = list(ex.map(fib, [28, 29, 30]))
        print(results)
```

`ex.map()` behaves like the built-in `map()`, but runs each call in a separate worker process.
When to Use Multiprocessing vs Threads
- Use multiprocessing for CPU-bound work (heavy math, image processing, ML inference, etc.).
- Use threading / async for I/O-bound work (network calls, disk I/O).
- Don't spawn too many short-lived processes; process startup has overhead.
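The guidance above can be sketched in code: for CPU-bound work, one worker per core is a reasonable default pool size. Here `cpu_heavy` is a hypothetical stand-in for real work, not a library function:

```python
import os
from concurrent.futures import ProcessPoolExecutor

def cpu_heavy(n):
    # Stand-in for a CPU-bound task: a sum of squares.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    # One worker per core is a sensible default for CPU-bound work;
    # more workers than cores just adds scheduling overhead.
    workers = os.cpu_count() or 1
    with ProcessPoolExecutor(max_workers=workers) as ex:
        results = list(ex.map(cpu_heavy, [10_000] * 4))
    print(results)
```

For I/O-bound work the same sizing logic does not apply, since workers spend most of their time waiting rather than computing.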
Best Practices
- Always guard multiprocessing code with `if __name__ == "__main__":` (especially on Windows).
- Only pass picklable objects between processes.
- Use `Pool` / `ProcessPoolExecutor` for simple parallel "map" patterns.
- Prefer threads or async for I/O-bound workloads; processes for CPU-bound ones.