
Multiprocessing

The multiprocessing module starts separate Python processes to achieve true parallelism for CPU-bound tasks, bypassing the Global Interpreter Lock (GIL). Each process has its own memory space.
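
To make "its own memory space" concrete, here is a minimal sketch (the `bump` function and `counter` name are purely illustrative): a global mutated in a child process is invisible to the parent.

```python
from multiprocessing import Process

counter = 0  # global in the parent process

def bump():
    # Runs in a child process: this modifies the child's own copy only.
    global counter
    counter += 1

if __name__ == "__main__":
    p = Process(target=bump)
    p.start()
    p.join()
    print(counter)  # still 0 in the parent: memory is not shared
```

This is why processes must communicate through explicit channels such as queues, pipes, or managers rather than plain globals.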

Parallel map with Pool.map()

Use a Pool when you want to apply a function to many items in parallel, like a CPU-heavy transformation.

from multiprocessing import Pool

def f(x):
    return x * x

if __name__ == "__main__":  # Required under the "spawn" start method (Windows; macOS by default)
    with Pool(4) as p:
        print(p.map(f, [1, 2, 3, 4]))
  • The if __name__ == "__main__" guard is required wherever the spawn start method is used (always on Windows, and by default on macOS since Python 3.8); without it, each worker re-imports the script and spawns workers of its own, endlessly.
  • Pool(4) creates 4 worker processes; with no argument, Pool() defaults to os.cpu_count().

Manual Process and Queue

For more control, you can create individual processes and share work via a queue.

from multiprocessing import Process, Queue
import queue  # for the queue.Empty exception

def worker(q):
    while not q.empty():
        try:
            # A timeout guards against the race where another worker drains
            # the queue between the empty() check and this get().
            item = q.get(timeout=0.1)
        except queue.Empty:
            break
        print("processing", item)

if __name__ == "__main__":
    q = Queue()
    for i in range(5):
        q.put(i)

    ps = [Process(target=worker, args=(q,)) for _ in range(2)]

    for p in ps:
        p.start()
    for p in ps:
        p.join()

Note: q.empty() is not reliable under concurrency (another process can drain the queue between the empty() check and the next get()), so this pattern is fine for small scripts but should be avoided in production code.

Shared State with Manager

A Manager creates proxy objects (like list/dict) that multiple processes can safely share and modify.

from multiprocessing import Manager, Process

def add(shared_list):
    shared_list.append(1)

if __name__ == "__main__":
    with Manager() as m:
        lst = m.list()
        ps = [Process(target=add, args=(lst,)) for _ in range(3)]

        for p in ps:
            p.start()
        for p in ps:
            p.join()

        print(list(lst))  # [1, 1, 1]
  • Manager objects are convenient but a bit slower (data goes through a server process).
  • Use them when simplicity matters more than raw speed.

ProcessPoolExecutor (high-level API)

The concurrent.futures module provides a higher-level pool interface; because ThreadPoolExecutor and ProcessPoolExecutor share the same API, switching between threads and processes is a one-line change.

from concurrent.futures import ProcessPoolExecutor

def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

if __name__ == "__main__":
    with ProcessPoolExecutor() as ex:
        results = list(ex.map(fib, [28, 29, 30]))
        print(results)

ex.map() behaves like the built-in map, but runs each call in a worker process and returns an iterator over the results in input order.

When to Use Multiprocessing vs Threads

  • Use multiprocessing for CPU-bound work (heavy math, image processing, ML inference, etc.).
  • Use threading / async for I/O-bound work (network calls, disk I/O).
  • Don't spawn too many short-lived processes; process startup has overhead.

Best Practices

  • Always guard multiprocessing code with if __name__ == "__main__": (especially on Windows).
  • Only pass picklable objects between processes.
  • Use Pool / ProcessPoolExecutor for simple parallel "map" patterns.
  • Prefer threads or async for I/O-bound workloads; processes for CPU-bound.