python

2020-09-04 06:27

concurrency

core idea is that there are three classes of software

1. single thread single process (single core)
2. multiple threads multiple processes (2-8 cores)
3. distributed processing (9+ cores)

(1) is growing as computers improve. (2) is shrinking and is not future proof for most scalability problems. (2) is for gamers.

Python supports concurrency; you just need to understand how to fit into its concurrency paradigms.

Threads
- pro: shared state
- con: also shared state (race conditions)
Processes
- pro: independence
- con: pickling / interprocess control
Async
- pro: cheap and easy
- con: only for IO-bound tasks
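
The pickling con for processes can be checked directly: data sent between processes is generally serialized with pickle, so anything that cannot be pickled cannot be handed to a worker process. A minimal sketch; `can_pickle` is a hypothetical helper written for these notes, not a standard library function.

```python
import pickle


def can_pickle(obj: object) -> bool:
    """Return True if obj survives pickle.dumps (hypothetical helper)."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        # Different Python versions raise different errors for
        # unpicklable objects, so all three are treated as "no".
        return False
    return True


if __name__ == "__main__":
    print(can_pickle({"a": 1}))       # plain data pickles fine
    print(can_pickle(lambda x: x))    # a lambda cannot be pickled by name
```

Plain data (dicts, lists, numbers, strings) crosses the boundary easily; lambdas, open sockets, and locks do not.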

asyncio

based on an event-loop (twisted, gevent, etc)
intelligently switch execution while the program is awaiting IO
Async switches are cheap because coroutines are built on the same machinery as generators
With explicit keywords (‘yield’, ‘await’) switching is cooperative: a task can only be suspended at those points, so state cannot become inconsistent between them
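
A minimal sketch of cooperative switching with asyncio, assuming `asyncio.sleep` as a stand-in for real IO. The names `fetch` and `main` and the delays are made up for illustration. Each task can only be suspended at its `await`, so the appends to the shared log need no lock.

```python
import asyncio


async def fetch(name: str, delay: float, log: list[str]) -> str:
    log.append(f"{name} start")
    # The only point where this task can be suspended.
    await asyncio.sleep(delay)
    log.append(f"{name} done")
    return name


async def main() -> tuple[list[str], list[str]]:
    log: list[str] = []
    # Both tasks run on one thread; the event loop switches
    # between them while each one is waiting.
    results = await asyncio.gather(
        fetch("slow", 0.2, log),
        fetch("fast", 0.05, log),
    )
    return list(results), log


if __name__ == "__main__":
    results, log = asyncio.run(main())
    print(results)
    print(log)
```

`gather` returns results in argument order, while the log shows the actual interleaving: both tasks start, then the shorter wait finishes first.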

threads

threads share state, but this can be tricky with race conditions
Threads are switched preemptively, with no explicit code needed, so they must always assume they can be interrupted at any point.
This is where the GIL comes in. It is a protection: one lock around the interpreter's internal state.
Would you rather have one simple lock, or many, many individual locks?
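
The GIL does not make your own read-modify-write steps atomic, so shared state still needs its own protection. A minimal sketch of the lock approach; `Counter` and `run_workers` are hypothetical names for illustration.

```python
import threading


class Counter:
    def __init__(self) -> None:
        self.value = 0
        self._lock = threading.Lock()

    def increment(self) -> None:
        # Read-modify-write is not atomic; the lock makes sure only
        # one thread at a time performs it.
        with self._lock:
            self.value += 1


def run_workers(n_threads: int = 4, n_increments: int = 10_000) -> int:
    counter = Counter()

    def work() -> None:
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value


if __name__ == "__main__":
    print(run_workers())
```

With the lock in place the final count is always `n_threads * n_increments`; the cost is that the locked section runs one thread at a time.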

rules:

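pick between locks or queues (queues are preferred: with too many locks the code is effectively serial)
fork before you thread (forking a process that already has running threads copies its locks in whatever state they were in, which can deadlock the child)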
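
A minimal sketch of the queue approach, assuming a simple squaring job as the workload. `worker`, `square_all`, and the `_STOP` sentinel are hypothetical names. Threads never touch shared variables directly; all communication goes through `queue.Queue`, which does its own locking.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    while True:
        item = inbox.get()
        if item is _STOP:
            break
        outbox.put(item * item)


def square_all(numbers: list[int], n_workers: int = 3) -> list[int]:
    inbox: queue.Queue = queue.Queue()
    outbox: queue.Queue = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(inbox, outbox))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for n in numbers:
        inbox.put(n)
    for _ in threads:
        inbox.put(_STOP)  # one sentinel per worker
    for t in threads:
        t.join()
    # Every input produced exactly one output, in arbitrary order.
    return sorted(outbox.get() for _ in numbers)


if __name__ == "__main__":
    print(square_all([1, 2, 3, 4]))
```

Results arrive in whatever order the workers finish, which is why the sketch sorts them before returning.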

processes

not every task is parallelizable
“putting 5 workers on making a baby does not give baby in 1 month”
Amdahl’s law - the theoretical speedup from concurrency is capped by the fraction of the task that must run serially
on the scale of lawn mowing to baby making, how parallelizable is this task?
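
Amdahl's law as a small function, to put numbers on that scale. This is the standard formula, speedup = 1 / ((1 - p) + p / n), where p is the parallelizable fraction and n is the number of workers. The function name and the sample fractions are chosen for illustration.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Theoretical speedup for a task whose parallel_fraction
    can be spread evenly over the given number of workers."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Baby making: nothing parallelizes, so extra workers do not help.
    print(amdahl_speedup(0.0, 5))
    # Lawn mowing: nearly everything parallelizes.
    print(amdahl_speedup(1.0, 5))
    # 90% parallel: even a huge number of workers stays under 10x.
    print(amdahl_speedup(0.9, 1_000_000))
```

The serial fraction sets the ceiling: at 90% parallel the speedup can never reach 10x, no matter how many workers are added.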