Definitions & Concepts
Concurrency vs Parallelism
- Concurrency: dealing with multiple things at once (composition of independently executing processes). Can exist on a single core via interleaving.
- Parallelism: doing multiple things at once (simultaneous execution). Requires multiple cores/processors.
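The distinction can be sketched with stdlib Python (a minimal illustration, not part of the notes' definitions): threads give concurrency, since they make interleaved progress even on one core, and in CPython the GIL means CPU-bound threads never run in parallel anyway; true parallelism would require multiple processes or cores.

```python
import threading

# Two threads incrementing a shared counter: they run *concurrently*
# (interleaved), which works fine on a single core. The lock makes the
# interleaving safe; it does not make the threads parallel.
counter = 0
lock = threading.Lock()

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:
            counter += 1

threads = [threading.Thread(target=worker, args=(10_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 20000: both threads made progress "at once"
```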
Task Parallelism vs Data Parallelism
- Task parallelism: different threads do different things. Explicit threading, locks, communication. Modest parallelism.
- Data parallelism: same operation on many data elements. Implicit synchronization. Massive parallelism. Scales with data size.
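A small stdlib sketch of the two styles (function names are illustrative, not from the notes): task parallelism runs different code in different threads with explicit coordination, while data parallelism maps one operation over a collection.

```python
import threading
from multiprocessing.dummy import Pool  # thread-backed pool from the stdlib

# Task parallelism: different threads do *different* things and must
# coordinate explicitly (here only via join; real code often needs locks).
results = {}

def producer():
    results["data"] = list(range(8))

def logger():
    results["note"] = "run started"

threads = [threading.Thread(target=producer), threading.Thread(target=logger)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Data parallelism: the *same* operation applied to every element.
# Synchronization is implicit in the map, and the work scales with data size.
with Pool(4) as pool:
    squares = pool.map(lambda x: x * x, results["data"])

print(squares)  # [0, 1, 4, 9, 16, 25, 36, 49]
```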
AoS vs SoA
- Array of Structures: [{x,y,z}, {x,y,z}, ...] - logical but bad for SIMD/GPU (scattered access when processing one field)
- Structure of Arrays: {xs: [...], ys: [...], zs: [...]} - better for vectorization and memory coalescing
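The two layouts can be sketched with plain Python containers (a toy model: real SIMD/GPU code would use contiguous typed arrays, but the access pattern is the same):

```python
# AoS: one record per point; reading one field strides across all records.
aos = [{"x": 1.0, "y": 2.0, "z": 3.0},
       {"x": 4.0, "y": 5.0, "z": 6.0}]
xs_from_aos = [p["x"] for p in aos]   # gather: touches every record

# SoA: one array per field; a whole field is already one contiguous run,
# which is what vectorizers and GPU memory coalescing want.
soa = {"xs": [1.0, 4.0], "ys": [2.0, 5.0], "zs": [3.0, 6.0]}
xs_from_soa = soa["xs"]               # single contiguous slice

print(xs_from_aos == xs_from_soa)  # True: same data, different layout
```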
Brent's Theorem
\[
T_p \leq \frac{W}{p} + S
\]
With \(p\) processors, the running time is at most the total work \(W\) divided by \(p\), plus the span \(S\) (the length of the critical path).
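A quick numeric check of the bound (the function name and example numbers are mine):

```python
def brent_bound(work, span, p):
    """Upper bound on parallel time from Brent's theorem: T_p <= W/p + S."""
    return work / p + span

# Example: W = 1000 units of work, span S = 10, p = 8 processors.
print(brent_bound(1000, 10, 8))  # 135.0 = 1000/8 + 10
```

Note how the bound degrades gracefully: with few processors the W/p term dominates, and with many processors the span S becomes the floor.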
Amdahl's Law
If fraction \(f\) of work is sequential, max speedup with \(p\) processors:
\[
\text{Speedup}(p) = \frac{1}{f + \frac{1-f}{p}}
\]
As \(p \to \infty\), max speedup \(= \frac{1}{f}\).
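Plugging in numbers makes the ceiling concrete (helper name and values are illustrative):

```python
def amdahl_speedup(f, p):
    """Amdahl's law: f = sequential fraction, p = processor count."""
    return 1.0 / (f + (1.0 - f) / p)

# With 10% sequential work, even 10 processors give only ~5.3x...
print(amdahl_speedup(0.1, 10))      # ~5.26
# ...and no processor count can beat 1/f = 10x.
print(amdahl_speedup(0.1, 10**9))   # ~10.0
```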
Gustafson-Barsis Law
If you scale the problem size with the number of processors instead of fixing the workload, the scaled speedup is:
\[
\text{Speedup}(p) = p - \alpha (p-1)
\]
where \(\alpha\) is the serial fraction.
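For contrast with the Amdahl numbers above (same illustrative serial fraction, function name is mine):

```python
def gustafson_speedup(p, alpha):
    """Gustafson-Barsis scaled speedup: alpha = serial fraction."""
    return p - alpha * (p - 1)

# alpha = 0.1, p = 10: scaled speedup ~9.1, vs Amdahl's ~5.3 for a
# fixed-size workload with the same serial fraction.
print(gustafson_speedup(10, 0.1))  # ~9.1
```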
- Amdahl asks: "How much faster does the same workload get?"
- Gustafson asks: "How much larger a workload can I solve in roughly the same time?"
- For actual algorithm analysis in this course, the work-span model is usually more useful than Amdahl alone.