Work-Span Model
Definitions
- Work \(W(n)\): Total number of operations (as if run on 1 processor)
- Span \(S(n)\): Length of longest dependency chain (critical path, as if infinite processors)
- Parallelism: \(\dfrac{W(n)}{S(n)}\) = max useful processors
- Efficient: polylogarithmic span and polylogarithmic overhead
- Optimal: polylogarithmic span and constant overhead
- Overhead: \(\dfrac{W_{\text{parallel}}(n)}{W_{\text{sequential}}(n)}\)
Brent's Theorem
\[
T_p \leq \frac{W}{p} + S
\]
With \(p\) processors, time is at most work divided by processors plus span. You can't do better than either \(\frac{W}{p}\) (work-limited) or \(S\) (span-limited).
How to Compute Work and Span of Recursive Algorithms
Recipe:
- Identify the recursive calls and how they divide the problem
- Identify the non-recursive work at each level
- Set up recurrences:
- Work: sum ALL work (recursive + local). E.g., \(W(n) = 2W\!\left(\frac{n}{2}\right) + O(n)\)
- Span: take MAX of parallel branches + sequential work. E.g., \(S(n) = S\!\left(\frac{n}{2}\right) + O(n)\) [if two calls are parallel, take max = one call]
- Solve recurrences (Master theorem or expansion)
Common Recurrence Solutions
| Recurrence | Solution |
|---|---|
| \(T(n) = 2T\!\left(\frac{n}{2}\right) + O(1)\) | \(O(n)\) |
| \(T(n) = 2T\!\left(\frac{n}{2}\right) + O(n)\) | \(O(n \log n)\) |
| \(T(n) = 2T\!\left(\frac{n}{2}\right) + O(\log n)\) | \(O(n)\) |
| \(T(n) = T\!\left(\frac{n}{2}\right) + O(1)\) | \(O(\log n)\) |
| \(T(n) = T\!\left(\frac{n}{2}\right) + O(n)\) | \(O(n)\) |
| \(T(n) = 3T\!\left(\frac{n}{2}\right) + O(n)\) | \(O\!\left(n^{\log_2 3}\right) \approx O\!\left(n^{1.58}\right)\) |
| \(T(n) = 3T\!\left(\frac{n}{3}\right) + O(n)\) | \(O(n \log n)\) |
| \(T(n) = T\!\left(\frac{n}{2}\right) + O\!\left(\sqrt{n}\right)\) | \(O\!\left(\sqrt{n}\right)\) [geometric series] |
Processor Count Questions
- "How many processors for Tina's algorithm to be worthwhile over Charles'?"
- Sequential: \(T_{\text{seq}}(n)\). Parallel with \(p\) processors: \(\frac{W}{p} + S\).
- Need \(\frac{W}{p} + S < T_{\text{seq}}(n)\). Solve for \(p\).
- Usually: \(p > \frac{W(n)}{T_{\text{seq}}(n)}\) to START being worthwhile
- "Maximum useful processors?"
- \(p_{\max} = \frac{W(n)}{S(n)}\) (parallelism). Beyond this, adding processors doesn't help.
Worked Example: 2024 Exam Q4 ("eagle" algorithm)
procedure eagle(ps)
n = length(ps)
if n == 0 { return 0 }
a = eagle(slice(ps, 0, n * 0.5)) -- recursive, size n/2
b = eagle(slice(ps, n * 0.5, n)) -- recursive, size n/2 (parallel with a)
c = 0
for (i = 0; i * i < n; i++) { -- sqrt(n) iterations
c += ps[i * i]
ps[i * i] += b
}
return c
- Span: \(S(n) = S\!\left(\frac{n}{2}\right) + O\!\left(\sqrt{n}\right)\) [two recursive calls in parallel → take max = \(S(n/2)\); loop is \(O(\sqrt{n})\)]
- \(S(n) = O\!\left(\sqrt{n}\right)\) (geometric series dominated by largest term)
- Work: \(W(n) = 2W\!\left(\frac{n}{2}\right) + O\!\left(\sqrt{n}\right)\)
- \(W(n) = O(n)\) (Master theorem: \(a=2,\; b=2,\; f(n)=\sqrt{n},\; n^{\log_b a} = n > \sqrt{n}\))
Worked Example: 2024 Retake Q5 ("erumpent" algorithm)
procedure erumpent(ps)
n = length(ps)
if n == 0 { return 0 }; if n == 1 { return ps[0] }
qs = map (\p -> p * n + 1) ps -- O(n) work, O(1) span
a = erumpent(slice(qs, 0, n * 0.5)) -- size n/2
b = erumpent(slice(qs, n*0.25, n*0.75)) -- size n/2
c = erumpent(slice(qs, n*0.5, n*1)) -- size n/2, all three parallel
return a + b + c
- Span: \(S(n) = S\!\left(\frac{n}{2}\right) + O(1)\) [three parallel calls → max = \(S(n/2)\); map is \(O(1)\) span]
- \(S(n) = O(\log n)\)
- Work: \(W(n) = 3W\!\left(\frac{n}{2}\right) + O(n)\)
- \(W(n) = O\!\left(n^{\log_2 3}\right) \approx O\!\left(n^{1.58}\right)\) (Master theorem: \(a=3,\; b=2,\; n^{\log_2 3} > n\))