Loop Optimization
Loop optimization is the family of compiler transformations that restructure loops to reduce overhead, improve locality, or expose parallelism. Covered in ECE459 L18.
Why focus on loops?
Programs spend most of their time looping, especially loops with high iteration counts. Per-iteration wins multiply. Profiling tells you which loops matter.
Transformations
- Induction variable elimination: an induction variable varies per iteration. Remove redundant ones that are computable from a primary induction variable
- Scalar replacement: replace repeated
a[i]withtemp = a[i]and usetemp. Needs proof thata[i]doesn’t change between reads - Bounds check elimination: sane languages check array bounds. If the loop never exceeds them, drop the checks. Idiomatic Rust uses
IntoIteratorand avoids the question - Loop unrolling:
for i in &[1,2,3,4] { f(*i); }becomesf(0); f(1); f(2); f(3);. Less branching, exposes parallelism. Synergizes with software pipelining (overlap iterations) and SIMD - Loop interchange: swap nesting to match memory layout. Rust is row-major (
a[1][1]next toa[1][2]). Iterate with the fast-changing index innermost so each step reads contiguous memory. OpenGL is column-major - Loop fusion: two loops over
0..100merge into one. Saves loop overhead and reuses cache lines - Loop fission: inverse of fusion, sometimes better because loop bodies become simpler or more vectorizable
- Loop-invariant code motion (hoisting): pull
s = x * yout of the loop if neitherxnorychange
Counter-intuitive trick from L18
For a large enough matrix, iterating “wrong-way” (column-major over row-major storage) can be faster. You trigger all page faults up front, populating the cache, then proceed unimpeded. Used in some matrix-multiply implementations.
Compilers do many of these automatically when they can prove safety. Aliasing ambiguity is the usual blocker. Rust’s &mut uniqueness helps, though LLVM may not fully exploit it.