## Register allocation Advanced Compiler Construction Michel Schinz – 2025-04-03 #### Register allocation #### Register allocation consists in: - rewriting a program that makes use of an unbounded number of virtual or pseudo-registers, - into one that only uses physical (machine) registers. Some virtual registers might have to be **spilled** to memory. #### Register allocation is done: - very late in the compilation process typically only instruction scheduling comes later, - on an IR very close to machine code. #### Setting the scene We will do register allocation on an RTL with: - n machine registers $R_0, \ldots, R_{n-1}$ (some with non-numerical indexes like the link register $R_{LK}$ ), - unbounded number of virtual registers v<sub>0</sub>, v<sub>1</sub>, ... Of course, virtual registers are only available before register allocation. #### Running example Euclid's algorithm to compute greatest common divisor. #### Calling conventions: - the arguments are passed in $R_1$ , $R_2$ , ... - the return address is passed in R<sub>LK</sub>, - the return value is passed in $R_1$ . #### In RTL ``` gcd: R_3 \leftarrow done if R_2 = 0 goto R_3 R_3 \leftarrow R_2 R_2 \leftarrow R_1 \% R_2 R_1 \leftarrow R_3 R_3 \leftarrow gcd ``` #### Register allocation example #### Before register allocation ``` gcd: V_0 \leftarrow R_{LK} V_1 \leftarrow R_1 V_2 \leftarrow R_2 loop: V_3 \leftarrow done if \ V_2 = 0 \ goto \ V_3 V_4 \leftarrow V_2 V_2 \leftarrow V_1 \% \ V_2 V_1 \leftarrow V_4 V_5 \leftarrow loop goto \ V_5 done: R_1 \leftarrow V_1 goto \ V_0 ``` R<sub>1</sub>, R<sub>2</sub>: parameters R<sub>LK</sub>: return address allocable registers: R<sub>1</sub>, R<sub>2</sub>, R<sub>3</sub>, R<sub>LK</sub> #### After register allocation ``` gcd: loop: R_3 \leftarrow done if R_2 = 0 goto R_3 R_3 \leftarrow R_2 R_2 \leftarrow R_1 \% R_2 R_1 \leftarrow R_3 R_3 \leftarrow loop goto R_3 done: goto R_{LK} ``` #### Allocation: $$V_0 \rightarrow R_{LK}$$ $V_1 \rightarrow R_1$ $V_2 \rightarrow R_2$ $V_3, V_4, V_5 \rightarrow R_3$ ### Techniques We will study two commonly used techniques: - 1. register allocation by graph coloring, which: - produces good results, - is relatively slow, - is therefore used mostly in batch compilers, - 2. linear scan register allocation, which: - produces average results, - is very fast, - is therefore used mostly in JIT compilers. Both are **global**: they allocate registers for a whole function at a time. # Technique #1: graph coloring #### Allocation by graph coloring #### Register allocation can be reduced to graph coloring: - 1. build the interference graph, which has: - one node per register real or virtual, - one edge between each pair of nodes whose registers are live at the same time. - 2. color the interference graph with at most K colors (K = number of available registers), so that all nodes have a different color than all their neighbors. #### Problems: - coloring is NP-complete for arbitrary graphs, - a K-coloring might not even exist. ### Interference graph example #### **Program** ``` gcd: V_0 \leftarrow R_{LK} V_1 \leftarrow R_1 V_2 \leftarrow R_2 loop: v₃ ← done if v_2=0 goto v_3 V_4 \leftarrow V_2 V_2 \leftarrow V_1 \% V_2 V_1 \leftarrow V_4 V_5 \leftarrow loop goto V<sub>5</sub> done: R_1 \leftarrow V_1 goto v₀ ``` ### Liveness {in}{out} ``` {R_1,R_2,R_{LK}}{R_1,R_2,V_0} {R_1,R_2,V_0}{R_2,V_0,V_1} \{R_2, V_0, V_1\}\{V_0-V_2\} \{v_0-v_2\}\{v_0-v_3\} \{v_0-v_3\}\{v_0-v_2\} \{V_0-V_2\}\{V_0-V_2,V_4\} \{V_0-V_2,V_4\}\{V_0-V_2,V_4\} \{v_0-v_2,v_4\}\{v_0-v_2\} \{v_0-v_2\}\{v_0-v_2,v_5\} \{v_0-v_2,v_5\}\{v_0-v_2\} \{v_0, v_1\}\{R_1, v_0\} {R_1, v_0}{R_1} ``` ### Coloring example #### Original prog. ``` gcd: V_0 \leftarrow R_{LK} V_1 \leftarrow R_1 V_2 \leftarrow R_2 loop: v₃ ← done if v_2=0 goto v_3 V_4 \leftarrow V_2 V_2 \leftarrow V_1 \% V_2 V_1 \leftarrow V_4 V_5 \leftarrow loop goto V<sub>5</sub> done: R_1 \leftarrow V_1 goto v₀ ``` #### Rewritten gcd: prog. $R_{LK} \leftarrow R_{LK}$ $R_1 \leftarrow R_1$ $R_2 \leftarrow R_2$ loop: $R_3 \leftarrow done$ if $R_2=0$ goto $R_3$ $R_3 \leftarrow R_2$ $R_2 \leftarrow R_1 \% R_2$ $R_1 \leftarrow R_3$ $R_3 \leftarrow loop$ goto R<sub>3</sub> done: $R_1 \leftarrow R_1$ goto R<sub>LK</sub> ### Coloring example #### Original prog. ``` gcd: V_0 \leftarrow R_{LK} V_1 \leftarrow R_1 V_2 \leftarrow R_2 loop: v<sub>3</sub> ← done if v_2=0 goto v_3 V_4 \leftarrow V_2 V_2 \leftarrow V_1 \% V_2 V_1 \leftarrow V_4 V_5 \leftarrow loop goto V<sub>5</sub> done: R_1 \leftarrow V_1 goto v₀ ``` #### Rewritten prog. gcd: $R_{LK} \leftarrow R_{LK}$ $-R_1 \leftarrow R_1$ $R_2 \leftarrow R_2$ loop: R<sub>3</sub> ← done if $R_2=0$ goto $R_3$ $R_3 \leftarrow R_2$ $R_2 \leftarrow R_1 \% R_2$ $R_1 \leftarrow R_3$ $R_3 \leftarrow loop$ goto R<sub>3</sub> done: $-R_1 \leftarrow R_1$ goto R<sub>LK</sub> ### Coloring example (2) #### Original prog. ``` gcd: V_0 \leftarrow R_{LK} V_1 \leftarrow R_1 V_2 \leftarrow R_2 loop: v₃ ← done if v_2=0 goto v_3 V_4 \leftarrow V_2 V_2 \leftarrow V_1 \% V_2 V_1 \leftarrow V_4 V_5 \leftarrow loop goto V<sub>5</sub> done: R_1 \leftarrow V_1 goto v₀ ``` #### Rewritten gcd: prog. $R_3 \leftarrow R_{LK}$ $R_{LK} \leftarrow R_1$ $R_1 \leftarrow R_2$ loop: $R_2 \leftarrow done$ if $R_1=0$ goto $R_2$ $R_2 \leftarrow R_1$ $R_1 \leftarrow R_{LK} \% R_1$ $R_{LK} \leftarrow R_2$ $R_2 \leftarrow loop$ goto R<sub>2</sub> done: $R_1 \leftarrow R_{LK}$ goto R<sub>3</sub> This second coloring is also correct, but produces worse code! **Coloring by simplification** is a heuristic technique to color a graph with K colors: - 1. find a node n with less than K neighbors, - 2. remove it from the graph, - 3. recursively color the simplified graph, - 4. color n with any color not used by its neighbors. What if there is no node with less than K neighbors? - a K-coloring might not exist, - but simplification is attempted nevertheless. Number of available colors (K): 3 Stack of removed nodes: Number of available colors (K): 3 Stack of removed nodes: 5 Number of available colors (K): 3 Stack of removed nodes: 5 2 Number of available colors (K): 3 Stack of removed nodes: 5 2 1 Number of available colors (K): 3 Stack of removed nodes: 5 2 1 3 Number of available colors (K): 3 Stack of removed nodes: 5 2 1 3 Number of available colors (K): 3 Stack of removed nodes: 5 2 1 Number of available colors (K): 3 Stack of removed nodes: 5 2 Number of available colors (K): 3 Stack of removed nodes: 5 Number of available colors (K): 3 Stack of removed nodes: # Spilling ### (Optimistic) spilling What if all nodes have K or more neighbors during simplification? A node n must be chosen to be **spilled** and its value stored in memory instead of in a register: - remove its node from the graph (assuming no interference between spilled value and other values), - recursively color the simplified graph as usual. Once recursive coloring is done, two cases: - 1. by chance, the neighbors of n do not use all the possible colors, n is not spilled, - 2. otherwise, n is really spilled. ### Spill costs Which node should be spilled? Ideally one: - whose value is not frequently used, and/or - that interferes with many other nodes. For that, compute the spill cost of a node n as: ``` cost(n) = (rw_0(n) + 10 rw_1(n) + ... + 10^k rw_k(n)) / degree(n) ``` #### where: - $rw_i(n)$ is the number of times the value of n is read or written in a loop of depth i, - degree(n) is the number of edges adjacent to n in the interference graph. Then spill the node with lowest cost. #### Spilling of pre-colored nodes The interference graph contains nodes corresponding to the physical registers of the machine: - they are said to be **pre-colored**, as their color is given by the machine register they represent, - they should never be simplified, as they cannot be spilled (they are physical registers!). #### Spilling example: costs ``` gcd: V_0 \leftarrow R_{LK} V_1 \leftarrow R_1 V_2 \leftarrow R_2 loop: v₃ ← done if v_2=0 goto v_3 V_4 \leftarrow V_2 V_2 \leftarrow V_1 \% V_2 V_1 \leftarrow V_4 V_5 \leftarrow loop goto V<sub>5</sub> done: R_1 \leftarrow V_1 goto v₀ ``` | node | rw <sub>0</sub> | rw <sub>1</sub> | deg. | cost | |----------------|-----------------|-----------------|------|------| | V <sub>0</sub> | 2 | 0 | 7 | 0.29 | | V <sub>1</sub> | 2 | 2 | 6 | 3.67 | | V <sub>2</sub> | 1 | 4 | 6 | 6.83 | | <b>V</b> 3 | 0 | 2 | 3 | 6.67 | | V <sub>4</sub> | 0 | 2 | 3 | 6.67 | | <b>V</b> 5 | 0 | 2 | 3 | 6.67 | $$cost = (rw_0 + 10 rw_1) / degree$$ ### Spilling example ### Consequences of spilling After spilling, rewrite the program to: - insert code just before the spilled value is read, to fetch it from memory, - insert code just after the spilled value is written, to write it back to memory. But: spilling code introduces new virtual registers, so register allocation must be redone! In practice, 1-2 iterations are enough in almost all cases. ### Spilling code integration #### Original program ``` gcd: V_0 \leftarrow R_{LK} V_1 \leftarrow R_1 V_2 \leftarrow R_2 loop: v₃ ← done if v_2 = 0 goto v_3 V_4 \leftarrow V_2 V_2 \leftarrow V_1 \% V_2 V_1 \leftarrow V_4 V_5 \leftarrow loop goto V<sub>5</sub> done: R_1 \leftarrow V_1 goto v₀ ``` #### Rewritten program ``` gcd: V_6 \leftarrow R_{LK} push v<sub>6</sub> V_1 \leftarrow R_1 V_2 \leftarrow R_2 loop: v<sub>3</sub> ← done if v_2 = 0 goto v_3 V_4 \leftarrow V_2 V_2 \leftarrow V_1 \% V_2 V_1 \leftarrow V_4 V_5 \leftarrow loop goto V<sub>5</sub> done: R_1 \leftarrow V_1 pop v<sub>7</sub> goto v<sub>7</sub> ``` #### New interference graph #### Final program ``` gcd: R_{LK} \leftarrow R_{LK} push R<sub>LK</sub> R_1 \leftarrow R_1 R_2 \leftarrow R_2 loop: R<sub>LK</sub> ← done if R_2 = 0 goto R_{LK} R_{LK} \leftarrow R_2 R_2 \leftarrow R_1 \% R_2 R_1 \leftarrow R_{LK} R_{LK} \leftarrow loop goto R<sub>LK</sub> done: R_1 \leftarrow R_1 goto R<sub>2</sub> ``` ### New interference graph #### Final program ``` gcd: -R_{LK} \leftarrow R_{LK} push R<sub>LK</sub> -R_1 \leftarrow R_1 -R_2 \leftarrow R_2 loop: R<sub>LK</sub> ← done if R_2 = 0 goto R_{LK} R_{LK} \leftarrow R_2 R_2 \leftarrow R_1 \% R_2 R_1 \leftarrow R_{LK} R_{LK} \leftarrow loop goto R<sub>LK</sub> done: R_1 \leftarrow R_1 goto R<sub>2</sub> ``` # Coalescing ### Coloring quality Two valid K-colorings of an interference graph are not necessarily equivalent: one can lead to a much shorter program than the other. Why? Because "move" instruction of the form $V_1 \leftarrow V_2$ can be removed if $v_1$ and $v_2$ end up being allocated to the same register (also holds when $v_1$ or $v_2$ is a real register). Goal: make this happen as often as possible. #### Coalescing If $v_1$ and $v_2$ do not interfere, a move instruction of the form $V_1 \leftarrow V_2$ can always be removed by replacing $v_1$ and $v_2$ by a new virtual register $v_{1\&2}$ . This is called **coalescing**, as the nodes of $v_1$ and $v_2$ in the interference graph coalesce into a single node. ## Coalescing issue Coalescing is not always a good idea! Might turn a graph that is K-colorable into one that isn't, which implies spilling. Therefore: use conservative heuristics. #### Coalescing heuristics **Briggs**: coalesce nodes $n_1$ and $n_2$ to $n_{1\&2}$ iff: $n_{1\&2}$ has less than K neighbors of significant degree (i.e. of a degree greater or equal to K), **George**: coalesce nodes $n_1$ and $n_2$ to $n_{1\&2}$ iff all neighbors of $n_1$ either: - already interfere with n<sub>2</sub>, or - are of insignificant degree. Both heuristics are: - safe: won't make a K-colorable graph uncolorable, - conservative: might prevent a safe coalescing. #### Heuristic #1: Briggs Briggs: coalesce nodes n<sub>1</sub> and n<sub>2</sub> to n<sub>1&2</sub> iff: $n_{1\&2}$ has less than K neighbors of significant degree (i.e. of a degree $\geq$ K), Rationale: - during simplification, all the neighbors of $n_{1\&2}$ that are of insignificant degree will be simplified; - once they are, $n_{1\&2}$ will have less than K neighbors and will therefore be simplifiable too. #### Heuristic #2: George George: coalesce nodes $n_1$ and $n_2$ to $n_{1\&2}$ iff all neighbors of $n_1$ either: - already interfere with n<sub>2</sub>, or - are of insignificant degree. #### Rationale: - the neighbors of n<sub>1&2</sub> will be: - 1. those of $n_2$ , and - 2. the neighbors of n<sub>1</sub> of insignificant degree, - the latter ones will all be simplified, - once they are, the graph will be a sub-graph of the original one. #### Coalescing example noninterfering, move-related nodes coalescing of $R_1$ and $v_1$ into $R_{1v}$ safe according to Briggs and George with K = 4 node of significant degree node of insignificant degree ## Coalescing example (2) coalescing of $R_2$ and $v_2$ into R<sub>2</sub>V safe according to Briggs and George with K = 4 ## Coalescing example (3) coalescing of $R_{LK}$ and $v_0$ into $R_{LKv}$ safe according to Briggs and George with K = 4 ## Coalescing example (3) coalescing of $R_{LK}$ and $v_0$ into $R_{LKv}$ safe according to Briggs and George with K = 4 # Putting it all together #### Iterated register coalescing Simplification and coalescing should be interleaved to get **iterated register** coalescing: - 1. Interference graph nodes are partitioned in two classes: move-related or not. - 2. Simplification is done on *not* move-related nodes (as move-related ones could be coalesced). - 3. Conservative coalescing is performed. - 4. When neither simplification nor coalescing can proceed further, some move-related nodes are **frozen** (marked as non-move-related). - 5. The process is restarted at 2. #### Iterated register coalescing ## Assignment constraints #### Assignment constraints Current assumption: a virtual register can be assigned to any free physical register. Not always true because of assignment constraints due to: - registers classes (e.g. integer vs. floating-point registers), - instructions with arguments or result in specific registers, - calling conventions. A realistic register allocator has to be able to satisfy these constraints. #### Register classes Most architectures have several register classes: - integer vs floating-point, - address vs data, - etc. To take them into account in a coloring-based allocator: introduce artificial interferences between a node and all pre-colored nodes corresponding to registers to which it *cannot* be allocated. #### Calling conventions How to deal with the fact that calling conventions pass arguments in specific registers? ``` At function entry, copy arguments to new virtual regs: ``` ``` fact: v_1 \leftarrow R_1 ; copy first argument to v_1 Before a call, load arguments in appropriate registers: ``` ``` R_1 \leftarrow v_2; load first argument from v_2 ``` CALL fact Whenever possible, these instructions will be removed by coalescing. #### Caller/callee-saved registers #### Calling conventions distinguish two kinds of registers: - caller-saved: saved by the caller before a call and restored after it, - **callee-saved**: saved by the callee at function entry and restored before function exit. #### Ideally: - virtual registers having to survive at least one call should be assigned to callee-saved registers, - other virtual registers should be assigned to caller-saved registers. How can this be obtained in a coloring-based allocator? #### Caller/callee-saved registers Caller-saved registers do not survive a function call. #### To model this: Add interference edges between all virtual registers live across at least one call and (physical) caller-saved registers. #### Consequence: Virtual registers live across at least one call won't be assigned to caller-saved registers. #### Therefore: They will either be allocated to callee-saved registers, or spilled! #### Saving callee-saved registers Callee-saved registers must be preserved by all functions, so: - copy them to fresh temporary registers at function entry, - restore them before exit. #### Saving callee-saved registers For example, if R<sub>8</sub> is callee-saved: ``` entry: v_1 \leftarrow R_8 \quad \text{; save callee-saved } R_8 \quad \text{in } v_1 \\ \dots \quad \text{; function body} \\ R_8 \leftarrow v_1 \quad \text{; restore callee-saved } R_8 \\ \text{goto } R_{LK} ``` If register pressure is low: - $R_8$ and $v_1$ will be coalesced, and - the two move instructions will be removed. If register pressure is high: - $v_1$ will be spilled, making $R_8$ available in the function (e.g. to store a virtual register live across a call). ## Technique #2: linear scan #### Linear scan The basic linear scan technique is very simple: - the program is linearized i.e. represented as a linear sequence of instructions, not as a graph, - a unique live range is computed for every variable, going from the first to the last instruction during which it is live, - registers are allocated by iterating over the intervals sorted by increasing starting point: each time an interval starts, the next free register is allocated to it, and each time an interval ends, its register is freed, - if no register is available, the active range ending last is chosen to have its variable spilled. #### Linear scan example Linearized version of GCD computation: #### Program ``` 1 gcd: V_0 \leftarrow R_{LK} 2 V_1 \leftarrow R_1 3 V_2 \leftarrow R_2 4 loop: V_3 \leftarrow done 5 if V_2 = 0 goto V_3 6 V_4 \leftarrow V_2 7 V_2 \leftarrow V_1 \% V_2 8 V_1 \leftarrow V_4 9 V_5 \leftarrow loop 10 goto V_5 11 done: R_1 \leftarrow V_1 12 goto V_0 ``` #### Live ranges ``` v<sub>0</sub>: [1+,12-] v<sub>1</sub>: [2+,11-] v<sub>2</sub>: [3+,10+] v<sub>3</sub>: [4+,5-] v<sub>4</sub>: [6+,8-] v<sub>5</sub>: [9+,10-] Notation: i+ entry of instr. i i- exit of instr. i ``` time active intervals allocation | time active intervals | allocation | |-----------------------|-----------------------| | 1+ [1+,12-] | $V_0 \rightarrow R_3$ | | time active intervals | allocation | |-----------------------|--------------------------------------------| | 1+ [1+,12-] | $V_0 \rightarrow R_3$ | | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1$ | | | time active intervals | allocation | |---|-------------------------------|-----------------------------------------------------------------| | _ | 1+ [1+,12-] | $V_0 \rightarrow R_3$ | | _ | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1$ | | | 3+ [3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2$ | | time active intervals | allocation | |---------------------------------------|-----------------------------------------------------------------------------------------| | 1+ [1+,12-] | $V_0 \rightarrow R_3$ | | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1$ | | 3+ [3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2$ | | 4+ [4+,5-],[3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2, V_3 \rightarrow R_{LK}$ | | time active intervals | allocation | |---------------------------------------|-----------------------------------------------------------------------------------------| | 1+ [1+,12-] | $V_0 \rightarrow R_3$ | | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1$ | | 3+ [3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2$ | | 4+ [4+,5-],[3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2, V_3 \rightarrow R_{LK}$ | | 6+ [6+,8-],[3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2, V_4 \rightarrow R_{LK}$ | | time active intervals | allocation | |----------------------------------------|-----------------------------------------------------------------------------------------| | 1+ [1+,12-] | $V_0 \rightarrow R_3$ | | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1$ | | 3+ [3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2$ | | 4+ [4+,5-],[3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2, V_3 \rightarrow R_{LK}$ | | 6+ [6+,8-],[3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2, V_4 \rightarrow R_{LK}$ | | 9+ [9+,10-],[3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_3, V_1 \rightarrow R_1, V_2 \rightarrow R_2, V_5 \rightarrow R_{LK}$ | Result: no spilling time active intervals allocation | time active intervals | allocation | |-----------------------|--------------------------| | 1+ [1+,12-] | $V_0 \rightarrow R_{LK}$ | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |-----------------------------------------------------------------------------------------------------------------------------------------|---|---|---|---|---|---|---|---|---|----|----|----| | V <sub>0</sub> | | | | | | | | | | | | | | V <sub>0</sub> V <sub>1</sub> V <sub>2</sub> V <sub>3</sub> V <sub>4</sub> V <sub>5</sub> R <sub>1</sub> R <sub>2</sub> R <sub>LK</sub> | | | | | | | | | | | | | | V <sub>2</sub> | | | | | | | | | | | | | | V <sub>3</sub> | | | | | | | | | | | | | | V4 | | | | | | | | | | | | | | <b>V</b> 5 | | | | | | | | | | | | | | R <sub>1</sub> | | | | | | | | | | | | | | R <sub>2</sub> | | | | | | | | | | | | | | $R_{LK}$ | | | | | | | | | | | | | | time active intervals | allocation | |-----------------------|-----------------------------------------------| | 1+ [1+,12-] | V <sub>0</sub> →R <sub>LK</sub> | | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_{LK}, V_1 \rightarrow R_1$ | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |-----------------------------------------------------------------------------------------------------------------------------------------|---|---|---|---|---|---|---|---|---|----|----|----| | V <sub>0</sub> | | | | | | | | | | | | | | V <sub>0</sub> V <sub>1</sub> V <sub>2</sub> V <sub>3</sub> V <sub>4</sub> V <sub>5</sub> R <sub>1</sub> R <sub>2</sub> R <sub>LK</sub> | | | | | | | | | | | | | | V <sub>2</sub> | | | | | | | | | | | | | | V <sub>3</sub> | | | | | | | | | | | | | | V4 | | | | | | | | | | | | | | <b>V</b> 5 | | | | | | | | | | | | | | R <sub>1</sub> | | | | | | | | | | | | | | R <sub>2</sub> | | | | | | | | | | | | | | $R_{LK}$ | | | | | | | | | | | | | | time active intervals | allocation | |-------------------------------|--------------------------------------------------------------------| | 1+ [1+,12-] | $V_{0}\rightarrow R_{LK}$ | | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_{LK}, V_1 \rightarrow R_1$ | | 3+ [3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_{LK}, V_1 \rightarrow R_1, V_2 \rightarrow R_2$ | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |-----------------------------------------------------------------------------------------------------------------------------------------|---|---|---|---|---|---|---|---|---|----|----|----| | V <sub>0</sub> | | | | | | | | | | | | | | V <sub>0</sub> V <sub>1</sub> V <sub>2</sub> V <sub>3</sub> V <sub>4</sub> V <sub>5</sub> R <sub>1</sub> R <sub>2</sub> R <sub>LK</sub> | | | | | | | | | | | | | | V <sub>2</sub> | | | | | | | | | | | | | | V <sub>3</sub> | | | | | | | | | | | | | | V4 | | | | | | | | | | | | | | <b>V</b> 5 | | | | | | | | | | | | | | R <sub>1</sub> | | | | | | | | | | | | | | R <sub>2</sub> | | | | | | | | | | | | | | $R_{LK}$ | | | | | | | | | | | | | | time active intervals | allocation | |-------------------------------|------------------------------------------------------------------------------------------------| | 1+ [1+,12-] | $V_0 \rightarrow R_{LK}$ | | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_{LK}, V_1 \rightarrow R_1$ | | 3+ [3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_{LK}, V_1 \rightarrow R_1, V_2 \rightarrow R_2$ | | 4+ [4+,5-],[3+,10+],[2+,11-] | $V_0 \rightarrow S$ , $V_1 \rightarrow R_1$ , $V_2 \rightarrow R_2$ , $V_3 \rightarrow R_{LK}$ | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |-----------------------------------------------------------------------------------------------------------------------------------------|---|---|---|---|---|---|---|---|---|----|----|----| | V <sub>0</sub> | | | | | | | | | | | | | | V <sub>0</sub> V <sub>1</sub> V <sub>2</sub> V <sub>3</sub> V <sub>4</sub> V <sub>5</sub> R <sub>1</sub> R <sub>2</sub> R <sub>LK</sub> | | | | | | | | | | | | | | V <sub>2</sub> | | | | | | | | | | | | | | V <sub>3</sub> | | | | | | | | | | | | | | V4 | | | | | | | | | | | | | | <b>V</b> 5 | | | | | | | | | | | | | | $R_1$ | | | | | | | | | | | | | | $R_2$ | | | | | | | | | | | | | | $R_{LK}$ | | | | | | | | | | | | | | time active intervals | allocation | |-------------------------------|------------------------------------------------------------------------------------------------| | 1+ [1+,12-] | $V_0 \rightarrow R_{LK}$ | | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_{LK}, V_1 \rightarrow R_1$ | | 3+ [3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_{LK}, V_1 \rightarrow R_1, V_2 \rightarrow R_2$ | | 4+ [4+,5-],[3+,10+],[2+,11-] | $V_0 \rightarrow S$ , $V_1 \rightarrow R_1$ , $V_2 \rightarrow R_2$ , $V_3 \rightarrow R_{LK}$ | | 6+ [6+,8-],[3+,10+],[2+,11-] | $V_0 \rightarrow S$ , $V_1 \rightarrow R_1$ , $V_2 \rightarrow R_2$ , $V_4 \rightarrow R_{LK}$ | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |-----------------------------------------------------------------------------------------------------------------------------------------|---|---|---|---|---|---|---|---|---|----|----|----| | V <sub>0</sub> | | | | | | | | | | | | | | V <sub>0</sub> V <sub>1</sub> V <sub>2</sub> V <sub>3</sub> V <sub>4</sub> V <sub>5</sub> R <sub>1</sub> R <sub>2</sub> R <sub>LK</sub> | | | | | | | | | | | | | | V <sub>2</sub> | | | | | | | | | | | | | | V <sub>3</sub> | | | | | | | | | | | | | | V4 | | | | | | | | | | | | | | <b>V</b> 5 | | | | | | | | | | | | | | $R_1$ | | | | | | | | | | | | | | $R_2$ | | | | | | | | | | | | | | R <sub>LK</sub> | | | | | | | | | | | | | | time active intervals | allocation | |-------------------------------|------------------------------------------------------------------------------------------------| | 1+ [1+,12-] | $V_{0}\rightarrow R_{LK}$ | | 2+ [2+,11-],[1+,12-] | $V_0 \rightarrow R_{LK}, V_1 \rightarrow R_1$ | | 3+ [3+,10+],[2+,11-],[1+,12-] | $V_0 \rightarrow R_{LK}, V_1 \rightarrow R_1, V_2 \rightarrow R_2$ | | 4+ [4+,5-],[3+,10+],[2+,11-] | $V_0 \rightarrow S$ , $V_1 \rightarrow R_1$ , $V_2 \rightarrow R_2$ , $V_3 \rightarrow R_{LK}$ | | 6+ [6+,8-],[3+,10+],[2+,11-] | $V_0 \rightarrow S$ , $V_1 \rightarrow R_1$ , $V_2 \rightarrow R_2$ , $V_4 \rightarrow R_{LK}$ | | 9+ [9+,10-],[3+,10+],[2+,11-] | $V_0 \rightarrow S$ , $V_1 \rightarrow R_1$ , $V_2 \rightarrow R_2$ , $V_5 \rightarrow R_{LK}$ | | | | Result: vo is spilled during its whole life time! #### Linear scan improvements The basic linear scan algorithm is very simple but still produces reasonably good code. It can be – and has been – improved in many ways: - the liveness information about virtual registers can be described using a sequence of disjoint intervals instead of a single one, - virtual registers can be spilled for only a part of their whole life time, - more sophisticated heuristics can be used to select the virtual register to spill, - etc.