38 Code Improving Transformations

Rajeswari Sridhar

We will start this module by looking at use-define chains and define-use chains. We shall also discuss the algorithms for global common sub-expression, copy propagation and loop optimizations.

38.1 Define-Use and Use-Define (DU/UD) Chains

After understanding the various algorithms to identify reaching definitions, available expressions and live- variable analysis in the previous module, we will discuss yet another procedure that is helpful for optimization in this module using define-use and use-define chains. The define-use (du) and use-define (ud) chains are convenient way to access/use reaching definition information. Def-Use chains (DU chains) gives information of the possible consumers of the definition produced for a given a def. Use-Def chains (UD chains), accepts a use, and identifies all the possible producers of the definition consumed. Consider the example given in figure 38.1 having a control flow as depicted with variables r1 to r8.

Based on this UD and DU chains, we can identify Next-use information, available expressions and reaching definitions information for carrying out function preserving transformations.

38.2 Global Common Sub-expression elimination

The Available expression algorithm discussed in the previous module computes and informs whether an expression is common at point ‘p’. After identifying this, we need to get rid of the common sub-expression, both locally and globally. Algorithm 38.1 briefs on the procedure for global common sub-expression elimination.

Algorithm 38.1

• Input: A flow graph with available expression information

• Output: A revised flow graph

Eliminate_Common_sub_expression()

{

For every statement‘s’ of the form x:=y+z if y+z is available at the beginning of ‘s’ do the following

• Search backward from ‘s’ the expression ‘y+z’

• Create a new variable ‘u’

• Replace each statement w:=y+z found in the previous step by

• u := y+z

• w:= u

• Replace statement ‘s’ by x := u

}

Observations of the algorithm: Searching for ‘y+z’ can be done as a data-flow analysis problem. The replacement need not be optimized as it could result in copy statements. This algorithm does not handle the transitive nature of common sub-expressions. Consider the following two statements:

–

a := x+y vs c := x+y

–

b := a*z d := c*z

In the above set of statements, the value of ‘b’ and‘d’ are also essentially same and the above algorithm does not handle this situation. Consider the example of figure 38.2, to apply the common sub-expression elimination algorithm.

38.3 Copy Propagation

As we discussed in the previous module, copies gets generated due to elimination of common sub-expression. Consider the following statement ‘S’ that assigns the variable,

S: x:= y

After this assignment, the value of ‘x’ and ‘y’ will be exactly identical. So, we could get rid of variable ‘x’ and use ‘y’ in place of ‘x’. To carry out this, we initially determine where the value of ‘x’ is used and substitute ‘y’ in place of ‘x’ where ‘u’ is a statement that uses ‘y’. We need to ensure that statement ‘s’ must be the only definition of ‘x’ reaching ‘u’ and this could be checked using UD chains. We also need to ensure, that on every path from‘s’ to ‘u’, there are no assignments to ‘y’ which could be looked upon as a new data-flow analysis problem.

The procedure that is discussed to eliminate copy p ropagation can be looked upon as forward copy propagation procedure which looks at the RHS of an assignment statement. However, this does not consider the chain of dependencies and hence create dead code. As an example for copy propagation, figure 38.4 is used.

As given in given in figure 38.4, r1 and r2 have the same values. So, we replace r2 in place of r1. After performing copy propagation, the statement r1 := r2 remains as a dead code. The following are some of the data flow equations that could be defined and used to identify copy propagation. Here, c_gen[] and c_kill[] indicates the copies generated and killed in a basic block. Thus

out[B] = c_gen[B] U (in[B] – c_kill[B]) – Similar to reaching definitions
in[B] = ∩ out[P] where P is a predecessor block- Intersection of all the predecessor’s output is computed as the in[] of all but the first block.
in[B1] = Φ where B1 is the initial block

The algorithm is similar to available expressions algorithm and the computation are also same. Consider the example given in figure 38.5 to identify copy propagation.

• Determine whether for every use of ‘x’, ‘s’ is in c_in[B] where B is the block for this use and no definitions of ‘x’ or ‘y’ occur prior to this use

• If the statement meets this condition, remove ‘s’ and replace all uses of ‘x’ by ‘y’

}

Thus eliminating copy propagation would result in dead code.

38.4 Constant Propagation

Another important optimization to be considered is constant propagation. Similar to copy propagation, if a variable is assigned a constant and if the variable is used in subsequent expression, it can be replaced by the constant thus making the assignment of the constant to the variable as dead code. Consider the forward propagation of assignment of the form

d: rx := L where L is literal and rx is a variable.

Constant propagation involves replacement of “rx” with “L” wherever possible, provided the definition’s’ is available at point of replacement.

38.5 Unreachable Code Elimination

Copy propagation and constant folding could result in dead code. In addition, as discussed in peep hole optimization, some sections of code may be unreachable. This unreachable code need to be eliminated. To eliminate this, algorithm 38.3 is used.

Algorithm 38.3 – Unreachable code elimination.

Input: Flow graph

Output: Revised flow graph

Eliminate_Notreachable_code

{

Mark initial BB visited

to_visit = initial BB

while (to_visit not empty)

current = to_visit.pop()

for each successor block of current

Mark successor as visited;

to_visit += successor

endfor

endwhile

Eliminate all unvisited blocks

}

The algorithm carries out a depth- first search and tries to identify blocks that are useful. After identifying useful blocks, the other blocks available in the flow graph are identified as not useful

Consider figure 38.6, which involves 5 basic blocks. As can be seen, a depth- first search from the entry point would include bb1, then bb4 followed by bb2 and then bb5. The block bb3 is never to be reached and hence could be eliminated.

38.6 Loop invariant computation

The three major loop optimization techniques are strength reduction, code motion and induction variable elimination. Code motion involves eliminating loop invariant computation. UD-chains could be used to find out values that do not change as long as control stays within the loop. Loop has at least one way to get back to the header from any block in the loop. If x:= y+z is at a position in the loop and all possible definitions of ‘y’ and ‘z’ are outside the loop then y+z is loop invariant. Algorithm 38.4 discusses the loop invariant computation.

Algorithm 38.4 – Loop invariant computation

• Input: A loop L consisting of a set of basic blocks having three address statements. The set of three-address statements that compute the same value each time executed, from the time control enters the loop L until control leaves L

• Output: Optimized loops.

Compute_Loop_invariance()

{

• Mark the statements whose operands are all either constants or have all reaching definitions outside L as ‘invariant’

• Repeat the following step until at some repetition no new statements are marked ‘invariant’

• Mark ‘invariant’ all those statements not previously so marked all of whose operands are either constant or having definitions reaching outside ‘L’ or have only one reaching definition which is marked invariant.

}

38.7 Performing Code motion

After identifying the loop invariant statements, we have to eliminate repeated computation of the statements whose results are not going to change. The statements which are identified as loop invariant are moved to the pre-header of the loop. In order to apply code motion, certain conditions need to be checked and applied. Consider a statement, ‘s’ x: = y+z. The following are the conditions that need to be checked and applied.

• Block containing a statement ‘s’ must dominate all exit nodes of the loop, which is essentially a successor node that is not in the loop

• No other statement in the loop assigns to ‘x’

• No use of ‘x’ in the loop is reached by any definition of ‘x’ other than ‘s’.

Algorithm 38.5 implements code motion.

Algorithm 38.5 – Code motion.

• Input: a loop L with ud-chain information and dominator information

• Output: a revised loop with a pre-header and some statements moved to the pre-header (if any)

Identify_Eliminate_Codemotion

{

• Identify the loop invariant statements defining ‘x’

• For each statement ‘s’ defining ‘x’ find

• That it is in a block that dominates all exits of L

• That ‘x’ is not defined elsewhere in L

• All uses in L of x can only be reached using statement ‘s’

• Move, in the order found by loop invariant algorithm, each statement ‘s’ to a newly created pre-header

}

Code motion sometime results in some illegal and incorrect situations. Consider the figure 38.7 where the available flow graph is optimized.

In figure 38.7, there is a loop from B2 – B3 – B4 – B2. Thus there is an assignment to the variable ‘i’ to the value ‘2’. As per the algorithm, this could be identified as loop invariant computation and could be moved to the pre-header and contributing to block B6. However, this is incorrect as the value of ‘i’ gets assigned to ‘2’ if this path through B3 is followed. If an alternate path that does not go through B3 is followed the value of ‘i’ is ‘1’ and hence we have performed an incorrect operation. So, for this flow graph there is no loop invariant computation.

Summary: In this module, we looked at the algorithms for eliminating common sub-expression and copy propagation. We also looked at algorithms to identify unreachable code and elimination of the same. Code motion and loop invariant computation algorithms are also discussed with an example. In the next module, we will discuss additional loop optimization techniques.