22 Types of Three-address code, Representations and Declarations
Rajeswari Sridhar
In this module we would discuss the types of three-address code for the various programming constructs. We would also discuss how to store this three-address code information for later retrieval for code generation. As a first step, we would learn to write semantic rules for generating three-address code for declarations
22.1Types of Three-address code
In a three-address code there is at most one operator at the right side of an expression. The LHS of an expression will be a temporary variable or a permanent variable. Figure 22.1 are some examples of three-address code for expressions.
• t1 = b – c
• t2 = a * t1
• t3 = a + t2
• t4 = t1 * d
• t5 = t3 + t4
Figure 22.1 Example of expressions
Figure 22.1 has shown examples of three-address code involving expressions and we would like to know how the three-address code would look like for other programming constructs. In order to know this, the following are the various types of three-address code and all programming constructs need to use one or a combination of more than one to generate three-address code.
• x = y op z – This corresponds to expressions involving binary operators, where x, y, z are variables indicating addresses and ‘op’, ‘=’ are the two operators
• x = op y – This is similar to the previous one. It is also for expressions involving unary operators where x, y corresponds to variables and ‘op’ represents the unary operator along with the assignment operator.
• x = y – This is simply an assignment statement involving two variables x, y.
• goto L – This is an unconditional jump. L is the address of the location to jump.
• if x goto L and if (false x) goto L1 – This is a conditional jump. The variable ‘x’ is one address and the location of the conditional jump L is another address. If ‘x’ is true the sequence jumps to L and if false would jump to another address L1.
• if x relop y goto L – This is conditional jump involving two addresses, where ‘x’ is checked to jump to location L if true.
• Procedure calls using: – Procedure calls are done in sequence by evaluating the parameters and then using the values to call the necessary function which is at an address.
- param x
- call p,n
- y = call p,n
- x = y[i] and x[i] = y – Array assignments involve two addresses and an offset
- x = &y – Address of a variable ‘y’ is an address, and is assigned to another address location ‘x’.
- x = *y and *x =y – Assigning address of ‘y’ to a variable ‘x’ or assigning the value of ‘y’ to the dereferenced variable ‘x’
Consider an example code and let us discuss the possible three-address code.
- do i = i+1; while (a[i] < v);
The above statement is a do-while loop. The value of ‘i’ is incremented by one in the body of the loop. The value of an array variable is checked to continue or exit from the loop. The three-address code could be generated either using symbolic labels or using position line numbers and both are given in Table 22.1
As can be seen from table 22.1, all the instructions obey one of the types of three-address codes already mentioned. Now, the later sections of this module and the next module would discuss the work involved in generating this code from the input high level language. After generating the code, it has to be stored. The next section would discuss means of storing the generated 3-address code.
22.2Three-address code representation
Three-address code could be represented using quadruples, triples and indirect triples. All the representations use tables to store the information. The following subsection discusses these representations.
22.2.1 Quadruples
This has four fields in the table to represent op, arg1, arg2 and result where ‘op’ refers to the operator, ‘arg1’ and ‘arg2’ refers to the operands and the ‘result’ field stores the result of computing ‘arg1’ and ‘arg2’ on the operator ‘op’. Consider the following example :
a := b * – c + b * – c
The three-address code corresponding to the above example is given below:
t1 : = -c
t2 := b * t1
t3 : = – c
t4 := b * t3
t5 := t2 + t4
a := t5
The generated three-address is stored as a quadruple in Table 22.2
Table 22.2 Quadruple example
This uses a triple representation but the triples are remembered in another table for quick reference and reuse. We use a list of pointers to triples to remember this. For the example of Table 22.3, the list of pointers is given in Table 22.5.
Table 22.5 Indirect triple to maintain list of Triple pointers
Triple reference | |
(10) | (0) |
(11) | (1) |
(12) | (2) |
(13) | (3) |
(14) | (4) |
(15) | (5) |
22.3 SDT to Three-address code
After discussing how to store a three-address code representation, we would now discuss how to generate three-address code. Three-address code is constructed based on the grammar that is defined for each programming construct. The typical attributes based on which the three-address code is generated are code, place, and value. The code attribute corresponds to the actual code which would reflect in the final assembly code to perform some computation. The place attribute refers to the address to store a variable. The value refers to the result of the computation or the value of the variable or the constant.
Table 22.6 gives the semantic rules for the Expression grammar to generate three-address code.
Table 22.6 Example semantic rules to generate three-address code for expressions
which is the address of ‘c’ and‘d’. Using the given semantic rule, we need to create a new temporary variable for E2 and let us call that as t1. The code corresponding to the ‘*” node is generated using ‘gen’ as E2.addr ‘=’ E3.addr ‘*’ E4.addr and is given below
t1 = c * d
Now, t1 corresponds to the address of E2 and afte r generating code corresponding to E2, we need to generate code corresponding to E1. E1 is a single variable and hence address of E1 is ‘a’. Using the ‘gen’ we create a new address corresponding to E as t2 which is the address of the ‘+’ node. Thus the following code is generated for the ‘+’ node.
t2 = b + t1
After this we have the code corresponding to E in address t2. Thus we generate a new address for the LHS variable as t3 and we generate the code corresponding to it as
t3 = t2
Finally, for the root node, the code will be as
a : = t3
22.4 Declarations
Declarations can be in a procedure or part of the main program. Declarations of variable, constants are done in any procedure so that the variables could be used as part of the code. The main objective of generating three-address code for variables is to keep track of scope information, in addition to allocating a width to store the variable which is based on the data type. The scope information of a variable is maintained in the symbol table along with ot her information like data type, address, width, value, etc. Computing the address of variables is done by semantic rules related to three-address code. Type, width and offset are three attributes associated with generating three-address code for declarations. Table 22.7 gives the semantic rules corresponding to three-address code generation of declarations.
Table 22.7 Three-address code for declarations
The initial production is P à D For each procedure a new symbol table is created and a beginning of a procedure is indicated using the ‘proc’ keyword. The production for declarations looks like
D à D ; D | id = T | proc id ; D ; S
A new symbol table is created when proc id; D; S is encountered. The procedures in which all symbol tables are linked and kept track of are discussed in the next module along with their semantic rules.
Summary: In this module we discussed the various ways of representing Intermediate code representations like Quadruples, Triples, and Indirect Triples. Their procedure and advantages were discussed. We also discussed an example of semantic rules set to generate 3-address code for expressions and declarations in Pascal language. The subseq uent module would discuss the semantic rules for generating three- address code for other programming constructs.