24 Arrays and Boolean Expressions
Rajeswari Sridhar
In this module we will learn to write semantic rules for arrays to generate three address code. We will also take a first look at the semantic rules for generating three-address code for Boolean expressions.
24.1 Semantic rules for Arrays
Arrays operation is part of any programming language. Arrays are typically used whenever a sequence of contiguous memory locations is required statically. Multi-dimension information is typically stored in arrays and hence accessing such multi-dimension array in a faster way becomes essential. This necessitates defining semantic rules for the arrays to generate three address code which will in-turn help faster access of an element in a multi-dimensional array. The following is the grammar for declaring arrays in Pascal. Let us consider this grammar to write semantic rules that generates three-address code for the arrays.
• S à L : = E – Assignment of expressions to include arrays. L is the variable to declare arrays.
• E à E + E | (E) – This is a simple expression where the operator is addition or parenthesis that defines precedence of an expression
• E à L – The RHS of the first production could also be an array variable and this production helps define a multi-dimensional array
• L à Elist] – This production closes an array dimension
• L à id – The expression RHS / LHS can be a single variable as in this or could be an array variable.
• Elist à Elist, E – This production is used to create multi-dimension arrays
• Elist à id [ E – This is the beginning of the array variable declaration where ‘id’ is the name of the array.
The array variable L has two attributes: place and offset. The attribute “place” is a pointer to temporary variable or any address while “offset” is to move through the array index. We have discussed the row-major and column- major ways of accessing the arrays in the previous module. Based on that, an n-dimension array can be generalized using the recursive expression defined as follows:
• em = em-1 + im
• e1 = i1
em refers to the mth dimension computation of address given the (m-1)th dimension. This is computed by knowing the address of the first dimension. After understanding the address computation to access the elements of a multi-dimensional array, the addresses are computing by defining the following functions. These functions are used to define the semantic rules to generate three-address code that will help access the array element:
- Elist.ndim – This is used to record the number of dimensions in the Elist
- limit(array, j) – This function returns the number of elements nj in the jth dimension of the array
- Elist.place – This is used to temporarily hold a value from index expression
Example 24.1 Let us see how to generate three-address code to access an array element. Consider the example with a two-dimension array
A : array [1..2,1..3] of integer;
Thus array A is a 2 x 3 array having two rows and three columns. Assume that this array is accessed in row- major manner. The following is what we would expect as the output given the request to access an element A[i,j].
t1 := i * 3 – columns of a row would be incremented before going to the next row and this is stored in a temporary variable.
t1 := t1 + j – accesses the jth column after completing i rows
t2 := c- base address is the address of the first element of the array
t3 := t1 * 4 – size of integer is 4 bytes which is the offset necessary from the base address
t4 := t2[t3] – base address t2 is offset by t3
- … := t4 – t4 has the address to fetch the array element
The above three-address code is based on the following set of equations as already discussed in Module 23.
Address = baseA + ((i1 – low1) * n2 + i2 – low2) * w
- = ((i1 * n2) + i2) * w + c
where c = baseA – ((low1 * n2) + low2) * w
with low1 = 1; low2 = 1; n2 = 3; w = 4
The semantic rules for generating three-address code for arrays is given in Table 24.1. The array variable ‘L’ has two attributes: offset and place. ‘offset’ refers to the shift that is necessary from the base address to compute the address of an element and ‘place’ refers to the temporary variable which will be used to offset.
The content is fetched from t3 by offsetting from the base address which is available at t2, which results in the following
- t4 := t2 [t3] This value is the RHS of the input and this is finally assigned to the LHS variable ‘x’ as follows:
- x := t4 Thus, using the semantic rules, any ‘n’ dimension array could be accessed by computing the corresponding address.
In addition to the semantic rules for arrays, type conversions are also part of them. These type conversions are similar to the ones discussed in the previous modules. As an example, consider the production
- E à E1 + E2 The corresponding semantic rule to set the type of the LHS variable E is given below
{E.type = if E1.type = int & E2.type = int then int; else real}
E.place = newtemp ();
If E1.type = int and E2.type = int emit (E.place ‘:=‘ E1.place ‘int +’ E2.place)
The first statement assigns the type to E and the subsequent statements are used to generate three-address code which is exactly similar to the rules discussed in the previous modules.
24.2 Boolean expressions
Logical values need to be computed to control the flow of statements. An expression is evaluated using relational operators, “<, >, <=, >=, ==, !=”. A particular branch instruction might take multiple branches and this is controlled by computing a logical value using the Boolean operators, “and”, “or” and “not”.
Consider the following expression involving relational and Boolean operators:
a < b or c < d and e < f
This could be derived as follows:
E => E1 or E2 => E1 or E4 and E3 => a < b or c < d and e < f.
Thus the string is derived by giving priority to “and” before executing “or”. Thus E4 and E3 need to be computed before computing E1 or E2. However, this is not handled in the semantic rules discussed earlier. The expression is evaluated and the corresponding ‘goto’ will take care of branching according to the ‘true’ or the ‘false’ of the expression.
24.3 Short circuit code
Boolean expression could generate three-address code without generating full code. The expression need not be evaluated in full. For example, if there are two expressions E1 and E2, and if it is connected using the ‘or’ operator, then if E1 is true then E2 need not be evaluated and thus no code need to be generated. A more detailed explanation of this will be discussed in the next module.
Summary:
This module discussed the three-address code for arrays and the semantic rules that are used to generate the three-address code for multi-dimensional arrays. In this module, the semantic rules that are used for generating three-address code for Boolean expressions involving Boolean and Relational operators are also discussed. Subsequent modules will discuss a better way of generating three-address code for Boolean expressions