21 TYPE CHECKING

Rajeswari Sridhar

In this module, we will discuss about the important function of the semantic phase of the compiler namely type checking. Type checking involves identifying and prompting if incompatible operands are being operated

21.1 Types of Check

The compiler needs to verify whether the source program follows the syntactic and semantic conventions. This is done with the help of static checking. Static checking helps in reporting programming errors during compile time. Dynamic checking is done during run time to identify and handle errors as they occur. The various static checking that are done by the compiler are listed as follows:

• Type Checking: This check determines and handles if an operator is applied to an

incompatible operand. Example: The compiler would prompt an error if array variable is added with function variable. Consider the following declarations and the statements that follow it.

1. int op(int), op(float);

2. int f(float);

3. int a, c[10], d;

4.	d = c+d;	// FAIL
5.	*d = a;	// FAIL
6.	a = op(d);	// OK: overloading (C++)
7.	a = f(d);	// OK: coercion
8.	vector<int> v;//OK: template instantiation

Line numbers 1 to 3 declares some variables and functions. Line number 4 adds an array variable (address) with an integer variable. This is prompted as an error. Line number 5 tries to dereference an integer variable and is prompted as an error. Line number 6 is accepted as it is similar to operator overloading. Line number 7 passes as integer to function ‘f’ which takes a float but implicit conversion can be considered and hence is an accepted statement. Similarly, line 8 is similar to a template instantiation and hence is an accepted type check.

• Flow of control check: This would verify whether the statements that results in a branch

are terminated correctly. Example: Break statements.

1. myfunc()

{ …

2. while (n)

3. { …

4. if (i>10)

5. break; // OK

}

6. myfunc1()

{ …

7. break; // ERROR

}

Consider the functions myfunc() and myfunc1(). myfunc() has a while loop and in the body of this while loop there is a if() conditional statement. The body of the if() is a break statement where nothing is done if the condition is met. This is an accepted control flow as only in the event that the if() statement is true, we come out of the body of the if() without doing anything. On the other hand, consider myfunc1() where there is no loop construct and we see a break. This is considered as error because if the program breaks the location to jump to is not defined.

• Uniqueness check: This ensures that an object must be defined exactly once in the situation that is demanded by some programming language. Example: labels in case statements need to be unique in pascal, identifiers need to be unique in programming languages.

myfunc2()

{ int i, j, i; // ERROR

…

}

The above function myfunc2() has two variables with the same name ‘i’ and hence is considered error.

• Name-related checks: This checks whether the same name appears more than once in programming languages that does not support. Example: In Ada, a name should not appear more than once and compiler needs to verify this. The following example has two arguments with the same name and hence is prompted as error.

cnufym(int a, int a) // ERROR

{ …

}

21. 2 Position of type checker

After understanding the types of static checking that are necessary, the compiler does type checking as against any other type of static checking. Typically type checking is being done after successfully parsing the input sentence. The position of the type checker is given in figure 21.1 where the syntax tree is used to verify the type checking information and is given later to the intermediate code generator for generating intermediate representation.

Thus as can be seen from figure 21.1 the type checker is part of the semantic phase of the compiler. Type checking information is added with the semantic rules. Basic type checking is performed which is further extended to type checking of complex attributes.

21.3 Type Checking

In general a language’s type system specifies the validity of the operation depending on the type of the operand. Type checking is done to ensure that operations with correct types are operated upon. Type systems define semantic rules to perform type checking. Type expressions can be defined as follows:

1. A basic type is a type expression. Integer, float, type_error all are type expression. Statements will have void as their type which is also a basic type expression.

2. A type constructor applied to a type expressions is a derived type expression

a. Arrays: If T is a type expression then array(A, T) is a type expression denoting the type of the array with elements of type T and index A

b. Products: If T1 and T2 are type expressions then their Cartesian product T1 X T2 is a type expression

c. Pointers: If T is a type expression then pointer(T) is a type expression indicating the type pointer to an object of type T

d. Functions: Functions gets value from some domain and maps it to a range. This mapping is a type

3. Type expressions contain variables which are also derived type expressions

After knowing what to check as type in each of the programming construct, now let us discuss the semantic rules that are necessary for type checking of the different programming constructs of Pascal language.

21.3.1 Type checking of Expressions

Expressions are to be compatible to be operated on. Every expression is governed by a data type and this ‘type’ is an attribute associated with it. For type checking of expressions, we will be having semantic rule that verifies this “type” attribute. Table 21.1 summarizes the semantic rule for type checking of expressions.

Table 21.1 Semantic rules for type checking expressions

21.3.2 Type checking of Statements

Statements can be simple assignment statements, conditional statements, sequence of statements or loops. The attribute of the statements is also type and the value of this is void if the statements are correct. The set of semantic rules to perform type checking of statements is given in Table 21.2.

21.3.3 Type checking of functions

Functions will do some processing and computations based on its parameters. The type checking of functions verifies whether the function uses arguments that have been passed and checks if there is a mapping between the arguments and the computations of the function. Table 21.3 summarizes the check.

Intermediate code can be represented in the following ways:

· Graphical representations: The input is represented as an abstract syntax tree (AST). We have already discussed the construction of the AST which is a binary tree in the previous modules. This AST is the output of the semantic phase of the compiler which can serve as an intermediate representation for generating target code. A directed acyclic graph can also be used to represent the AST. Consider the exa mple AST and its corresponding DAG representation in figure 21.3 for the input a:=b*-c + b*-c

The construction of the AST is already discussed in the previous module using the functions mknode() and mkleaf(). The construction of the DAG is the same as AST except for the fact that the common terms are represented only once in the graphical representation. The procedure to construct DAG for an expression is to initially, search the array for a node M with label op, left child l and right child r. If there is such a node, return the value number M. If not create in the array a new node N with label op, left child l, and right child r and return its value