34 GNU Compiler Collection

Dr Narayan Joshi

Introduction

To get a desired task done by a computer, we must feed it appropriate instructions. Such a set of instructions is known as a computer program, and the act of writing one is called computer programming. However, because the computer is an electronic machine, the program must be given in a machine-understandable format. When a program is submitted for execution, the processing unit inside the computer reads and executes instructions written in machine-level language.

It follows that a computer programmer (a person who writes computer programs) would need to write programs in machine-understandable language, and also be able to read such instructions, in order to get the desired work done by the computer. To help the programmer community expedite the programming process, high-level languages gradually evolved. Programs written in high-level languages are readable by humans.

However, computers cannot directly understand programs written in high-level languages. Hence, some mechanism is required to transform such programs into machine-understandable code, also known as object code. Programming language experts have built software called compilers, which translate high-level language programs (source code) into machine-understandable object code. A compiler is a set of computer programs that generates object code from source code written in a particular programming language. The compiler also performs several other important activities, which are described in the upcoming sections.

For example, suppose a programmer has written a program in the C language to implement a simple mathematical calculator. The C program file written by the programmer is called the source file. This source file cannot be submitted directly to the CPU for execution, because the CPU does not understand the C language. Hence, the next step is to generate machine-understandable object code from the C source file, using a suitable compiler that knows how to translate C source code into object code. Once the binary calculator program has been generated successfully, it can be submitted to the CPU for execution. The same applies to other compiled programming languages such as C++.

This module gives an overview of the general compilation process and explains the compilation procedure for C and C++ programs. It also explains some of the commonly used compilation options of the gcc compiler, and how to generate some of the intermediate files.

Overview of compilation process

Program compilation involves multiple stages; for a C/C++ program there are four essential phases. Awareness of the compilation process often helps the programmer in debugging programs and in writing efficient programs.

C/C++ program compilation involves the following stages (Figure 1):

Preprocessing
Compilation
Assembly
Linking

Modern compilers coordinate the implicit execution of all four stages mentioned above during program compilation.

Let us understand the purpose of each of the above-mentioned compilation stages using the following C program.

Filename: cprogram.c

#include <stdio.h>

int main()
{
    printf("Dennis Ritchie Sir.\n");
    return 0;
}

Preprocessing

Before the program is actually translated to object code, the compilation process invokes the preprocessing phase, also called the first pass of program compilation. The preprocessor handles macros, #include files and conditional compilation, expanding these brief abbreviations into actual C/C++ constructs.

For example, all lines starting with the # symbol in a program are processed by the preprocessor software:

(a) #include <stdio.h>    // include header files
    #include <math.h>
    #include <iostream>

(b) #define SIMPLE_INTEREST_RATE 3.14    // macros defined
    #define ARRAY_SIZE 10
    #define PASS_CRITERIA 50

(c) #ifndef PASS_CRITERIA    // conditional compilation
    printf("Minimum percentage required to pass is not defined.");
    #endif

The preprocessor treats the above-mentioned lines as preprocessor commands. It replaces the macros (#define) with their values, the include directives (#include) with the contents of the named files, and the conditional compilation blocks (#ifndef, #ifdef, #endif) with the source code selected by their conditions. The resulting file is then passed to the next compilation phase. A few lines of the output file generated by the preprocessor are shown in Figure 2.

Compilation

This is the second phase of the overall compilation process, also known as the second pass. It generates assembler source code, provided the source file does not contain syntax errors. In this phase, the preprocessed code from the previous phase is transformed into human-readable assembly-language instructions specific to the target platform. The output assembly code for our sample program is shown in Figure 3.

Assembly

In the assembly stage, the assembly instructions generated during the compilation stage are translated into object code, i.e. machine code. This conversion from assembly language to machine language is done by the assembler software, named 'as'. The assembler is in fact a separate program, but it is called automatically during the program compilation process. The resulting file contains the actual instructions to be executed by the machine.

Linking

The final stage of program compilation is linking. In this phase, a special program known as the linker links the various object files (generated in the assembly stage) to produce the final executable file.

In our example program cprogram.c, the function printf() is used for printing a message on screen. However, printf() is not a user-defined function; it is available in pre-compiled form in the C standard library, pictured here for simplicity as an object file named printf.o. Therefore, the contents of the two object files cprogram.o and printf.o must be merged to generate the final executable file cprogram.x. This task is done by the linker software, named 'ld'. Like the assembler, the linker ld is a separate program, but it is called automatically during the program compilation process.

GNU Compiler Collection (GCC)

The abbreviation GCC initially stood for "GNU C Compiler". It was developed by Richard Stallman, founder of the GNU Project, and first released in 1987 for compiling C programs. It now officially stands for "GNU Compiler Collection". The collection offers compilers for programs written in various programming languages such as C (gcc), C++ (g++), Objective-C (gobjc), Objective-C++ (gobjc++), Java (gcj), Fortran (gfortran), Ada (gnat), Go (gccgo) and BRIG.

At present, GCC supports several instruction set architectures (ISAs). Moreover, GCC is the official compiler of the UNIX-like GNU operating system, and it has been adopted as the standard compiler by many UNIX and Linux family operating systems. GCC is freely available under the GNU General Public License (GPL).

This module explains how to compile programs written in C & C++ languages using gcc & g++ compilers respectively. In fact, the gcc & g++ compilers are very rich in features. Some of the frequently used features and their usage are explained in this module.

Installing gcc & g++

Use the command 'gcc --version' to check whether the gcc compiler is available on your Linux system. If it is available, you will receive output like that shown in Figure 4.

If the gcc compiler is not available on your system, you will receive an error message: "bash: gcc: command not found...". In the same way, the command 'g++ --version' may be used to check the availability of the g++ compiler.

Installing gcc & g++ on Fedora/CentOS systems

Use the following command on a Fedora/CentOS Linux system to install the gcc and g++ compilers:

$ sudo yum install gcc gcc-c++

Installing gcc & g++ on Ubuntu systems

Use the following command on an Ubuntu Linux system to install the gcc and g++ compilers:

$ sudo apt-get install gcc g++

Getting help about gcc

Help about gcc can be obtained using the 'gcc --help' command. Use the 'man gcc' command to access the man pages for the gcc compiler.

Compiling C program using gcc

This section explains how to compile the C program cprogram.c shown in Figure 5.

The gcc command invokes the GCC C compiler for compiling C programs. Provide the source file name as a command-line argument to gcc.

$ gcc cprogram.c

Notice, however, that the program shown in Figure 5 contains an error on line 5: the statement on that line should end with the ';' symbol. Hence, running the gcc command on the erroneous cprogram.c reports an error message together with the line number, as shown in Figure 6.

Once the error is corrected, compilation succeeds and gcc generates the executable file. The execution output is shown in Figure 7.

By default, gcc names the resulting binary executable 'a.out'. However, the programmer may choose a different file name: use the -o option and supply the desired name for the resulting binary executable. In the example shown in Figure 8, gcc compiles the source file cprogram.c to machine-executable code and stores it in the executable file cprogram.exe.

Compiling C++ program using g++

Use the g++ command to compile C++ programs. Consider the following program, cppprogram.cc:

File: cppprogram.cc

#include <iostream>

using namespace std;

int main()
{
    cout << "This is C++ program." << endl;
    return 0;
}

Compilation of the above C++ program using the g++ command, and its execution, are shown in Figure 9.

Printing warning messages

While compiling a source file, gcc sometimes produces warning messages for the programmer. The aim of these compile-time messages is to make the programmer aware of possible logical mistakes and common errors in programs that are otherwise technically correct. The -Wall option enables warnings for several common errors. Let us understand the significance of the -Wall option using the following example.

File: warning.c

#include <stdio.h>

int main()
{
    float simple_interest_rate;

    printf("Bank Simple interest rate is: %f\n", simple_interest_rate);
    return 0;
}

When compiled with the -Wall option, gcc not only generates the executable file but also displays a warning message on screen. The warning indicates that the variable simple_interest_rate is used without initialization (Figure 10). Nevertheless, the resulting executable file with_warning.exe is generated successfully; its execution output is also shown in Figure 10.

Figure 10 shows that the binary executable is generated even though the compiler has raised a compile-time warning. If desired, the programmer may instruct gcc not to generate the executable file when compile-time warnings occur. This is done by using the -Werror option, which treats warnings as errors, together with the -Wall option (Figure 11).

Create preprocessor output only

Preprocessing is the first stage of the overall compilation process. It is possible to produce only the preprocessor output for a given C source file by using the -E option with gcc. By default the preprocessor output is written to the screen, from where it can be redirected to an output file using the redirection operator >.

$ gcc -E cprogram.c > cprogram.i

The resulting preprocessor output file cprogram.i is human-readable. Use any text editor or the cat command to view its contents. The contents of the cprogram.i file are shown in Figure 2.

Create assembly code only

After successful completion of the preprocessing stage, the source code is translated to machine-specific assembly language. The gcc compiler provides the -S option to produce only the assembly code for a given C source file.

$ gcc -S cprogram.c

The resulting human-readable file cprogram.s contains machine-specific assembly code. Use any text editor or the cat command to view its contents. The contents of the cprogram.s file are shown in Figure 3.

Create compiled code only (without linking)

After successful generation of the assembly code, gcc assembles it into object code. Use the -c option (lowercase c) to generate the compiled object code only, without linking the object file with other object files and libraries.

$ gcc -c cprogram.c

The -c option makes gcc create the machine-specific object code in the output file cprogram.o.
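Compiling without linking is particularly useful for source files that do not form a complete program on their own. As a sketch, the file below defines a single function and no main(); the file name square.c and the function square() are hypothetical, introduced only for illustration.

```c
/* square.c (hypothetical file name, for illustration only)
   Assumed build command: gcc -c square.c    (produces square.o)
   Without -c, the link step would fail here, because this file
   defines no main() function. */

/* Returns x multiplied by itself. */
int square(int x)
{
    return x * x;
}
```

The resulting object file can later be linked with another object file that does provide main() and calls square().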

Generate all intermediate files

It is also possible to keep all the intermediate files, that is, the preprocessed (.i), assembly (.s) and object (.o) files, by using the -save-temps option. The output is shown in Figure 12.

Treat char datatype as unsigned char datatype

Being an ANSI-compliant compiler, gcc supports the signed char and unsigned char datatypes. Whether the plain char type is treated as signed or unsigned depends on the target platform. If required, the programmer may instruct gcc to treat the plain char type as unsigned char by using the -funsigned-char option. In the following example, a negative value is assigned to the somechar variable of type char.

File: unsigned.c

#include <stdio.h>

int main()
{
    char somechar = -65;

    printf("somechar = %d\n", somechar);
    return 0;
}

Compiling the program unsigned.c shown above without the -funsigned-char option produces the output shown in Figure 13.

However, compiling it with the -funsigned-char option produces different behavior, as shown in Figure 14.

Conversely, the -fsigned-char option instructs gcc to treat the plain char type as signed char.

Compile-time macros

The gcc compiler supports compile-time macros. A macro can be supplied on the gcc command line using the -D<macro-name> option. The following example checks whether the macro MY_MACRO is defined.

File: macros.c

#include <stdio.h>

int main()
{
#ifdef MY_MACRO
    printf("MY_MACRO is defined.\n");
#else
    printf("MY_MACRO is not defined.\n");
#endif
    return 0;
}

The execution behavior of the program macros.c when compiled with and without the option -DMY_MACRO is shown in Figure 15 and Figure 16 respectively.

Supply path for include directory

The gcc compiler provides the -I<path-to-headers> option to supply an include directory path on the gcc command line:

$ gcc -Wall -I/home/joshi/cprograms/include cprogram.c

Link with library

Use the -l<library-name> option to link your program with a library. For example, the following command links the program file aprogram.c with the math library, whose library name is m.

$ gcc -Wall aprogram.c -lm -o aprogram.exe

Summary

Program compilation is required to transform human-readable source code into machine-executable code. Compiling a program involves several coordinated stages: preprocessing, compilation, assembly and linking. Several compilers exist today.

The GCC (GNU Compiler Collection) suite consists of compilers such as gcc, g++, gobjc, gobjc++, gcj, gfortran, gnat and gccgo for programs written in C, C++, Objective-C, Objective-C++, Java, Fortran, Ada and Go respectively. GCC is freely available under the GNU General Public License (GPL).

