In this blog post we will explain some fundamental concepts of compilation in C and walk through the process with an example.
What is compilation?
Compilation is the process of converting source code written in a high-level programming language into machine language that the computer can execute. The software that performs this conversion is called a compiler.
What is a compiler?
The purpose of a compiler is to convert a text file with source code into a binary file (for example, an executable). Once the executable is created, it is used like any other program, and running it does not require the source code.
A compiled language (such as C or C++) differs from an interpreted language (for example, a shell script) or a pseudo-interpreted one (for example, Python).
In C, compilation will transform the C code of a program into native code, that is, a series of binary instructions that can be directly understood by the processor.
What is GCC?
GCC is the GNU project's compiler for C, C++, Objective-C, and Fortran; it takes a source program in any of these languages and generates a binary executable in the machine language of the system where it will run.
The acronym GCC stands for “GNU Compiler Collection”.
For more information, see man gcc.
The phases of compilation in C
The compilation process involves four successive stages: preprocessing, compilation, assembly, and linking.
First we write the source code in C. We create the file lex.c with emacs: emacs lex.c
Then put the following code inside it:
#include <stdio.h>
/* main - Print "Hello Lex"; Return: 0 successful exit */
int main(void)
{
	printf("Hello Lex\n");
	return (0);
}
Compilation starts with the preprocessing stage. Here the preprocessor removes comments and interprets preprocessor directives, which are statements that begin with “#” (e.g. #include). Among other things, directives such as #include and #define help to reduce repetition in the source code.
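To make the effect of the preprocessor concrete, here is a minimal sketch. The file name macro.c, the directory lex_demo, the macro GREETING and the function show_greeting are all made up for this illustration; they are not part of the lex.c example. The file is only preprocessed, never compiled, so it leaves out #include to keep the output short.

```shell
# Work in a scratch directory (hypothetical name) so nothing else is overwritten.
mkdir -p lex_demo && cd lex_demo

# A tiny input with one comment and one #define directive.
cat > macro.c <<'EOF'
/* This comment is removed by the preprocessor. */
#define GREETING "Hello Lex\n"

int show_greeting(void)
{
	return printf(GREETING);
}
EOF

# -E stops after preprocessing; -P leaves out the line markers gcc normally adds.
gcc -E -P macro.c
```

In the printed result the comment should be gone and GREETING should be replaced by the string it stands for.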
To print the result of the preprocessing stage, type: gcc -E lex.c
The output is long because it includes the contents of stdio.h. To see only the last lines, pipe it through the tail command, for example gcc -E lex.c | tail
The compilation stage translates the preprocessed C code into the assembly language of our machine’s processor. It produces a file of assembly instructions with the extension “.s”.
Type the following command: gcc -S lex.c. To confirm that the file lex.s was created, run ls lex* in your console.
To check the content of lex.s, type the following command: less lex.s
The assembly stage translates the assembly language program into object code, a binary file of machine instructions. An object file is not yet a runnable program, because references to library functions such as printf are only resolved in the linking stage.
The object file generated by the assembler has the same base name as the source file. On UNIX, its extension is “.o”.
Type the following command: gcc -c lex.s. To confirm that the file lex.o was created, run ls lex* in your console.
To check the content of lex.o, type the following command: less lex.o. Because lex.o is a binary file, most of what less displays will not be readable text.
Almost all programs written in C use library functions. These functions are pre-compiled, and their object code is stored in library files with the extension ‘.a’ (on UNIX) or ‘.lib’ (on Windows). The main job of the linker is to combine the object code of these libraries with the object code of our program.
The result of linking the object files is an executable file.
Type the following command: gcc -o lex lex.o. This creates the executable file lex, which you can confirm with ls lex* as before.
To run lex, type ./lex and check the result.
ALL IN ONE STEP!
For a program with a single source file, the whole process above can be done in a single step with this command:
$ gcc -o lex lex.c