How the C++ Compiler works

Mayowa Obisesan
3 min read · Dec 17, 2023

Compiling is the process of converting C++ source code into binary code that the computer can actually understand. C++ code is just text: letters and words that follow a predefined syntax and express the logic of a program. So, if C++ code, or any other program’s code, is just text, two questions come to mind.

  1. How does that text become an executable file that becomes games and applications that I use?
  2. Why is it that the text I call C++ code (my C++ source code) can only be run with a specific application and the text I call Python source code can only be run with a totally different application?

The answer to both questions is the tool that translates the text into something the machine can run. For C++, that tool is the compiler. Let me explain.

THE COMPILER

The compiler is responsible for transforming C++ source code into an executable binary, and this transformation takes two phases from start to finish.

  1. Compiling Phase
  2. Linking Phase

But the focus of this article is on compiling. More about linking at the end of this article.

COMPILING PHASE

The function of the C++ compiler is to take each source text file and transform it into an intermediate format called an object file. The object files are then passed to the linker, which combines them (there are often several) into the final executable.

The compiler does several things when it produces the object files.

  1. Preprocessing: This is the stage where the compiler goes through all of the preprocessor directives in our code and evaluates them. Examples of preprocessor directives in C++ code are #include, #define, #if, #ifdef, and #pragma. More about preprocessors in C++ at the end of this article.
  2. Tokenization: This is when the pre-processed C++ code is split into tokens, the smallest meaningful pieces of the language, such as keywords, identifiers, literals, and operators.
  3. Parsing: This is the walk-through of the tokens produced in the tokenization stage. The tokens are first arranged into a Concrete Syntax Tree and then, after further analysis for meaningful information about the C++ code, into an Abstract Syntax Tree. The Abstract Syntax Tree is what the compiler uses to generate the object file, which consists of constant data and/or instructions in the machine code that the computer processor understands.

Object Files

Object files are the output of compiling our C++ source code. The compiler produces one object file for each translation unit (a source file after preprocessing). An object file contains the machine code that the processor understands, but it is not yet a runnable program on its own.

LINKING PHASE

The C++ linker makes use of the object files generated from translation units during compilation. These object files are linked to generate a single executable binary. The reason for linking is to connect resources that are used across multiple translation units. An example is a function that is defined in one translation unit and called from others: each call refers to the function by a symbol name derived from its signature, and the linker matches that reference to the definition.

For a deeper understanding of linking, I’ll link to an article about C++ linking at the end of this article.

CONCLUSION

C++ compilers turn your C++ source code, which is just text, into binary code that the computer understands. The linker then does the job of combining the object files in your C++ project into one executable binary that can be run as a single file. Together, these two phases take you from source code to executable binary.

OTHER ARTICLES

Here are other articles that you can refer to for a better understanding of some of the concepts I used in this article.

I believe you have learnt something from this article.

Thanks for reading. 🙂

Mayowa Obisesan

Entrepreneur | Computer Scientist | Quantum Theory Enthusiast