CODE: C CPP Before Main | Amr Tarek

We established that every C and C++ program starts execution from a function called main(). From a language-level perspective, this statement is correct: main() is the function you write, and it defines where your program’s logic begins.

However, once we move below the language and look at how programs actually run on a real system, an important realization appears:

`main()` is not the first code that executes.

To understand why, we must step outside the language and look at the runtime and operating system responsibilities that surround your program.

Why `main()` Cannot Be the First Instruction

When you execute a program, the operating system does far more than simply “call a function.” Before your code runs, the system must:

Load the executable into memory
Create a process
Set up a stack and heap
Prepare command-line arguments and environment variables
Initialize language runtime support

All of this happens before your source code is ever reached.

Because of this, execution must begin in a piece of code that knows how to perform these tasks. That code is not written by you—it is provided by the runtime.

The Real Entry Point: Runtime Startup Code

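At the binary level, execution begins at the entry point recorded in the executable. On Linux and other ELF-based systems this is a symbol commonly named _start; other toolchains use different names, such as mainCRTStartup with Microsoft's C runtime.

This symbol does not come from your source. It lives in startup object files that the linker adds automatically as part of the C runtime (CRT). Its job is not to implement program logic, but to prepare the environment so that main() can safely run.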

Conceptually, the startup sequence looks like this:

1. Operating system jumps to _start
2. Runtime initializes the execution environment
3. Command-line arguments are prepared
4. Global and static initialization is performed
5. main(argc, argv) is called

Only at step five does your code finally begin executing.

This explains why main() has a well-defined signature and why you do not control how it is called.

What Happens After `main()` Returns

When main() finishes execution and returns a value, the program does not immediately terminate.

Instead, control flows back into the runtime, which performs a controlled shutdown sequence:

The return value of main() is captured and passed to exit()
Registered atexit() handlers are executed, in reverse order of registration
Static and global destructors run (in C++)
Open output streams such as stdout are flushed
The operating system is notified of program termination

This process explains several important behaviors:

Why destructors matter
Why buffered output may not appear if a program crashes
Why abrupt termination can skip cleanup and leave unflushed data, temporary files, or locks behind

From this point onward, it becomes clear that writing correct programs is not just about logic—it is about respecting the execution lifecycle.

How Code Becomes a Program

Now that we understand how a program starts and ends, a deeper question naturally follows:

How does source code become something the operating system can execute?

This transformation happens through a structured compilation pipeline that applies equally to both C and C++.

The Compilation Pipeline

Every C and C++ program passes through the same four fundamental stages:

Preprocessing
Compilation
Assembly
Linking

Each stage exists because the next stage cannot function without it.

Preprocessing: Preparing the Source

The preprocessor handles directives such as:

#include
#define
Conditional compilation

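At this stage, the preprocessor performs pure text manipulation: it pastes in header contents, expands macros, and removes code excluded by conditionals.

No C or C++ syntax checking or type validation occurs here.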

This explains why missing headers or incorrect macros can cause errors that appear unrelated to your logic.

Compilation: Understanding the Language

During compilation, the compiler:

Parses syntax
Checks types
Builds control-flow graphs
Applies optimizations
Emits assembly code

At this stage, each source file is handled independently.

The compiler does not know whether referenced functions exist elsewhere—it only trusts declarations.

Assembly: Generating Machine Code

The assembler converts assembly instructions into machine code, producing object files.

These files contain:

CPU instructions
Symbol information
Relocation data

However, object files are still incomplete programs. They cannot run on their own.

Linking: Forming the Executable

The linker combines all object files and libraries into a single executable by:

Resolving symbol references
Connecting function calls across files
Linking the standard library
Producing a final binary

This is where errors such as “undefined reference” originate, and why correct declarations and definitions are critical.

Translation Units: Why Headers Exist

Each .c or .cpp file, together with everything it pulls in through #include, is compiled as a translation unit.

The compiler:

Sees only the current translation unit
Requires headers for declarations
Relies on the linker to resolve definitions

This design explains:

Why headers separate declarations from definitions
Why multiple definitions cause linker errors
Why static, extern, and inline exist

Once this model is understood, compilation and linking errors become predictable rather than mysterious.

Program Memory Layout: Where Everything Lives

When a program finally runs, its code and data are placed into memory according to a defined layout.

A typical process includes:

Code (text segment)
Read-only data
Global and static variables
Heap
Stack

Each region has a distinct purpose and lifetime.

The Stack: Automatic Storage

The stack stores:

Function parameters
Local variables
Return addresses

It is fast, limited in size, and automatically managed.

```c
void example(void) {
    int value = 10;  /* created when example() is called, gone when it returns */
}
```

value exists only for the duration of the function call.

The Heap: Dynamic Storage

The heap is used for dynamically allocated memory:

```c
int *p = malloc(sizeof *p);  /* C; needs <stdlib.h> and a matching free(p) */
```

Heap memory:

Has explicit lifetime
Must be manually managed in C
Is a common source of memory leaks and fragmentation

Global and Static Storage

Global and static variables exist for the entire lifetime of the program.

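In C++, this region also holds static and global objects. Their constructors typically run during startup, before main(), and their destructors run during shutdown, after main() returns.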

Understanding this region is essential for reasoning about initialization order and program shutdown behavior.

Why This Knowledge Changes How You Write Code

At this point, programming stops being about syntax and starts being about execution control.

Understanding startup, compilation, and memory layout allows you to:

Avoid undefined behavior
Manage lifetime correctly
Write predictable and performant code
Debug crashes and linker errors with confidence
Design systems that scale

This is the foundation required for embedded systems, operating systems, high-performance software, and real-time applications.