However, once we move below the language and look at how programs actually run on a real system, an important realization appears:
`main()` is not the first code that executes.
To understand why, we must step outside the language and look at the runtime and operating system responsibilities that surround your program.
Why `main()` Cannot Be the First Instruction
When you execute a program, the operating system does far more than simply “call a function.” Before your code runs, the system must:
- Load the executable into memory
- Create a process
- Set up a stack and heap
- Prepare command-line arguments and environment variables
- Initialize language runtime support
All of this happens before your source code is ever reached.
Because of this, execution must begin in a piece of code that knows how to perform these tasks. That code is not written by you—it is provided by the runtime.
The Real Entry Point: Runtime Startup Code
At the binary level, execution begins at the executable's entry point, a symbol commonly named _start on Linux and other Unix-like toolchains.
This symbol does not come from your source code. The linker pulls it in from the startup files of the C runtime (CRT). Its job is not to implement program logic, but to prepare the environment so that main() can safely run.
Conceptually, the startup sequence looks like this:
- Operating system jumps to _start
- Runtime initializes the execution environment
- Command-line arguments are prepared
- Global and static initialization is performed
- main(argc, argv) is called
Only at the fifth step does main() finally begin executing. In C++, constructors of global objects typically run earlier, during the fourth step.
This explains why main() has a well-defined signature and why you do not control how it is called.
What Happens After `main()` Returns
When main() finishes execution and returns a value, the program does not immediately terminate.
Instead, control flows back into the runtime, which performs a controlled shutdown sequence:
- The return value of main() is captured
- Registered atexit() handlers are executed
- Static and global destructors run (in C++)
- Standard output buffers are flushed
- The operating system is notified of program termination
This process explains several important behaviors:
- Why destructors matter
- Why buffered output may not appear if a program crashes
- Why abrupt termination can cause resource leaks
From this point onward, it becomes clear that writing correct programs is not just about logic—it is about respecting the execution lifecycle.
How Code Becomes a Program
Now that we understand how a program starts and ends, a deeper question naturally follows:
How does source code become something the operating system can execute?
This transformation happens through a structured compilation pipeline that applies equally to both C and C++.
The Compilation Pipeline
Every C and C++ program passes through the same four fundamental stages:
- Preprocessing
- Compilation
- Assembly
- Linking
Each stage consumes the output of the one before it, so none of them can be skipped.
Preprocessing: Preparing the Source
The preprocessor handles directives such as:
- #include
- #define
- Conditional compilation
At this stage, the preprocessor performs textual substitution only.
No syntax checking or type validation occurs here.
This explains why missing headers or incorrect macros can cause errors that appear unrelated to your logic.
Compilation: Understanding the Language
During compilation, the compiler:
- Parses syntax
- Checks types
- Builds control-flow graphs
- Applies optimizations
- Emits assembly code
At this stage, each source file is handled independently.
The compiler does not know whether referenced functions exist elsewhere—it only trusts declarations.
Assembly: Generating Machine Code
The assembler converts assembly instructions into machine code, producing object files.
These files contain:
- CPU instructions
- Symbol information
- Relocation data
However, object files are still incomplete programs. They cannot run on their own.
Linking: Forming the Executable
The linker combines all object files and libraries into a single executable by:
- Resolving symbol references
- Connecting function calls across files
- Linking the standard library
- Producing a final binary
This is where errors such as “undefined reference” originate, and why correct declarations and definitions are critical.
Translation Units: Why Headers Exist
Each .c or .cpp file is compiled as a translation unit.
The compiler:
- Sees only the current file
- Requires headers for declarations
- Relies on the linker to resolve definitions
This design explains:
- Why headers separate declarations from definitions
- Why multiple definitions cause linker errors
- Why static, extern, and inline exist
Once this model is understood, compilation and linking errors become predictable rather than mysterious.
Program Memory Layout: Where Everything Lives
When a program finally runs, its code and data are placed into memory according to a defined layout.
A typical process includes:
- Code (text segment)
- Read-only data
- Global and static variables
- Heap
- Stack
Each region has a distinct purpose and lifetime.
The Stack: Automatic Storage
The stack stores:
- Function parameters
- Local variables
- Return addresses
It is fast, limited in size, and automatically managed.
void example(void) {
    int value = 10;  /* automatic storage: created on entry, gone on return */
}
value exists only for the duration of the function call.
The Heap: Dynamic Storage
The heap is used for dynamically allocated memory:
int *p = malloc(sizeof(int));  /* declared in <stdlib.h>; release with free(p) */
Heap memory:
- Has explicit lifetime
- Must be manually managed in C
- Is a common source of memory leaks and fragmentation
Global and Static Storage
Global and static variables exist for the entire lifetime of the program.
In C++, this region also involves:
- Static objects
- Global constructors and destructors, which run before main() starts and after it returns
Understanding this region is essential for reasoning about initialization order and program shutdown behavior.
Why This Knowledge Changes How You Write Code
At this point, programming stops being about syntax and starts being about execution control.
Understanding startup, compilation, and memory layout allows you to:
- Avoid undefined behavior
- Manage lifetime correctly
- Write predictable and performant code
- Debug crashes and linker errors with confidence
- Design systems that scale
This is the foundation required for embedded systems, operating systems, high-performance software, and real-time applications.