When a program becomes a binary, symbols disappear, types evaporate, and names die. What remains is not C++ - it is architecture.
From this moment onward, the real language of the program is defined by:
- The Instruction Set Architecture (ISA)
- The Application Binary Interface (ABI)
- Calling conventions
- The memory model
To understand binaries, crashes, exploits, or stripped executables, you must stop thinking like a C++ programmer and start thinking like the machine. Binary analysis is not about language syntax; it is about architectural contracts.
ISA: The Vocabulary of the Machine
The Instruction Set Architecture defines _what the CPU can say_. It is the machine’s vocabulary.
The ISA specifies:
- Which instructions exist (`mov`, `add`, `call`, `cmp`, …)
- Operand sizes (8, 16, 32, 64 bits)
- Addressing modes (register, immediate, base+offset)
- Register names and roles
Consider this instruction:
mov DWORD PTR [rdi+4], eax
There are no types here. No int. No struct name. No variable identifier. Yet this single line communicates a surprising amount of information.
Architecturally, we know:
- A memory write is occurring
- The write width is 4 bytes
- The destination is at base pointer + offset
- The source value comes from a 32-bit register
From behavior alone, a reverse engineer infers:
- This memory location likely represents an `int`
- It is likely a field inside a structure
- The field is placed at offset `+4`
- The structure is accessed through a pointer stored in `rdi`
None of this comes from C++. It comes from ISA semantics.
The CPU does not know what an int is. It only knows how many bytes you touched.
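To make the inference concrete, here is a minimal C++ sketch of source code that could plausibly sit behind such a store. The names `Record`, `id`, `value`, and `set_value` are hypothetical and do not come from any real program. On x86-64 System V an optimizing compiler typically lowers the assignment to a single 4-byte `mov` to `[rdi+4]`, although the exact output depends on the compiler and its flags. In this function the source register would be `esi` rather than `eax`, because the value arrives as the second argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical struct: two 4-byte fields, so `value` lands at offset 4.
struct Record {
    std::int32_t id;     // offset 0
    std::int32_t value;  // offset 4
};

static_assert(offsetof(Record, value) == 4, "value is expected at offset 4");
static_assert(sizeof(std::int32_t) == 4, "the store is 4 bytes wide");

// Under the System V calling convention `r` arrives in rdi and `v` in esi.
// The assignment is typically emitted as one 4-byte store,
// e.g. `mov DWORD PTR [rdi+4], esi`.
void set_value(Record* r, std::int32_t v) {
    assert(r != nullptr);
    r->value = v;
}
```

Reading the assembly in the other direction is exactly the reverse engineer's job: the 4-byte width suggests the field type, and the `+4` displacement suggests its position.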
ABI: The Grammar That Gives Meaning
If the ISA is the vocabulary, the ABI is the grammar. It defines _how instructions relate to each other across boundaries_.
The Application Binary Interface specifies:
- Stack layout
- Alignment rules
- How function arguments are passed
- Where return values live
- How structures are laid out in memory
Under the x86-64 System V ABI, for example:
- `int` → 4 bytes, aligned to 4 bytes
- `char` → 1 byte, alignment 1
- Struct alignment = maximum alignment of its fields
- The first six integer or pointer arguments go in `rdi`, `rsi`, `rdx`, `rcx`, `r8`, `r9`
- Integer and pointer return values go in `rax`
When a reverse engineer reconstructs a struct, they are not “guessing C++.”
They are reconstructing ABI expectations.
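These rules can be checked from inside the language. The sketch below assumes an x86-64 System V target; `Pair`, `is_aligned`, and `sum6` are hypothetical names chosen for illustration. The `static_assert` lines encode the size and alignment rules listed above, and the register comments describe where the ABI places the arguments, not something the code itself can observe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical struct used only to illustrate the layout rules.
struct Pair {
    char tag;    // 1 byte, alignment 1
    int  count;  // 4 bytes, alignment 4
};

// These hold on x86-64 System V; other targets may differ.
static_assert(sizeof(int) == 4 && alignof(int) == 4, "int: 4 bytes, 4-byte aligned");
static_assert(sizeof(char) == 1 && alignof(char) == 1, "char: 1 byte, 1-byte aligned");
static_assert(alignof(Pair) == alignof(int), "struct alignment = max field alignment");

// Returns true when `p` sits on a multiple of `alignment` bytes.
bool is_aligned(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Six integer arguments: the ABI assigns them to rdi, rsi, rdx, rcx, r8, r9
// in that order, and the result is returned in rax.
long sum6(long a, long b, long c, long d, long e, long f) {
    return a + b + c + d + e + f;
}
```

A seventh integer argument would no longer fit in registers and would be passed on the stack, which is why argument counts can often be estimated from a stripped function's prologue.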
Architecture Replaces Symbol Names
Consider this stripped assembly sequence:
mov eax, [rdi+4]
mov byte ptr [rdi], al
There are no symbols. No type definitions. No debug info.
Yet the architecture still tells us a great deal.
We observe:
- `[rdi]` is accessed as 1 byte
- `[rdi+4]` is accessed as 4 bytes
- Both use the same base register (`rdi`)
- The offsets are consistent
- There is a gap between offset `0` and `4`
From this, we infer a likely memory layout:
struct S {
char a; // offset 0
char pad[3]; // implicit padding
int b; // offset 4
};
In the original source the padding was never written. The `pad` member only makes visible what the ABI's alignment rules forced into existence.
This is critical:
**Padding is not a compiler accident. It is an architectural requirement.**
The architecture becomes the documentation.
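The inferred layout can be verified with `offsetof` and `sizeof`. The following is a minimal sketch assuming an x86-64 System V target. Here `S` is declared without the explicit `pad` member, so the compiler has to insert the padding itself. The helper `copy_low_byte` is a hypothetical name; it replays the two-instruction sequence on raw bytes, without using any field names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct S {
    char a;  // offset 0
    int  b;  // offset 4, after 3 bytes of compiler-inserted padding
};

static_assert(offsetof(S, a) == 0, "a sits at the start of the struct");
static_assert(offsetof(S, b) == 4, "b is pushed to offset 4 by int alignment");
static_assert(sizeof(S) == 8, "1 byte + 3 padding bytes + 4 bytes");

// Mirrors the stripped sequence using nothing but a base address and offsets.
void copy_low_byte(unsigned char* base) {
    assert(base != nullptr);
    std::uint32_t word;
    std::memcpy(&word, base + 4, sizeof word);   // mov eax, [rdi+4]
    base[0] = static_cast<unsigned char>(word);  // mov byte ptr [rdi], al
}
```

Calling `copy_low_byte(reinterpret_cast<unsigned char*>(&s))` on an `S` object has the same effect as `s.a = static_cast<char>(s.b)`, which is one plausible reading of what the original source did.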
Why Debug Builds Feel “Magical”
When debug symbols are present, everything feels different.
You see:
- Function names
- Variable names
- Struct definitions
- Type information
- Source line mappings
This can trick developers into thinking:
“The binary knows my types.”
It does not.
What’s really happening is that the debugger overlays separate metadata (for example DWARF sections or a PDB file) onto raw instructions. The executable code itself still contains nothing but addresses, offsets, and opcodes.
Strip the symbols, and the illusion vanishes instantly.
The machine never knew your variable was called userCount.
It only knew you wrote 4 bytes at offset +12.
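The equivalence between a named field and a raw offset can be sketched directly. The `Session` struct and its first three fields are hypothetical, chosen only so that `userCount` falls at offset 12. `set_named` is what the programmer writes, and `set_raw` is closer to what the machine executes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical layout in which the counter happens to sit at offset 12.
struct Session {
    std::int32_t id;         // offset 0
    std::int32_t flags;      // offset 4
    std::int32_t reserved;   // offset 8
    std::int32_t userCount;  // offset 12
};

static_assert(offsetof(Session, userCount) == 12, "userCount is expected at offset 12");

// What the source says: assign to a named field.
void set_named(Session* s, std::int32_t n) {
    assert(s != nullptr);
    s->userCount = n;
}

// What the machine sees: 4 bytes written at base + 12. No name, no type.
void set_raw(void* base, std::int32_t n) {
    assert(base != nullptr);
    std::memcpy(static_cast<unsigned char*>(base) + 12, &n, sizeof n);
}
```

Both functions leave the object in the same state. Debug information is what lets a debugger show the first form when the processor only ever performed the second.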
Architecture Thinking vs Language Thinking
At this level, your mental model must change completely.
| Language Thinking | Architecture Thinking |
|---|---|
| Variables have names | Memory has addresses |
| Types define behavior | Access width defines behavior |
| Structs group fields | Offsets imply structure |
| References are special | Everything is a value |
| Types enforce safety | ABI enforces correctness |
In binaries:
- A reference is just a pointer value
- A class is just memory plus functions
- Encapsulation does not exist
- Access patterns define meaning
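The points above can be illustrated with a short sketch. All names here (`Counter`, `counter_add`, `bump_by_ref`, `bump_by_ptr`) are hypothetical. The code assumes a standard-layout class, for which the address of the object is also the address of its first member. Under the common C++ ABIs a reference parameter is passed as an address, just like a pointer, so the two `bump` functions typically compile to the same instructions.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class: at the binary level this is 4 bytes of memory
// plus functions that receive the object's address.
class Counter {
public:
    void add(int n) { value_ += n; }
    int value() const { return value_; }
private:
    int value_ = 0;
};

static_assert(std::is_standard_layout<Counter>::value, "needed for the raw access below");
static_assert(sizeof(Counter) == sizeof(int), "the object is just its one int");

// Roughly how the member function looks to the machine: a free function
// whose first argument is the address of the object's storage.
void counter_add(int* self_value, int n) {
    assert(self_value != nullptr);
    *self_value += n;
}

// A reference and a pointer both arrive as an address in a register.
void bump_by_ref(int& r) { r += 1; }
void bump_by_ptr(int* p) { assert(p != nullptr); *p += 1; }
```

Calling `counter_add(reinterpret_cast<int*>(&c), 5)` on a `Counter c` changes the private field from outside the class. The `private` keyword constrains the compiler's front end, not the generated code.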
If you reason about binaries using language thinking, you will be confused.
If you reason using architectural thinking, patterns emerge immediately.