CODE: C CPP Binaries Speak Without Symbols or Types | Amr Tarek

At the source-code level, C and C++ feel expressive. We write variables with names, we define types, we organize data into structs and classes. The language gives us vocabulary, grammar, and abstraction. But none of that survives intact once the compiler finishes its job.

When a program becomes a stripped binary, symbols disappear, types evaporate, and names die. What remains is not C++; it is architecture.

From this moment onward, the real language of the program is defined by:

The Instruction Set Architecture (ISA)
The Application Binary Interface (ABI)
Calling conventions
The memory model

To understand binaries, crashes, exploits, or stripped executables, you must stop thinking like a C++ programmer and start thinking like the machine. Understanding binaries is not about language syntax; it is about architectural contracts.

ISA: The Vocabulary of the Machine

The Instruction Set Architecture defines _what the CPU can say_. It is the machine’s vocabulary.

The ISA specifies:

Which instructions exist (mov, add, call, cmp, …)
Operand sizes (8, 16, 32, 64 bits)
Addressing modes (register, immediate, base+offset)
Register names and roles

Consider this instruction:

mov DWORD PTR [rdi+4], eax

There are no types here. No int. No struct name. No variable identifier. Yet this single line communicates a surprising amount of information.

Architecturally, we know:

A memory write is occurring
The write width is 4 bytes
The destination is at base pointer + offset
The source value comes from a 32-bit register

From behavior alone, a reverse engineer infers:

This memory location likely holds a 32-bit integer, such as an int
It is likely a field inside a structure
The field is placed at offset +4
The structure is accessed through a pointer stored in rdi

None of this comes from C++. It comes from ISA semantics.

The CPU does not know what an int is. It only knows how many bytes you touched.
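To make the inference concrete, here is a minimal sketch of source code that could plausibly sit behind such a store. The `Record` struct, its field names, and `set_count` are hypothetical and not taken from any real binary. Under the x86-64 System V convention the value would arrive in esi rather than eax, so the exact source register in the emitted instruction depends on the surrounding code and the compiler.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>   // std::int32_t

// Hypothetical record; the names are invented for this illustration.
struct Record {
    std::int32_t id;     // offset 0
    std::int32_t count;  // offset 4
};

static_assert(offsetof(Record, count) == 4, "count is expected at offset 4");

// A 4-byte store to (base pointer + 4). With optimizations enabled on
// x86-64 System V, a compiler will typically emit something like:
//     mov DWORD PTR [rdi+4], esi
void set_count(Record* r, std::int32_t value) {
    r->count = value;
}

// Usage: only the field at offset 4 changes.
void demo_set_count() {
    Record r{1, 2};
    set_count(&r, 42);
    assert(r.id == 1 && r.count == 42);
}
```

The names `Record` and `count` exist only in the source. The store width and the offset are all that reach the instruction stream.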

ABI: The Grammar That Gives Meaning

If the ISA is the vocabulary, the ABI is the grammar. It defines _how instructions relate to each other across boundaries_.

The Application Binary Interface specifies:

Stack layout
Alignment rules
How function arguments are passed
Where return values live
How structures are laid out in memory

Under the x86-64 System V ABI, for example:

int → 4 bytes, aligned to 4 bytes
char → 1 byte, aligned to 1 byte
Struct alignment = maximum alignment of its fields
The first six integer or pointer arguments go in rdi, rsi, rdx, rcx, r8, r9
Integer and pointer return values go in rax
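These rules can be checked from source with compile-time assertions. The sketch below assumes an x86-64 System V target (typical for 64-bit Linux). `Pair` and `sum6` are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t

// Hypothetical struct used only to observe the alignment rule.
struct Pair {
    char tag;
    int  value;
};

// Size and alignment of the scalar types (assumes x86-64 System V).
static_assert(sizeof(char) == 1 && alignof(char) == 1, "char: 1 byte, alignment 1");
static_assert(sizeof(int) == 4 && alignof(int) == 4, "int: 4 bytes, alignment 4");

// Struct alignment equals the largest alignment among its fields.
static_assert(alignof(Pair) ==
                  (alignof(int) > alignof(char) ? alignof(int) : alignof(char)),
              "struct alignment follows its most strictly aligned field");

// Six integer arguments: under this ABI they are assigned to
// rdi, rsi, rdx, rcx, r8, r9 in order, and the result comes back in rax.
long sum6(long a, long b, long c, long d, long e, long f) {
    return a + b + c + d + e + f;
}

// Usage: the caller never names a register; the ABI picks them.
void demo_sum6() {
    assert(sum6(1, 2, 3, 4, 5, 6) == 21);
}
```

If any assertion fails to compile, the target follows different ABI rules, and offsets recovered from a binary built for it have to be read against those rules instead.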

When a reverse engineer reconstructs a struct, they are not “guessing C++.”

They are reconstructing ABI expectations.

Architecture Replaces Symbol Names

Consider this stripped assembly sequence:

mov eax, DWORD PTR [rdi+4]
mov BYTE PTR [rdi], al

There are no symbols. No type definitions. No debug info.

Yet the architecture still tells us most of what matters.

We observe:

[rdi] is accessed as 1 byte
[rdi+4] is accessed as 4 bytes
Both use the same base register (rdi)
The offsets are consistent
Offsets 1 to 3 are never touched, leaving a gap between offset 0 and 4

From this, we infer a likely memory layout:

struct S {
    char a;      // offset 0
    char pad[3]; // implicit padding
    int b;       // offset 4
};

The padding was never written in source code, but the ABI forced it into existence.

This is critical:

**Padding is not a compiler accident. It is an architectural requirement.**

The architecture becomes the documentation.
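The inferred layout can be verified from the source side. The sketch below declares the struct without any explicit padding and lets the compiler insert it. It assumes a target where int is 4 bytes with 4-byte alignment, and `copy_low_byte` is a hypothetical function that would plausibly compile to the two-instruction sequence above.

```cpp
#include <cassert>
#include <cstddef>  // offsetof

// The struct as a programmer would write it: no padding in the source.
struct S {
    char a;
    int  b;
};

// The compiler inserts 3 bytes of padding so that b is 4-byte aligned.
static_assert(offsetof(S, a) == 0, "a sits at offset 0");
static_assert(offsetof(S, b) == 4, "b sits at offset 4, after the padding");
static_assert(sizeof(S) == 8, "1 byte + 3 bytes padding + 4 bytes");

// Loads 4 bytes from offset 4 and stores the low byte at offset 0,
// the same access pattern as the stripped assembly sequence.
void copy_low_byte(S* s) {
    s->a = static_cast<char>(s->b);
}

// Usage: the low byte of 0x1241 is 0x41, the character 'A'.
void demo_copy_low_byte() {
    S s{0, 0x1241};
    copy_low_byte(&s);
    assert(s.a == 'A' && s.b == 0x1241);
}
```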

Why Debug Builds Feel “Magical”

When debug symbols are present, everything feels different.

You see:

Function names
Variable names
Struct definitions
Type information
Source line mappings

This can trick developers into thinking:

“The binary knows my types.”

It does not.

What’s really happening is that the debugger overlays separate metadata (DWARF debug sections or a PDB file, for example) onto raw instructions. The code the CPU executes still contains nothing but addresses, offsets, and opcodes.

Strip the symbols, and the illusion vanishes instantly.

The machine never knew your variable was called userCount.

It only knew you wrote 4 bytes at offset +12.
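A small sketch shows how little the name matters. The `Stats` struct is hypothetical, laid out so that `userCount` happens to land at offset 12, and `write4_at_12` is an invented helper that knows nothing about the struct: it only writes 4 bytes at base + 12.

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>   // std::int32_t
#include <cstring>   // std::memcpy

// Hypothetical layout chosen so that userCount lands at offset 12.
struct Stats {
    std::int32_t a;          // offset 0
    std::int32_t b;          // offset 4
    std::int32_t c;          // offset 8
    std::int32_t userCount;  // offset 12
};

static_assert(offsetof(Stats, userCount) == 12, "userCount expected at offset 12");

// The machine-level view of "userCount = value": no struct, no field name,
// just a 4-byte write at a fixed offset from a base address.
void write4_at_12(void* base, std::int32_t value) {
    assert(base != nullptr);
    std::memcpy(static_cast<unsigned char*>(base) + 12, &value, sizeof value);
}
```

Calling `write4_at_12(&stats, 7)` on a `Stats` object leaves `stats.userCount == 7`, even though the helper never mentions the field.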

Architecture Thinking vs Language Thinking

At this level, your mental model must change completely.

Language Thinking	Architecture Thinking
Variables have names	Memory has addresses
Types define behavior	Access width defines behavior
Structs group fields	Offsets imply structure
References are special	Everything is a value
Types enforce safety	ABI enforces correctness

In binaries:

A reference is just a pointer value
A class is just memory plus functions
Encapsulation does not exist
Access patterns define meaning
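The first and third points can be sketched in a few lines. All names below (`bump_ref`, `bump_ptr`, `Counter`, `peek`) are hypothetical. The sketch assumes a mainstream implementation, where a reference parameter is passed as an address exactly like a pointer, and it relies on `Counter` being a standard-layout, trivially copyable class so that copying its bytes is well defined.

```cpp
#include <cassert>
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_standard_layout

// On mainstream ABIs both functions receive an address in the first
// argument register and typically compile to the same instructions.
void bump_ref(int& x) { x += 1; }
void bump_ptr(int* x) {
    assert(x != nullptr);
    *x += 1;
}

// A class with a private field: encapsulation exists only at compile time.
class Counter {
public:
    explicit Counter(int start) : value_(start) {}
    void increment() { ++value_; }
private:
    int value_;
};

static_assert(std::is_standard_layout<Counter>::value, "layout is predictable");
static_assert(sizeof(Counter) == sizeof(int), "the object is just its one field");

// Reads the private field through the object's raw bytes; no accessor
// and no friend declaration are involved.
int peek(const Counter& c) {
    int raw;
    std::memcpy(&raw, &c, sizeof raw);
    return raw;
}
```

For example, after `Counter c(41); c.increment();`, `peek(c)` returns 42: the `private` keyword left no trace in the object's memory.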

If you reason about binaries using language thinking, you will be confused.

If you reason using architectural thinking, patterns emerge immediately.