CODE: C CPP Functions Reality

Once symbols are gone, a function is no longer a thing with a name. It is simply a region of code that obeys certain architectural rules.

Functions Reality

Imagine you open your laptop, fire up a C++ project, and write this innocent line:

int add(int a, int b) {
    return a + b;
}

You smile. You just created a function… right?

Well… not according to the CPU.

The processor staring at your compiled binary has no concept of a function, no idea what add means, no clue what parameters are, and definitely no interest in your elegant C++ syntax.

It only understands 3 brutal truths:

Jump to an address
Save enough state to come back
Return when done

Everything else is a human invention.

Let’s walk through how chaos becomes structure.

The CPU’s World: Only Jumps Exist

You compile your code and inspect the binary. Among thousands of instructions, you see:

call 0x4012a0

The CPU interprets call like this:

Push the return address onto the stack
Move the instruction pointer to 0x4012a0

That’s it.

There is:

No function name
No argument list
No signature
No type
No object representing the function

It’s just:

SAVE → JUMP → EXECUTE → RETURN

A function at this level is nothing more than organized control flow.
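To make the save-and-jump mechanics concrete, here is a minimal C++ sketch of a toy machine. `ToyCpu`, `round_trip`, and the example addresses are hypothetical names invented for illustration; real hardware does this in silicon, and the sketch models nothing beyond an instruction pointer and a stack of return addresses.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical toy model: an instruction pointer and a stack of addresses.
struct ToyCpu {
    std::uint64_t ip = 0;              // instruction pointer
    std::vector<std::uint64_t> stack;  // top of stack is the back of the vector

    // call: push the address of the next instruction, then jump.
    void call(std::uint64_t target, std::uint64_t return_address) {
        stack.push_back(return_address);
        ip = target;
    }

    // ret: pop the saved address and jump back to it.
    void ret() {
        ip = stack.back();
        stack.pop_back();
    }
};

// Runs one call/ret round trip and reports where execution resumes.
std::uint64_t round_trip(std::uint64_t call_site, std::uint64_t target) {
    ToyCpu cpu;
    cpu.ip = call_site;
    // Assumption: a near call with a 32-bit offset, which is 5 bytes long.
    cpu.call(target, call_site + 5);
    cpu.ret();
    return cpu.ip;
}
```

Notice that nothing in `ToyCpu` knows what lives at the target address. The only thing tying the callee back to the caller is one saved number.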

The Stack Frame: A Reverse Engineer’s Compass

Most functions follow a recognizable pattern called a stack frame. Not because the CPU needs it, but because compilers following the ABI emit it by convention, especially in unoptimized builds.

A common x86-64 System V ABI prologue looks like:

push rbp
mov rbp, rsp
sub rsp, 0x20

Let’s decode it like a detective:

Instruction	Meaning we infer
`push rbp`	Save old frame base
`mov rbp, rsp`	Establish new frame pointer
`sub rsp, 0x20`	Reserve 32 bytes for locals

The CPU sees only memory reservation.

The reverse engineer sees:

Local variables exist at fixed rbp offsets
The function expects 16-byte stack alignment
rbp marks a function boundary

Even in stripped binaries, this pattern reveals structure.
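The pointer arithmetic behind that prologue can be sketched in C++. `FramePointers`, `run_prologue`, and `local_address` are hypothetical helpers, and only the register arithmetic is modeled (the memory write performed by `push` is left out).

```cpp
#include <cstdint>

// Hypothetical snapshot of the two registers the prologue touches.
struct FramePointers {
    std::uint64_t rsp;
    std::uint64_t rbp;
};

// Models: push rbp / mov rbp, rsp / sub rsp, locals_size.
constexpr FramePointers run_prologue(FramePointers regs, std::uint64_t locals_size) {
    regs.rsp -= 8;            // push rbp: the stack grows down by one 8-byte slot
    regs.rbp = regs.rsp;      // mov rbp, rsp: establish the new frame base
    regs.rsp -= locals_size;  // sub rsp, N: reserve space for locals
    return regs;
}

// Address of a local stored at [rbp - offset].
constexpr std::uint64_t local_address(const FramePointers& regs, std::uint64_t offset) {
    return regs.rbp - offset;
}
```

If `rsp` is 8 past a 16-byte boundary on entry (the state right after `call` pushed the return address), then pushing `rbp` and subtracting `0x20` leaves it 16-byte aligned again, which is the alignment a reverse engineer expects to see.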

Example: after compiling our add function without optimization, it may look like:

push rbp
mov rbp, rsp
mov DWORD PTR [rbp-0x4], edi
mov DWORD PTR [rbp-0x8], esi
mov edx, DWORD PTR [rbp-0x4]
mov eax, DWORD PTR [rbp-0x8]
add eax, edx
pop rbp
ret

Notice something magical:

There are no variable names, but we still understand that:

edi and esi are arguments
They were stored as locals on the stack
Result returned in eax

Local variables exist without identity — just offsets and lifetimes.

Arguments Without Parameters

In C++ we declare:

int sum(int a, int b);

But the CPU only sees this:

mov edi, 5
mov esi, 7
call 0x401180

Because the System V ABI defines:

1st argument → rdi
2nd argument → rsi
Return → rax

We can infer the missing signature instantly.

Another example with more parameters:

void log_values(int a, int b, int c, int d, int e, int f, int g) { }

The compiled call will look like:

mov edi, 1      ; rdi
mov esi, 2      ; rsi
mov edx, 3      ; rdx
mov ecx, 4      ; rcx
mov r8d, 5      ; r8
mov r9d, 6      ; r9
push 7          ; 7th → stack
call 0x401200
add rsp, 8      ; stack cleanup

Even without names, the contract gives them meaning.
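The register order above is mechanical enough to write down as a lookup. The sketch below is a hypothetical helper, `integer_arg_location`, covering only integer and pointer arguments; floating-point arguments and large structs follow different rules and are deliberately left out.

```cpp
#include <array>
#include <cstddef>
#include <string>

// Where the n-th integer or pointer argument travels under the
// x86-64 System V convention. The index is zero-based.
std::string integer_arg_location(std::size_t index) {
    static const std::array<const char*, 6> kRegisters = {
        "rdi", "rsi", "rdx", "rcx", "r8", "r9"};
    if (index < kRegisters.size()) {
        return kRegisters[index];
    }
    return "stack";  // the seventh argument onward is passed in memory
}
```

For `log_values`, indices 0 through 5 map to the six registers and index 6 (the value `7`) lands on the stack, matching the call sequence shown earlier.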

Return Values Without Types

Look at this code fragment in assembly:

call 0x401180
add eax, 1

We know:

The function returned a value
It fits in 32 bits
It is being used as an integer
It came through rax/eax

The type is defined by usage pattern, not declaration.

Floating-point example (call plus usage in assembly):

movss xmm0, [a]
movss xmm1, [b]
call 0x401300
addss xmm0, [c]   ; returned float (in xmm0) used in FP math
ret

We know the function returns float because it returns via xmm0 and is consumed by FP instructions.
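For comparison, here is what the source side of those two patterns might have looked like. The function names and bodies are hypothetical; the comments describe what an unoptimized x86-64 System V build would typically do, and an optimizer is free to inline everything away.

```cpp
// Hypothetical callee: an int result comes back in eax (the low half of rax).
int next_count(int current) { return current + 1; }

// Hypothetical callee: a float result comes back in xmm0.
float scaled_sum(float a, float b) { return (a + b) * 0.5f; }

// The caller consumes the result with integer arithmetic (an add on eax).
int use_int() { return next_count(41) + 1; }

// The caller consumes the result with floating-point arithmetic (addss on xmm0).
float use_float() { return scaled_sum(1.0f, 3.0f) + 0.25f; }
```

The declared types disappear in the binary, but the choice of return lane and the instructions that consume it leave the same information behind.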

When Stack Frames Disappear (Optimization Strikes Back)

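You enable -O2 and compile again. The compiler typically stops maintaining rbp as a frame pointer, and suddenly your detective compass is gone. A function that still needs stack slots might look like: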

sub rsp, 0x18
mov DWORD PTR [rsp+0xc], edi
mov DWORD PTR [rsp+0x8], esi
add edi, esi
mov eax, edi
add rsp, 0x18
ret

There is:

No rbp
No stable frame
No variable offsets tied to a base pointer

But the function still exists.

It just got… naked.

Reverse engineers now track:

Stack deltas
Slot lifetimes
Register behavior
Return instructions

The complexity didn’t increase — the training wheels were removed.
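Tracking stack deltas is simple bookkeeping, and a rough C++ sketch shows the idea. `StackOp` and its mnemonic strings are a made-up, heavily simplified instruction format, not the output of any real disassembler.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified instruction record.
struct StackOp {
    std::string mnemonic;    // "push", "pop", "sub_rsp", "add_rsp", or anything else
    std::int64_t immediate;  // byte count for sub_rsp / add_rsp, ignored otherwise
};

// Net change to rsp, in bytes, after running all instructions in order.
std::int64_t stack_delta(const std::vector<StackOp>& code) {
    std::int64_t delta = 0;
    for (const StackOp& op : code) {
        if (op.mnemonic == "push") {
            delta -= 8;
        } else if (op.mnemonic == "pop") {
            delta += 8;
        } else if (op.mnemonic == "sub_rsp") {
            delta -= op.immediate;
        } else if (op.mnemonic == "add_rsp") {
            delta += op.immediate;
        }
        // Every other instruction leaves rsp alone in this model.
    }
    return delta;
}

// A function body must bring rsp back to its entry value before ret,
// otherwise ret would pop something other than the return address.
bool returns_cleanly(const std::vector<StackOp>& body) {
    return stack_delta(body) == 0;
}
```

A body that reserves `0x18` bytes and never releases them has a delta of -24 at `ret`, which is exactly the kind of imbalance this bookkeeping is meant to catch.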

Leaf & Tail-Call Functions: The Minimalist Monks

Some functions never touch the stack beyond the return address that call pushed:

add edi, esi
mov eax, edi
ret

This is a leaf function (calls nothing else).

Others may end with a tail call:

jmp 0x401500   ; tail call instead of call+ret

Still functions, still valid — just different shape.
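In source form, the two shapes might come from code like this. Both functions are hypothetical, and whether the second one really compiles to a `jmp` depends on the compiler and its flags; the comments describe a typical optimized build, not a guarantee.

```cpp
// Leaf function: it calls nothing, so it needs no frame of its own.
int add_leaf(int a, int b) { return a + b; }

// The call is the very last thing this function does, so an optimizing
// compiler may emit `jmp add_leaf` instead of `call add_leaf` plus `ret`.
// add_leaf's own ret then returns straight to our caller.
int add_then_forward(int a, int b) { return add_leaf(a + 1, b); }
```

For function discovery this matters: a tail-called function may never appear as the target of a `call`, only as the target of a `jmp`.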

Caller-Saved vs Callee-Saved: The Invisible Discipline

The ABI defines register survival rules, and compilers follow them:

Caller-saved	Callee-saved
`rax, rcx, rdx, rdi, rsi, r8–r11`	`rbx, rbp, r12–r15`

So when you see:

push rbx
...
pop rbx

You instantly know:

This is a function boundary
It modified rbx
It restored it to obey the contract

Even stripped binaries reveal function edges by this behavior.
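The table translates directly into a small lookup. `is_callee_saved` is a hypothetical helper that only knows the 64-bit general-purpose register names; `rsp` is included because it must hold its entry value again by the time the callee returns.

```cpp
#include <array>
#include <string_view>

// True if the x86-64 System V convention requires a callee to hand this
// register back with the value it had on entry.
bool is_callee_saved(std::string_view reg) {
    constexpr std::array<std::string_view, 7> kPreserved = {
        "rbx", "rbp", "rsp", "r12", "r13", "r14", "r15"};
    for (std::string_view name : kPreserved) {
        if (name == reg) {
            return true;
        }
    }
    return false;
}
```

A `push` of a register for which this returns true, paired with a matching `pop` before `ret`, is the save/restore symmetry described above.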

How Reverse Engineers Discover Functions

They don’t search for names.

They search for contracts and patterns:

call instructions
Prologue sequences
ret instructions
Register save/restore symmetry
Stack alignment fixes
Instruction pointer destinations

Functions are inferred, not declared.
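The first heuristic on that list, following `call` instructions, can be sketched in a few lines. `DecodedInsn` is a hypothetical record standing in for whatever a real disassembler would produce, and the sketch handles direct calls only.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified view of one decoded instruction.
struct DecodedInsn {
    std::uint64_t address;
    std::string mnemonic;
    std::uint64_t target;  // only meaningful for a direct call
};

// Every address that some instruction calls is a candidate function start.
std::set<std::uint64_t> call_targets(const std::vector<DecodedInsn>& code) {
    std::set<std::uint64_t> starts;
    for (const DecodedInsn& insn : code) {
        if (insn.mnemonic == "call") {
            starts.insert(insn.target);
        }
    }
    return starts;
}
```

Real tools combine this with the other signals (prologue patterns, `ret` positions, save/restore symmetry) because indirect calls and tail calls leave no direct `call` target to follow.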

Why This Matters to C++ System Designers

Because bad design becomes bad binary behavior.

Expensive API

struct Big { char data[512]; };

Big process(Big input);

ABI sees:

512 bytes copied by value
A hidden sret pointer passed in rdi for the return value
Stack used unnecessarily

ABI-aware fix

void process(const Big& input, Big& output);

Now:

No large copy
The output pointer is explicit, not hidden
Registers carry meaning
Stack stays clean

This is why high-performance C++ requires ABI awareness.
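Here are the two API shapes side by side as a compilable sketch. The function names and the byte-flipping "work" are placeholders invented for illustration; the original only gives the declarations.

```cpp
#include <cstddef>

struct Big { char data[512]; };
static_assert(sizeof(Big) == 512, "Big is exactly its 512-byte payload");

// By-value shape: the caller copies the argument for the call, and the
// result is written into a return slot the caller provides.
Big process_by_value(Big input) {
    for (std::size_t i = 0; i < sizeof(input.data); ++i) {
        input.data[i] = static_cast<char>(input.data[i] ^ 0x20);  // placeholder work
    }
    return input;
}

// Reference shape: both parameters travel as plain pointers in registers,
// and the only bytes written are the ones the work itself produces.
void process_by_reference(const Big& input, Big& output) {
    for (std::size_t i = 0; i < sizeof(input.data); ++i) {
        output.data[i] = static_cast<char>(input.data[i] ^ 0x20);  // placeholder work
    }
}
```

Both versions compute the same result. The difference is in how many 512-byte blocks move around the call, which is visible in the binary long after the types are gone.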

The Core Definition

A function exists in machine code if and only if:

Arguments arrive in the correct places
Required state survives the call
Control returns to the right address

int foo();

is decoration.

This:

jump → save → restore → ret

is reality.

When you looked at:

mov edi, 5
mov esi, 7
call 0x401180

You asked yourself:

“What are edi, esi, edx… and why do they sometimes hold hex like 0x5 or 0xA?”

Let’s answer that clearly.

Register Names: Not Variables — Just Lanes for Data

In x86-64 (the architecture your examples come from), registers are general-purpose storage slots inside the CPU.

Each register has multiple “views” depending on size:

64-bit	32-bit view	16-bit view	8-bit view
`rdi`	`edi`	`di`	`dil`
`rsi`	`esi`	`si`	`sil`
`rdx`	`edx`	`dx`	`dl`
`rcx`	`ecx`	`cx`	`cl`
`r8`	`r8d`	`r8w`	`r8b`
`r9`	`r9d`	`r9w`	`r9b`
`rax`	`eax`	`ax`	`al`

So:

edi is not a variable named "edi"
It is the low 32 bits of register rdi
Compilers use it to pass int values by convention
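The "views" are just truncations of the same 64 bits, which a few casts can demonstrate. The helper names below are hypothetical; the last one models a real x86-64 rule, namely that writing a 32-bit view zeroes the upper half of the full register.

```cpp
#include <cstdint>

// Narrower views of one 64-bit register value (think rdi).
constexpr std::uint32_t view32(std::uint64_t r) { return static_cast<std::uint32_t>(r); }  // edi
constexpr std::uint16_t view16(std::uint64_t r) { return static_cast<std::uint16_t>(r); }  // di
constexpr std::uint8_t view8(std::uint64_t r) { return static_cast<std::uint8_t>(r); }     // dil

// mov edi, imm32: the old contents are discarded and the upper 32 bits
// of rdi become zero. (Writes to the 16-bit and 8-bit views do not do this.)
constexpr std::uint64_t write32(std::uint64_t /*old_value*/, std::uint32_t value) {
    return value;
}

static_assert(view32(0x1122334455667788ULL) == 0x55667788u, "edi is the low 32 bits");
static_assert(write32(0xFFFFFFFFFFFFFFFFULL, 5u) == 5u, "32-bit writes zero-extend");
```

This is why `mov edi, 5` is enough to pass an `int`: the callee reads `edi`, and the upper half of `rdi` is already cleared.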

Why Do We See Hex Values in Registers?

Because assembly shows you the actual literal values being placed in them.

Example in C++:

int a = 10;
int b = 15;
int c = a + b;

Compiler might generate:

mov edi, 0xA     ; 10 decimal → 0xA in hex
mov esi, 0xF     ; 15 decimal → 0xF in hex
add edi, esi     ; 10 + 15 = 25
mov eax, edi     ; result: 25 decimal → 0x19 in hex

So hex is just another number format:

Decimal	Hex
5	`0x5`
10	`0xA`
15	`0xF`
25	`0x19`
255	`0xFF`

The CPU doesn’t care — it’s all binary.
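You can check the table from C++ itself, since hex and decimal literals are two spellings of the same value. `to_hex` is a small hypothetical helper that prints a number the way a disassembler listing would.

```cpp
#include <sstream>
#include <string>

// Hex and decimal literals name exactly the same integers.
static_assert(0x5 == 5 && 0xA == 10 && 0xF == 15 && 0x19 == 25 && 0xFF == 255,
              "hex is only a notation");

// Formats a value as uppercase hex with a 0x prefix.
std::string to_hex(unsigned value) {
    std::ostringstream out;
    out << "0x" << std::uppercase << std::hex << value;
    return out.str();
}
```

The `static_assert` compiling at all is the point: the compiler sees no difference between `0x19` and `25`.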

Role of Common Registers in Function Calls (System V ABI)

By contract:

Argument order	Register lane
1st	`rdi` (32-bit: `edi`)
2nd	`rsi` (`esi`)
3rd	`rdx` (`edx`)
4th	`rcx` (`ecx`)
5th	`r8` (`r8d`)
6th	`r9` (`r9d`)

So when you see this call:

mov edi, 0x2C       ; 44 decimal
mov esi, 0x64       ; 100 decimal
mov edx, 0xFF       ; 255 decimal
call 0x401180

You infer:

someFunction(44, 100, 255);

Even though the signature was stripped.
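That inference step can be written as a tiny formatter. `reconstruct_call` is a hypothetical helper that takes the values observed in the argument registers, in ABI order, and prints the call a reverse engineer would write down; the function name is a guess the analyst supplies.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds a readable call expression from values seen in rdi, rsi, rdx, ...
std::string reconstruct_call(const std::string& guessed_name,
                             const std::vector<long long>& register_values) {
    std::string text = guessed_name + "(";
    for (std::size_t i = 0; i < register_values.size(); ++i) {
        if (i != 0) {
            text += ", ";
        }
        text += std::to_string(register_values[i]);
    }
    return text + ")";
}
```

Feeding it `{0x2C, 0x64, 0xFF}` yields the same `someFunction(44, 100, 255)` as the manual reading.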

Function Returns Also Use a Lane

Return values come back via rax:

call 0x401180
mov ebx, eax       ; copying returned int into ebx

Equivalent to C++:

int result = foo();
int x = result;

A Mini Story Example Putting It All Together

C++ Code

int compute(int x, int y, int z) {
    return (x + y) * z;
}

int main() {
    int a = 4;
    int b = 5;
    int c = 3;
    int r = compute(a, b, c);
    r += 1;
}

What the CPU Actually Sees (one possible compiled output)

main:
    mov edi, 0x4      ; a = 4
    mov esi, 0x5      ; b = 5
    mov edx, 0x3      ; c = 3
    call 0x401200     ; compute(a, b, c)

    add eax, 0x1      ; r += 1
    ret

401200 <compute>:
    add edi, esi      ; x + y  → edi now = 9 (0x9)
    imul edi, edx     ; (x + y) * z → 9 * 3 = 27 (0x1B)
    mov eax, edi      ; return value → rax lane
    ret

Reverse Engineer Interpretation

Function starts at 0x401200
3 arguments passed in rdi, rsi, rdx
Math performed using integer ops (add, imul)
Return used as 32-bit integer (eax)
Control returned via ret
Caller modified return value after call

All meaning reconstructed without symbols.
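To check the arithmetic in the comments, the listing can be replayed one instruction at a time in C++. `Regs`, `compute_lowered`, and `main_lowered` are hypothetical names; each statement mirrors one line of the assembly and nothing more.

```cpp
#include <cstdint>

// Hypothetical register file holding only the lanes this example uses.
struct Regs {
    std::uint32_t edi = 0;
    std::uint32_t esi = 0;
    std::uint32_t edx = 0;
    std::uint32_t eax = 0;
};

// Mirrors the body at 0x401200.
void compute_lowered(Regs& r) {
    r.edi += r.esi;  // add edi, esi   -> x + y
    r.edi *= r.edx;  // imul edi, edx  -> (x + y) * z
    r.eax = r.edi;   // mov eax, edi   -> return value lane
}

// Mirrors main: load the argument lanes, call, then adjust the result.
std::uint32_t main_lowered() {
    Regs r;
    r.edi = 0x4;         // mov edi, 0x4
    r.esi = 0x5;         // mov esi, 0x5
    r.edx = 0x3;         // mov edx, 0x3
    compute_lowered(r);  // call 0x401200
    r.eax += 0x1;        // add eax, 0x1
    return r.eax;
}
```

After the call `eax` holds 27 (0x1B), and the final `add` leaves 28 (0x1C), which is the value of `r` at the end of the C++ `main`.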

Key Insight for You Going Forward

When you see register names + hex values:

They are value lanes, not variables
Hex is just the literal numeric representation
The ABI gives them semantic meaning
You can map them back to C/C++ parameters
Function boundaries are recognized by behavior symmetry, not names