x86 and x64 Calling Conventions: cdecl, stdcall, fastcall, and System V
Objective: Understand how the five major calling conventions (cdecl, stdcall, fastcall, the Microsoft x64 ABI, and the System V AMD64 ABI) dictate argument passing, register ownership, stack cleanup, and alignment, and why those rules determine where return addresses and arguments sit in memory when a vulnerability is triggered.
1. Why Calling Conventions Matter for Exploit Development
A calling convention is the contract between a caller and a callee. It specifies how arguments are passed (stack or registers), where the return value lands, which registers the callee must preserve, and who cleans up the stack. None of this is arbitrary — it is fixed by the ABI for a given platform and compiler.
For a defender or authorized red-teamer, this matters because stack layout is deterministic. When a local buffer overflows, the bytes that land on the saved return address are determined entirely by the convention in force. Reliable overflow payloads, return-to-libc chains, and ROP gadgets all depend on knowing precisely where the return address, arguments, and saved registers sit. Get the convention wrong and your offset math is wrong.
2. Stack Mechanics Refresher: PUSH, POP, CALL, RET
The stack grows downward (toward lower addresses). PUSH decrements the stack pointer (ESP/RSP) and writes; POP reads and increments it.
CALL target pushes the return address (the address of the instruction after the CALL, i.e. the next EIP/RIP) onto the stack, then jumps. RET pops that saved address back into the instruction pointer. RET N pops the address and then adds N to ESP; this is how a callee cleans caller-pushed arguments.
push arg1 ; arg on stack
call foo ; pushes return address, jumps to foo
add esp, 4 ; caller cleans 1 dword arg (cdecl)
Because CALL writes the return address to a predictable slot, any write primitive that reaches that slot redirects control flow. The conventions below differ in how the arguments around that slot are arranged and in who removes them afterward.
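The PUSH/POP/CALL/RET rules above can be sketched as a toy stack model. This is a minimal illustration, not an emulator: the class name Stack32, the starting ESP of 0x1000, and the two code addresses are all made up for the example.

```python
class Stack32:
    """Toy model of a 32-bit downward-growing stack (illustrative only)."""

    def __init__(self, esp=0x1000):
        self.esp = esp   # stack pointer
        self.mem = {}    # address -> dword value
        self.eip = 0     # instruction pointer

    def push(self, value):
        self.esp -= 4                # PUSH decrements ESP first...
        self.mem[self.esp] = value   # ...then writes at the new top

    def pop(self):
        value = self.mem[self.esp]   # POP reads the top...
        self.esp += 4                # ...then increments ESP
        return value

    def call(self, target, return_addr):
        self.push(return_addr)       # CALL saves the next instruction's address
        self.eip = target

    def ret(self, n=0):
        self.eip = self.pop()        # RET restores the saved address
        self.esp += n                # RET N also discards N bytes of arguments


s = Stack32()
s.push(0x11111111)                             # push arg1
s.call(target=0x401000, return_addr=0x400105)  # hypothetical addresses
ret_slot = s.esp                               # where the return address lives
s.ret()                                        # back in the caller
s.esp += 4                                     # add esp, 4 (cdecl cleanup)
```

After the sequence runs, ESP is back at its starting value, and ret_slot records the one address whose contents decided where RET went.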
3. x86 cdecl: The C Standard
__cdecl is the default for C functions on 32-bit x86 (MSVC flag /Gd). Arguments are pushed right to left, and the caller cleans the stack. The return value comes back in EAX. C names are decorated with a single leading underscore (_foo), no case translation.
Because the caller cleans up, cdecl is the only x86 convention that supports variadic functions (printf-style va_list) — the callee never needs to know the argument count.
; foo(1, 2, 3); -- cdecl
push 3 ; rightmost first
push 2
push 1 ; leftmost last
call _foo
add esp, 12 ; CALLER cleans 3 dwords
Canonical x86 stack frame once the prologue has run (high → low address):
[arg N] ← pushed first (rightmost)
[arg 2]
[arg 1] ← pushed last (leftmost)
[return address] ← pushed by CALL
[saved EBP] ← pushed by prologue (PUSH EBP)
[local vars] ← ESP after SUB ESP, N
The saved EBP and return address are the primary targets of a stack-based overflow. A write that runs past the end of a local buffer moves toward higher addresses, so it overwrites the saved EBP first and the return address next.
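The frame diagram can be turned into EBP-relative offsets. The sketch below assumes 4-byte arguments and a standard PUSH EBP / MOV EBP, ESP prologue; the function name cdecl_frame_offsets and the 16-byte local area are hypothetical choices for illustration.

```python
def cdecl_frame_offsets(num_args, locals_size):
    """Return {name: offset from EBP} for a frame-pointer-based x86 frame.

    Assumes 4-byte arguments and a PUSH EBP / MOV EBP, ESP prologue.
    """
    layout = {}
    for i in range(num_args):
        layout[f"arg{i + 1}"] = 8 + 4 * i    # args start above the return address
    layout["return_address"] = 4             # directly above the saved EBP
    layout["saved_ebp"] = 0                  # EBP points at its own saved copy
    layout["locals_start"] = -locals_size    # lowest address of the local area
    return layout


frame = cdecl_frame_offsets(num_args=3, locals_size=16)
# A buffer at the bottom of the 16-byte local area reaches the return
# address after 16 bytes of locals plus 4 bytes of saved EBP.
distance_to_ret = frame["return_address"] - frame["locals_start"]
```

Under these assumptions arg1 is at [ebp+8], and the distance from the start of the local area to the return address is locals_size + 4. Real compilers may reorder locals or insert padding, so treat the numbers as a starting point to confirm in a debugger.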

4. x86 stdcall: The Windows API Convention
__stdcall is the convention for the Win32 API. Arguments still push right to left, but the callee cleans the stack using RET N. This is efficient for fixed-argument functions, but it forbids variadics.
Name decoration encodes the byte count of stack arguments: a leading underscore, an @, then the size in bytes (always a multiple of 4). MessageBoxA with four pointer/int args becomes _MessageBoxA@16.
; foo(1, 2); -- stdcall, two dword args
push 2
push 1
call _foo@8
; NO add esp here — callee handled it
foo:
; ... body ...
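ret 8 ; CALLEE pops 8 bytes of args
For shellcode and custom loaders, the @N suffix matters at link time: the decorated name must match the symbol in the object file or import library. Export tables are a separate question. Many DLLs, including the core Win32 system DLLs, export undecorated names (MessageBoxA rather than _MessageBoxA@16), so check the actual export table before resolving or patching Import Address Table entries by name.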
5. x86 fastcall: Register-Based Argument Passing
__fastcall (MSVC flag /Gr) passes the first two integer arguments in ECX and EDX; remaining arguments push right to left, and the callee cleans them. Decoration uses a leading @ (e.g. @foo@8). All __fastcall functions must have prototypes.
; foo(1, 2, 3); -- MSVC fastcall
mov ecx, 1 ; arg1 in ECX
mov edx, 2 ; arg2 in EDX
push 3 ; arg3 on stack
call @foo@12
⚠️ Compiler variance: __fastcall is not standardized across compilers. MSVC uses ECX/EDX. Borland passes the first three arguments in EAX, EDX, ECX. When reversing a non-MSVC binary, verify register usage before trusting any decompiler’s __fastcall label.
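The three 32-bit MSVC decoration rules described in sections 3 to 5 can be collected into one helper. This is a small sketch; the function name decorate and its parameters are invented here, and the caller is assumed to pass the total parameter size already rounded up to a multiple of 4.

```python
def decorate(name, convention, arg_bytes=0):
    """Decorate a C function name the way 32-bit MSVC does.

    arg_bytes is the total parameter size in bytes (a multiple of 4).
    """
    if convention == "cdecl":
        return f"_{name}"                 # size is not encoded
    if convention == "stdcall":
        return f"_{name}@{arg_bytes}"     # leading underscore, @, byte count
    if convention == "fastcall":
        return f"@{name}@{arg_bytes}"     # register arguments are counted too
    raise ValueError(f"unknown convention: {convention}")


examples = [
    decorate("foo", "cdecl"),
    decorate("MessageBoxA", "stdcall", 16),
    decorate("foo", "fastcall", 12),
]
```

The examples list reproduces the names used in the text: _foo, _MessageBoxA@16, and @foo@12.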
6. Microsoft x64 ABI: The Modern Windows Convention
On Windows x64 there is effectively one ABI; the /Gd, /Gr, /Gz flags only exist for x86 targets. The convention is a four-register fastcall:
| Argument slot | Integer register | Float register |
|---|---|---|
| 1 | RCX | XMM0 |
| 2 | RDX | XMM1 |
| 3 | R8 | XMM2 |
| 4 | R9 | XMM3 |
Key rules:
- One-to-one correspondence: each argument maps to exactly one register/slot; a single argument is never split across registers.
- Any argument larger than 8 bytes, or not sized 1/2/4/8 bytes, is passed by reference.
- Arguments beyond the first four go on the stack, immediately above the shadow space.
- The stack must be 16-byte aligned before CALL.
- The x87 register stack is not used for argument passing and should be treated as volatile across calls; floating-point work uses the 16 XMM registers.
Shadow space (home space): the caller must allocate 32 bytes on the stack before the CALL, even if the callee takes fewer than four arguments, and reclaim it afterward. The callee may spill RCX/RDX/R8/R9 into this region.
; foo(a, b, c, d) -- Microsoft x64
mov rcx, a
mov rdx, b
mov r8, c
mov r9, d
sub rsp, 20h ; 32 bytes shadow space (caller's job)
call foo
add rsp, 20h ; reclaim shadow space
Volatile (caller-saved): RAX, RCX, RDX, R8, R9, R10, R11, XMM0–XMM5.
Non-volatile (callee-saved): RBX, RBP, RDI, RSI, RSP, R12–R15, XMM6–XMM15.
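The slot-based mapping and the shadow-space rule can be sketched as a small allocator. It handles only scalar arguments labelled 'int' or 'float' (a simplification made for this example), and the function name ms_x64_assign is hypothetical. Stack locations are relative to RSP at the moment of the CALL, before the return address is pushed.

```python
MS_INT_REGS = ["RCX", "RDX", "R8", "R9"]
MS_FP_REGS = ["XMM0", "XMM1", "XMM2", "XMM3"]


def ms_x64_assign(arg_kinds):
    """Map argument kinds ('int' or 'float') to Microsoft x64 locations."""
    locations = []
    for slot, kind in enumerate(arg_kinds):
        if kind not in ("int", "float"):
            raise ValueError(f"unsupported kind: {kind}")
        if slot < 4:
            # The slot index, not the kind, selects the register number.
            regs = MS_FP_REGS if kind == "float" else MS_INT_REGS
            locations.append(regs[slot])
        else:
            # Slots 0-3 own the 32-byte shadow space; later args follow it.
            locations.append(f"[rsp+0x{8 * slot:x}]")
    return locations


mixed = ms_x64_assign(["int", "float", "int", "float", "int"])
```

For the mixed call the second argument lands in XMM1 rather than XMM0, because the register number follows the argument position, and the fifth argument sits at [rsp+0x20], just past the shadow space.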

7. System V AMD64 ABI: The Linux and macOS Convention
System V AMD64 is followed on Linux, macOS, FreeBSD, Solaris, and other POSIX systems. It uses six integer argument registers:
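| Nth argument of its class | Integer register | Float register |
|---|---|---|
| 1 | RDI | XMM0 |
| 2 | RSI | XMM1 |
| 3 | RDX | XMM2 |
| 4 | RCX | XMM3 |
| 5 | R8 | XMM4 |
| 6 | R9 | XMM5 |
| 7 | (stack) | XMM6 |
| 8 | (stack) | XMM7 |
Integer and floating-point arguments are counted separately: the third integer argument goes in RDX even if floating-point arguments precede it.
Arguments that do not fit in registers are pushed onto the stack in reverse (right-to-left) order. Integer return values come back in RAX, with the high 64 bits of a 128-bit integer in RDX; floating-point values return in XMM0. The stack must be 16-byte aligned just before CALL.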
- Callee-saved: RBX, RBP, RSP, R12–R15. All others are caller-saved.
- Red zone: the 128 bytes below RSP are reserved and are not modified by signal or interrupt handlers. Leaf functions may use this area as their entire frame without adjusting RSP.
- Syscall variant: kernel entry uses the same registers except that R10 replaces RCX, because the syscall instruction overwrites RCX (with the return address) and R11 (with RFLAGS).
- Varargs: when calling a variadic function, AL must hold an upper bound on the number of vector (XMM) registers used, 0–8.
; write(1, buf, len) via syscall -- System V
mov rax, 1 ; sys_write
mov rdi, 1 ; fd (arg1)
mov rsi, buf ; buffer (arg2)
mov rdx, len ; count (arg3)
; NOTE: a syscall uses R10 in place of RCX for arg4
syscall
; leaf function may freely use [rsp-128 .. rsp] (red zone)
⚠️ Shadow space and the red zone are mutually exclusive and commonly confused. Shadow space (32 bytes directly above the return address, allocated by the caller) exists only on Windows x64. The red zone (128 bytes below RSP) exists only on System V. Never assume both.
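The System V register assignment can be sketched the same way. The example covers only scalar 'int' and 'float' arguments (no structs, no x87 long double), and the function name sysv_assign is hypothetical. Stack locations are relative to RSP at the moment of the CALL.

```python
SYSV_INT_REGS = ["RDI", "RSI", "RDX", "RCX", "R8", "R9"]
SYSV_FP_REGS = [f"XMM{i}" for i in range(8)]


def sysv_assign(arg_kinds):
    """Map argument kinds ('int' or 'float') to System V AMD64 locations."""
    next_int = next_fp = stack_slot = 0
    locations = []
    for kind in arg_kinds:
        if kind not in ("int", "float"):
            raise ValueError(f"unsupported kind: {kind}")
        if kind == "float" and next_fp < len(SYSV_FP_REGS):
            locations.append(SYSV_FP_REGS[next_fp])
            next_fp += 1
        elif kind == "int" and next_int < len(SYSV_INT_REGS):
            locations.append(SYSV_INT_REGS[next_int])
            next_int += 1
        else:
            # Registers of this class are exhausted: use the stack.
            locations.append(f"[rsp+0x{8 * stack_slot:x}]")
            stack_slot += 1
    return locations


mixed = sysv_assign(["int", "float", "int", "float"])
```

Here the two classes keep independent counters, so the mixed call yields RDI, XMM0, RSI, XMM1. That is the main behavioral difference from a position-based scheme, where the second argument's register number would follow its position in the list.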

8. Side-by-Side Comparison and ABI Detection in Disassembly
| Property | Microsoft x64 | System V AMD64 |
|---|---|---|
| Integer arg registers | RCX, RDX, R8, R9 | RDI, RSI, RDX, RCX, R8, R9 |
| FP arg registers | XMM0–XMM3 | XMM0–XMM7 |
| Shadow space | 32 bytes (mandatory) | None |
| Red zone | None | 128 bytes below RSP |
| Callee-saved | RBX, RBP, RDI, RSI, R12–R15, XMM6–15 | RBX, RBP, R12–R15 |
Recognition heuristics in IDA/Ghidra:
- A sub rsp, 0x20 (or larger, often folded into the prologue) ahead of the CALL, with arguments loaded into RCX/RDX/R8/R9 ⇒ Microsoft x64.
- Arguments loaded into RDI/RSI/RDX and writes into [rsp-8] without a prior sub rsp ⇒ System V (red zone).
- A ret N (non-zero immediate) in 32-bit code ⇒ stdcall or fastcall; arguments in ECX/EDX distinguish fastcall.
- A bare ret with a caller-side add esp, N ⇒ cdecl.
Automated ABI detection can misfire on hand-written assembly, non-MSVC fastcall, or -fomit-frame-pointer builds — always confirm against the actual prologue.
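The 32-bit heuristics can be written down as a rough classifier over instruction text. This is only a sketch under strong assumptions: instructions arrive as plain strings in Intel syntax, the function name guess_x86_convention is made up, and cases such as thiscall (which also loads ECX) or a fastcall function with no stack arguments are not distinguished.

```python
import re


def guess_x86_convention(before_call, after_call, callee_ret):
    """Guess cdecl/stdcall/fastcall from a 32-bit call site.

    before_call / after_call: instruction strings around the CALL.
    callee_ret: the callee's return instruction, e.g. 'ret' or 'ret 8'.
    """
    ret_parts = callee_ret.lower().split()
    # 'ret N' with a non-zero immediate means the callee cleans the stack.
    callee_cleans = len(ret_parts) == 2 and ret_parts[1] not in ("0", "0h", "0x0")
    loads_ecx_edx = any(
        re.match(r"mov\s+(ecx|edx)\s*,", ins.lower()) for ins in before_call
    )
    caller_cleans = any(
        re.match(r"add\s+esp\s*,", ins.lower()) for ins in after_call
    )
    if callee_cleans:
        return "fastcall" if loads_ecx_edx else "stdcall"
    if caller_cleans:
        return "cdecl"
    return "unknown"


guesses = [
    guess_x86_convention(["push 3", "push 2", "push 1"], ["add esp, 12"], "ret"),
    guess_x86_convention(["push 2", "push 1"], [], "ret 8"),
    guess_x86_convention(["mov ecx, 1", "mov edx, 2", "push 3"], [], "ret 4"),
]
```

The three calls mirror the cdecl, stdcall, and fastcall listings from sections 3 to 5. Anything the rules cannot decide is reported as "unknown" rather than guessed.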
9. Calling Conventions as an Attack Surface
Each convention places the return address at a known offset from a local buffer. That offset is the difference between a working and a failing overflow.
In 64-bit binaries, overflowing a buffer controls stack contents, not registers directly — which is exactly why return-oriented programming is needed. To call a libc function on x64 Linux, you must first load the argument register: a pop rdi ; ret gadget sets arg 1 before the call. This is a direct consequence of the System V ABI placing arg 1 in RDI.
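The byte layout implied by that paragraph can be sketched with struct. Every number below is a placeholder for a lab binary: the 72-byte offset, the gadget address, the argument pointer, and the function address are all hypothetical, as is the helper name build_sysv_call_chain.

```python
import struct


def build_sysv_call_chain(offset_to_ret, pop_rdi_ret, arg1, function_addr):
    """Lay out padding plus a three-entry chain: gadget, argument, target.

    All addresses are caller-supplied placeholders.
    """
    padding = b"A" * offset_to_ret
    # Little-endian qwords: RET jumps to the gadget, POP RDI consumes
    # arg1, and the gadget's own RET jumps to the function.
    chain = struct.pack("<QQQ", pop_rdi_ret, arg1, function_addr)
    return padding + chain


payload = build_sysv_call_chain(
    offset_to_ret=72,            # hypothetical: 64-byte buffer + saved RBP
    pop_rdi_ret=0x401233,        # hypothetical gadget address
    arg1=0x404060,               # hypothetical pointer for arg 1
    function_addr=0x401050,      # hypothetical function entry
)
```

The chain is three qwords because the ABI asks for exactly one register (RDI) to be set. On real targets an extra ret gadget is sometimes needed so that RSP is 16-byte aligned when the function is entered; that is left out here.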
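On Windows x64, the mandatory 32-byte shadow space is allocated by the caller and sits directly above the callee's return address, so it does not lie between the callee's local buffer and its saved return address. What it changes is everything past the return address: at callee entry the fifth argument is at [rsp+0x28] rather than [rsp+8], and a function reached through a chain may spill RCX/RDX/R8/R9 into the 32 bytes that follow its return address. Misplacing those 32 bytes is a classic source of off-by-32 errors in cross-platform shellcode.
A conceptual offset calculator makes the dependency explicit. It assumes a frame-pointer prologue, a buffer placed directly below the saved frame pointer, and no canary or alignment padding:
def return_addr_offset(buf_size, conv):
    # bytes from start of local buffer to the saved return address
    if conv in ("x86_cdecl", "x86_stdcall"):
        return buf_size + 4  # + saved EBP (4 bytes)
    if conv in ("sysv_amd64", "ms_x64"):
        # + saved RBP (8 bytes); shadow space sits above the return
        # address, so it adds nothing on the Windows side
        return buf_size + 8
    raise ValueError("unknown convention")
Frame-pointer presence (-fomit-frame-pointer removes the saved RBP), stack canaries, and alignment padding all change the answer, which is why the real frame must be confirmed in a debugger before any offset is trusted.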

10. Common Attacker Techniques
| Technique | Description |
|---|---|
| Saved return-address overwrite | Overflow a local buffer to clobber the convention-determined return slot |
| Return-to-libc (x86) | Stack-arranged args (cdecl) let an attacker call system() without shellcode |
| ROP register loading (x64) | Use pop rdi ; ret / pop rcx ; ret gadgets to satisfy the ABI before a call |
| Shadow-space-aware stack pivot | Account for the 32-byte home space when chaining Windows x64 gadgets |
| IAT patching via decoration | Resolve _func@N decorated stdcall imports for shellcode loaders |
| Reflective API calls | Manually set up RCX/RDX/R8/R9 + shadow space before invoking LoadLibraryA |
Reflective loaders and injected shellcode must respect the target ABI exactly — wrong argument registers or a missing shadow allocation crashes the call.
11. Defensive Strategies & Detection
Note: A calling convention is a compile-time/binary property — no Sysmon Event ID fires because a convention is used. Detection is indirect: it triggers on the runtime artifacts of a convention-aware exploit.
Compile-time mitigations motivated directly by convention layout:
- Stack canaries: /GS (MSVC) and -fstack-protector-strong (GCC/Clang) detect a return-address overwrite before RET executes.
- Control Flow Guard: /guard:cf validates indirect CALL targets.
- Intel CET / Shadow Stack: hardware enforces that RET pops the address CALL pushed, directly countering return-address overwrites. Mark binaries as compatible with the MSVC linker option /CETCOMPAT, which sets IMAGE_DLLCHARACTERISTICS_EX_CET_COMPAT (0x0001) in the extended DLL characteristics.
- ASLR + PIE: randomizes addresses so a known layout still yields unknown absolute targets.
- -mno-red-zone: hardens Linux kernel modules against red-zone clobbering, since interrupts arriving on the same stack would otherwise overwrite it.
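Several of these mitigations leave a visible bit in a PE file's DllCharacteristics field, which makes them easy to audit. The sketch below decodes an already-extracted field value; the helper name decode_dll_characteristics is hypothetical, only four flags are covered, and reading the field out of a file is left to a PE parser.

```python
# Flag values from the PE optional header's DllCharacteristics field.
DLL_CHARACTERISTICS = {
    0x0020: "HIGH_ENTROPY_VA",   # 64-bit ASLR
    0x0040: "DYNAMIC_BASE",      # ASLR
    0x0100: "NX_COMPAT",         # DEP
    0x4000: "GUARD_CF",          # Control Flow Guard
}


def decode_dll_characteristics(value):
    """Return the names of the covered mitigation flags set in value."""
    return [
        name
        for bit, name in sorted(DLL_CHARACTERISTICS.items())
        if value & bit
    ]


fully_hardened = decode_dll_characteristics(0x4160)
aslr_dep_only = decode_dll_characteristics(0x0140)
```

A value of 0x4160 has all four flags set, while 0x0140 shows ASLR and DEP without CFG, which is a quick signal that a binary was linked without /guard:cf.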
Runtime telemetry for the exploitation aftermath:
- Sysmon Event ID 1 (Process Create): anomalous children of network-facing services after a successful ROP/return-to-libc chain.
- Sysmon Event ID 10 (Process Access): cross-process handle opens with write or VM-operation rights, the precursor to VirtualAllocEx/WriteProcessMemory from convention-correct injected shellcode.
- Sysmon Event ID 7 (Image Load): unexpected DLL loads from a corrupted return address redirecting into LoadLibrary.
- Microsoft-Windows-Threat-Intelligence ETW: kernel telemetry on NtAllocateVirtualMemory/NtWriteVirtualMemory.
- Audit Process Creation (Event 4688) with command-line logging.
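The Event ID 1 idea from the list above can be sketched as a check over already-parsed records. The dict shape (EventID, ParentImage, Image keys) is an assumed representation of a Sysmon event, the function name is hypothetical, and the process names are illustrative examples of network-facing services and shells.

```python
SERVICE_PARENTS = ("\\w3wp.exe", "\\sqlservr.exe")
SHELL_CHILDREN = ("\\cmd.exe", "\\powershell.exe")


def is_suspicious_process_create(event):
    """Flag an Event ID 1 record where a service spawns a shell.

    event is a dict with 'EventID', 'ParentImage' and 'Image' keys
    (an assumed shape for a parsed Sysmon record).
    """
    if event.get("EventID") != 1:
        return False
    parent = event.get("ParentImage", "").lower()
    child = event.get("Image", "").lower()
    # str.endswith accepts a tuple and matches any of its members.
    return parent.endswith(SERVICE_PARENTS) and child.endswith(SHELL_CHILDREN)


hit = is_suspicious_process_create({
    "EventID": 1,
    "ParentImage": "C:\\Windows\\System32\\inetsrv\\w3wp.exe",
    "Image": "C:\\Windows\\System32\\cmd.exe",
})
```

Matching on the path suffix rather than the full path keeps the check independent of the install location, at the cost of being fooled by a renamed binary.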
title: Suspicious Child Process from Network-Facing Service After Exploitation
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 1
    ParentImage|endswith:
      - '\w3wp.exe'
      - '\sqlservr.exe'
    Image|endswith:
      - '\cmd.exe'
      - '\powershell.exe'
  condition: selection
level: high
12. Tools for Calling-Convention Analysis
| Tool | Description | Link |
|---|---|---|
| IDA Pro / Ghidra | Decompiler ABI inference and stack-frame reconstruction | ghidra-sre.org |
| x64dbg | Live register/stack inspection on Windows | x64dbg.com |
| GDB + pwndbg | Stack and register view on Linux (x/16gx $rsp) | gnu.org |
| WinDbg | Inspect shadow space and frame layout (dd rsp) | microsoft.com |
| Godbolt Compiler Explorer | Compare emitted asm across conventions/compilers | godbolt.org |
| ROPgadget / Ropper | Enumerate pop rdi ; ret-style register-loading gadgets | github.com |
| NASM | Hand-assemble convention test cases | nasm.us |
| Radare2 | Cross-platform disassembly and ABI heuristics | rada.re |
13. MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Exploitation for Client Execution | T1203 | Crash telemetry, Event 4688 child-process anomalies |
| Exploit Public-Facing Application | T1190 | WAF/IDS, anomalous service children (Event ID 1) |
| Process Injection | T1055 | Sysmon Event ID 10 (VirtualAllocEx/WriteProcessMemory) |
| Process Injection: DLL Injection | T1055.001 | Event ID 7 unexpected LoadLibraryA loads |
| Command and Scripting Interpreter | T1059 | Event ID 1 cmd.exe/powershell.exe spawns |
| Reflective Code Loading | T1620 | ETW Threat-Intelligence memory-write telemetry |
ATT&CK has no technique ID for “calling-convention abuse” — convention knowledge is prerequisite craft underlying these exploitation and injection techniques.
Summary
- Calling conventions are the binary-level contract that makes stack layout deterministic — and therefore exploitable.
- x86 splits into cdecl (caller cleanup, variadics, _foo), stdcall (callee RET N, _foo@N), and fastcall (ECX/EDX on MSVC vs. Borland’s EAX/EDX/ECX).
- The two 64-bit ABIs differ in argument registers (RCX, RDX, R8, R9 vs. RDI, RSI, RDX, RCX, R8, R9), shadow space (Windows only) vs. red zone (System V only), and callee-saved sets.
- Convention dictates the buffer-to-return-address offset and the ROP register-loading gadgets required: pop rdi ; ret on Linux, shadow-space accounting on Windows.
- Detect the exploitation artifacts, not the convention: Sysmon Event IDs 1/7/10, ETW Threat-Intelligence telemetry, and Event 4688, hardened with canaries, CFG, and CET shadow stacks.