APCs: Asynchronous Procedure Calls and Thread Hijacking Surface

Objective: Understand the Windows Asynchronous Procedure Call mechanism from the kernel up — the KAPC / KAPC_STATE structures, the dispatch path through KiInsertQueueApc and KiDeliverApc, the alertable-wait requirement, and the three abuse variants (classic, early-bird, special user APC) used for thread hijacking and process injection — and detect them with Sysmon, ETW-TI, and audit policy.


1. APC Fundamentals — What the OS Actually Uses APCs For

An Asynchronous Procedure Call is a function that executes asynchronously in the context of a specific thread. When the kernel queues an APC, it raises a software interrupt and arranges for the routine to run the next time that thread is dispatched. Every thread has its own APC queue — APCs are inherently thread-targeted, which is exactly why offensive tooling loves them.

The OS itself relies on APCs for normal work:

  • I/O completion: ReadFileEx, WriteFileEx, and SetWaitableTimer deliver their completion callback via a user-mode APC queued back to the issuing thread.
  • File-system filter callbacks: normal kernel APCs are widely used by file systems and minifilters.
  • Wait abortion: queuing a user APC against a thread in an alertable wait satisfies the wait with STATUS_USER_APC.

Understanding APCs means understanding three things in sequence: who can queue them, when they fire, and what the thread looks like at the moment they fire.


2. The Three Flavours of APCs

APCs differ by IRQL and by who is allowed to queue them. The kernel maintains distinct semantics for each.

Type | IRQL | Notes
Special Kernel APC | APC_LEVEL | Runs in kernel mode at IRQL APC_LEVEL; preempts user-mode code and kernel-mode code executing at PASSIVE_LEVEL. Used by the OS for operations such as I/O request completion.
Normal Kernel APC | PASSIVE_LEVEL | Runs in kernel mode at PASSIVE_LEVEL; preempts all user-mode code, including user APCs. Generally used by file systems and file-system filter drivers.
User-mode APC | PASSIVE_LEVEL | Generated by an application. The target thread must be in an alertable state for a user-mode APC to run.

Unlike deferred procedure calls (DPCs), which run in arbitrary thread context, an APC always executes inside a specific thread’s context — that property is what makes APCs both useful for I/O completion and dangerous as an injection primitive.


[Figure: hierarchy diagram of the three APC types (Kernel-Mode, User-Mode, Special User APC) with their queuing APIs and alertable-wait requirements]
The three APC flavours differ by privilege level, delivery trigger, and the Win32/native APIs used to queue them.

3. Kernel Structures: KAPC, KAPC_STATE, KTHREAD

A queued APC is represented in the kernel by a KAPC object. The thread tracks its pending APCs via a KAPC_STATE embedded in KTHREAD.

// Conceptual layout — field names are illustrative; confirm against the
// target Windows build with `dt nt!_KAPC` / `dt nt!_KAPC_STATE` in WinDbg.

typedef struct _KAPC {
    UCHAR              Type;
    UCHAR              SpareByte0;
    UCHAR              Size;
    UCHAR              SpareByte1;
    ULONG              SpareLong0;
    struct _KTHREAD   *Thread;
    LIST_ENTRY         ApcListEntry;
    PKKERNEL_ROUTINE   KernelRoutine;
    PKRUNDOWN_ROUTINE  RundownRoutine;
    PKNORMAL_ROUTINE   NormalRoutine;
    PVOID              NormalContext;
    PVOID              SystemArgument1;
    PVOID              SystemArgument2;
    CCHAR              ApcStateIndex;
    KPROCESSOR_MODE    ApcMode;
    BOOLEAN            Inserted;
} KAPC, *PKAPC;

typedef struct _KAPC_STATE {
    LIST_ENTRY         ApcListHead[2];   // [0] = kernel APCs, [1] = user APCs
    struct _KPROCESS  *Process;
    BOOLEAN            KernelApcInProgress;
    BOOLEAN            KernelApcPending;
    BOOLEAN            UserApcPending;
    // SpecialUserApcPending was added later for RS5+ Special User APCs.
} KAPC_STATE, *PKAPC_STATE;

Key fields the dispatcher and attackers both care about:

  • KAPC.NormalRoutine — the function the thread will eventually execute.
  • KAPC.NormalContext, SystemArgument1, SystemArgument2 — arguments passed to NormalRoutine.
  • KAPC.ApcMode: KernelMode vs UserMode; controls which queue and which delivery path.
  • KAPC_STATE.ApcListHead[2] — two doubly-linked lists; index 0 holds kernel-mode APCs, index 1 holds user-mode APCs.
  • KAPC_STATE.UserApcPending — set to TRUE when a user APC is queued and the thread is in an alertable wait; this is the signal that breaks the wait with STATUS_USER_APC.

4. The Alertable Wait Requirement

A user-mode APC does not fire whenever the kernel wants — it fires only when the target thread is willing to be interrupted. A thread enters an alertable state by calling one of:

  • SleepEx()
  • SignalObjectAndWait()
  • MsgWaitForMultipleObjectsEx()
  • WaitForMultipleObjectsEx()
  • WaitForSingleObjectEx()

with the bAlertable parameter set to TRUE (MsgWaitForMultipleObjectsEx takes the MWMO_ALERTABLE flag in dwFlags instead). Additionally, ReadFileEx, WriteFileEx, and SetWaitableTimer are themselves implemented using APCs as their completion-notification mechanism, so threads driving overlapped I/O routinely sit in alertable waits.

This alertable-state requirement is the single most important property to understand offensively and defensively:

  • Offensively, it dictates target selection. Long-lived service threads in svchost.exe or explorer.exe that pump I/O are reliable targets; threads that never enter an alertable wait will never run a queued user APC.
  • Defensively, it explains why the classic injection works against some processes and not others — and why attackers eventually moved to Special User APCs to remove the dependency entirely (§9).
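
The alertable-wait rule can be made concrete with a toy model. The sketch below is not Windows code: ToyThread and its methods are hypothetical names used only to mirror the behaviour described above, where queuing records the routine and only an alertable wait runs it. The return value 0xC0 is the Win32 WAIT_IO_COMPLETION constant.

```python
from collections import deque

WAIT_IO_COMPLETION = 0xC0  # Win32 value returned when a wait ends because APCs ran


class ToyThread:
    """Toy model of one thread's user-APC queue (illustrative only)."""

    def __init__(self):
        self.user_apcs = deque()  # pending (routine, argument) pairs, FIFO

    def queue_user_apc(self, routine, argument):
        # Queuing never runs the routine; it only records it.
        self.user_apcs.append((routine, argument))

    def sleep_ex(self, alertable):
        # A non-alertable wait leaves queued user APCs untouched.
        if not alertable or not self.user_apcs:
            return 0
        # An alertable wait drains the queue in FIFO order, then returns.
        while self.user_apcs:
            routine, argument = self.user_apcs.popleft()
            routine(argument)
        return WAIT_IO_COMPLETION


log = []
t = ToyThread()
t.queue_user_apc(log.append, "apc-1")
t.queue_user_apc(log.append, "apc-2")

first = t.sleep_ex(alertable=False)   # nothing runs
ran_after_plain_wait = list(log)
second = t.sleep_ex(alertable=True)   # both APCs run, in order
print(first, ran_after_plain_wait, hex(second), log)
```

The model shows why target selection matters: a thread that only ever performs the first kind of wait keeps its queued APCs pending indefinitely.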

5. Win32 → Native → Kernel Call Chain

Queuing a user APC traverses three layers.

API / Symbol | Layer | Description
QueueUserAPC | Win32 (kernel32.dll) | Queues a user-mode APC to a target thread.
NtQueueApcThread | NT native (ntdll.dll) | Syscall used internally by QueueUserAPC to deliver the APC.
NtQueueApcThreadEx | NT native | Extended form; RS5 introduced Special User APCs queued by passing 1 as the reserve handle.
NtQueueApcThreadEx2 | NT native | Newer variant exposing both UserApcFlags and MemoryReserveHandle.
QueueUserAPC2 | kernelbase.dll | Wrapper that exposes Special User APCs to user code.
KeInsertQueueApc | Kernel | Attaches the initialized KAPC to the target thread’s queue.
KiDeliverApc | Kernel | Dispatches pending APCs at the kernel→user transition.
ntdll!RtlDispatchAPC | ntdll | Trampoline in user mode that calls the caller-supplied APCProc.

An important internal detail: when you call QueueUserAPC(pfn, hThread, dwData), the routine actually handed to NtQueueApcThread is not your pfn. It is ntdll!RtlDispatchAPC, and your pfn is passed along as a parameter. This is why call-stack-aware EDRs frequently see RtlDispatchAPC as the immediate caller of the suspicious user-mode routine.

The dispatch sequence for a user-mode APC:

  1. Caller obtains a thread handle with THREAD_SET_CONTEXT access.
  2. QueueUserAPC → NtQueueApcThread → kernel enters KiInsertQueueApc.
  3. KiInsertQueueApc checks whether the target is in an alertable wait with WaitMode == UserMode. If yes, it sets UserApcPending = TRUE and completes the wait with STATUS_USER_APC.
  4. On the kernel→user transition, KiDeliverApc redirects execution to ntdll!RtlDispatchAPC, which invokes the original APCProc.
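
The trampoline relationship in the last step can be sketched in plain Python. Every function below is a hypothetical stand-in (rtl_dispatch_apc for ntdll!RtlDispatchAPC, deliver for the kernel delivery step); the only point is to show that the queued routine is the trampoline and that the caller-supplied routine sees the trampoline as its immediate caller.

```python
import inspect


def rtl_dispatch_apc(user_routine, argument):
    # Stand-in for ntdll!RtlDispatchAPC: it is the routine that gets queued,
    # and it simply forwards to the routine the caller originally supplied.
    return user_routine(argument)


def queue_user_apc(queue, user_routine, argument):
    # What is queued is the trampoline; the caller's routine rides along
    # as a parameter.
    queue.append((rtl_dispatch_apc, user_routine, argument))


def deliver(queue):
    # Stand-in for delivery at the kernel-to-user transition.
    results = []
    while queue:
        trampoline, user_routine, argument = queue.pop(0)
        results.append(trampoline(user_routine, argument))
    return results


def apc_proc(argument):
    # Report the immediate caller, the way a call-stack-aware sensor would.
    caller = inspect.stack()[1].function
    return (argument, caller)


pending = []
queue_user_apc(pending, apc_proc, 0x1234)
queued_routine = pending[0][0].__name__
delivered = deliver(pending)
print(queued_routine, delivered)
```

In this model both the queue entry and the observed call stack name the trampoline rather than apc_proc, which matches what stack-walking telemetry reports for real user APCs.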

[Figure: flow diagram of the APC dispatch chain from QueueUserAPC through NtQueueApcThread, KiInsertQueueApc, KiDeliverApc, and RtlDispatchAPC to the final APCProc callback]
Every layer of the APC dispatch chain is observable; EDRs see RtlDispatchAPC as the immediate caller of the injected routine.

6. Inspecting APC State in WinDbg

Read-only kernel introspection lets defenders and learners watch the structures the dispatcher mutates.

0: kd> !process 0 0 lsass.exe
0: kd> .process /r /p <EPROCESS>
0: kd> !thread <ETHREAD>

0: kd> dt nt!_KTHREAD <addr> ApcState
0: kd> dt nt!_KAPC_STATE <addr+offset>
   +0x000 ApcListHead       : [2] _LIST_ENTRY
   +0x020 Process           : Ptr64 _KPROCESS
   +0x028 KernelApcInProgress : UChar
   +0x029 KernelApcPending  : UChar
   +0x02a UserApcPending    : UChar

0: kd> !list "-t nt!_KAPC.ApcListEntry.Flink -e -x \"dt nt!_KAPC @$extret\" <ApcListHead[1]>"

Walking ApcListHead[1] for any thread reveals every pending user APC — its NormalRoutine, NormalContext, and ApcMode. On a healthy thread you typically see nothing; finding NormalRoutine pointing into a private RX region inside a system process is a classic incident-response artifact.
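
As a cross-check on the offsets in the dt output above, the simplified KAPC_STATE from §3 can be mirrored with Python's ctypes. This is a sketch that assumes 8-byte pointers and the one-byte flag fields used in this tutorial; real builds may pack these flags differently, so WinDbg remains the authority.

```python
import ctypes


class LIST_ENTRY(ctypes.Structure):
    # Two 64-bit pointers, modelled as integers so the layout is the same
    # on any host platform.
    _fields_ = [("Flink", ctypes.c_uint64), ("Blink", ctypes.c_uint64)]


class KAPC_STATE(ctypes.Structure):
    # Simplified mirror of the conceptual layout shown in section 3.
    _fields_ = [
        ("ApcListHead", LIST_ENTRY * 2),        # [0] kernel, [1] user
        ("Process", ctypes.c_uint64),
        ("KernelApcInProgress", ctypes.c_uint8),
        ("KernelApcPending", ctypes.c_uint8),
        ("UserApcPending", ctypes.c_uint8),
    ]


offsets = {name: getattr(KAPC_STATE, name).offset for name, _ in KAPC_STATE._fields_}
for name, off in offsets.items():
    print(f"+0x{off:03x} {name}")
```

The computed offsets line up with the listing above: the two list heads occupy the first 0x20 bytes, so Process lands at +0x020 and the three flag bytes follow at +0x028 through +0x02a.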


7. Classic APC Injection

The textbook variant. Every API call below is observable; the technique relies entirely on existing, documented APIs.

// Educational illustration of the API call chain only.
// No payload is included; `payload` is a placeholder used by defenders to
// recognize the pattern. Authorized testing only.

#include <windows.h>
#include <tlhelp32.h>

BOOL InjectViaAPC(DWORD pid, DWORD tid, const BYTE *payload, SIZE_T cb) {
    HANDLE hProc = OpenProcess(
        PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_QUERY_INFORMATION,
        FALSE, pid);
    if (!hProc) return FALSE;

    HANDLE hThread = OpenThread(THREAD_SET_CONTEXT, FALSE, tid);
    if (!hThread) { CloseHandle(hProc); return FALSE; }

    LPVOID remote = VirtualAllocEx(hProc, NULL, cb,
                                   MEM_COMMIT | MEM_RESERVE,
                                   PAGE_EXECUTE_READWRITE);
    WriteProcessMemory(hProc, remote, payload, cb, NULL);

    // QueueUserAPC schedules execution; it fires only when the target
    // thread enters an alertable wait.
    QueueUserAPC((PAPCFUNC)remote, hThread, 0);

    CloseHandle(hThread);
    CloseHandle(hProc);
    return TRUE;
}

Trigger conditions:

  • The target thread (tid) must enter an alertable wait. In long-lived service hosts this happens routinely.
  • The handle to the thread must carry THREAD_SET_CONTEXT. This is the most reliable single indicator: Sysmon EID 10 with a GrantedAccess mask covering THREAD_SET_CONTEXT against a high-value target image is the canonical detection (§12).

Notably, no new thread is created in the victim process: CreateRemoteThread is never called. This is exactly why APC injection evades Sysmon EID 8.


8. Early-Bird APC Injection

Classic injection has one weakness: you cannot predict when the victim thread will next become alertable. Early-bird removes the guesswork by injecting into a process you create yourself in a suspended state, then queuing the APC against the main thread before it has executed a single instruction.

// Educational pseudocode — illustrates API sequence, not payload.

STARTUPINFOA si = { sizeof(si) };
PROCESS_INFORMATION pi = { 0 };

CreateProcessA(NULL, "C:\\Windows\\System32\\notepad.exe", NULL, NULL,
               FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi);

LPVOID remote = VirtualAllocEx(pi.hProcess, NULL, cb,
                               MEM_COMMIT | MEM_RESERVE,
                               PAGE_EXECUTE_READWRITE);
WriteProcessMemory(pi.hProcess, remote, payload, cb, NULL);

QueueUserAPC((PAPCFUNC)remote, pi.hThread, 0);

// Thread services its APC queue as part of initialization, *before*
// running the original entry point.
ResumeThread(pi.hThread);

Why it works: a newly created thread begins user-mode execution in ntdll!LdrInitializeThunk, and that initialization path checks the thread's APC queue before control reaches the image entry point (public analyses of the technique attribute this to a call to NtTestAlert during loader initialization). Any user APC queued before ResumeThread is therefore delivered during that early window, before the legitimate entry point runs.

This variant straddles two ATT&CK sub-techniques: it is APC injection (T1055.004) but it also resembles Thread Execution Hijacking (T1055.003) because the suspended-thread-then-redirect pattern is structurally the same primitive.


[Figure: flow diagram of the Early-Bird APC injection sequence: CreateProcess in suspended state, memory staging, APC queuing, ResumeThread, and payload execution before the legitimate entry point]
Early-Bird queues the APC before the main thread has executed a single instruction, exploiting the alertable waits inside LdrInitializeThunk.

9. Special User APCs (RS5+): Bypassing the Alertable Requirement

Starting with Windows 10 RS5, the kernel introduced Special User APCs. The key behavioural change: these APCs are delivered with Mode == KernelMode to force a thread signal. The thread is interrupted mid-execution to run the special APC — the alertable-state requirement is gone.

They are queued via NtQueueApcThreadEx (passing 1 as the reserve handle) or through NtQueueApcThreadEx2, which exposes a flags field. kernelbase!QueueUserAPC2 is the documented Win32 wrapper.

// Conceptual signatures — confirm flag values and syscall semantics
// against the target SDK / Windows build before relying on them.

typedef NTSTATUS (NTAPI *pNtQueueApcThreadEx2)(
    HANDLE         ThreadHandle,
    HANDLE         UserApcReserveHandle,   // optional reserve object
    ULONG          ApcFlags,               // e.g. QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC
    PVOID          ApcRoutine,
    PVOID          SystemArgument1,
    PVOID          SystemArgument2,
    PVOID          SystemArgument3);

// Pseudocode dispatch — `Special User APC` interrupts a running thread
// without requiring it to be in SleepEx / WaitForSingleObjectEx.
pNtQueueApcThreadEx2 fn = (pNtQueueApcThreadEx2)
    GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "NtQueueApcThreadEx2");

fn(hThread,
   NULL,
   QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC,   // forces in-execution delivery
   remote_routine,
   NULL, NULL, NULL);

Internally the kernel sets SpecialUserApcPending (added to KAPC_STATE for this purpose) and arranges delivery at the next return-to-user-mode opportunity regardless of wait state. This is a meaningful escalation of the primitive — it converts APC injection from “wait until the thread cooperates” to “interrupt the thread now.”


10. Real-World Threat Actor Usage

APC injection is documented at the technique level rather than the family level here; defenders should treat it as a primitive that recurs across many tradecraft variants:

  • DOUBLEPULSAR used kernel-mode APC injection to redirect user-mode threads from a kernel implant.
  • Multiple commodity and APT families catalogued under MITRE T1055.004 employ classic user-APC injection against svchost.exe, explorer.exe, and other long-running hosts.
  • The AtomBombing family of injection variants combines GlobalAddAtom/NtQueueApcThread to stage code through atom tables, then dispatch via APC.
  • Recent research (Check Point’s Thread Name-Calling) chains thread-name primitives with APC dispatch to evade EDR userland hooks.

11. Common Attacker Techniques

Technique | Description
Classic APC Injection | OpenProcess → OpenThread(THREAD_SET_CONTEXT) → VirtualAllocEx → WriteProcessMemory → QueueUserAPC. Fires when the target thread next enters an alertable wait.
Early-Bird APC | CreateProcess(CREATE_SUSPENDED) → write payload → QueueUserAPC → ResumeThread. APC fires during loader init, before the entry point.
Special User APC | NtQueueApcThreadEx / NtQueueApcThreadEx2 with QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC; interrupts the thread mid-execution, no alertable wait required.
Kernel APC injection from a driver | Malicious driver calls KeInsertQueueApc directly against a user thread (DOUBLEPULSAR class). Mitigated by HVCI / driver signing.
Atom-table staged APC (AtomBombing) | Payload bytes shuttled into target via atom tables, then dispatched with NtQueueApcThread. Evades naive memory-write detections.
Self-APC for unhooking / staging | Queue an APC to the current thread + SleepEx(0, TRUE) to execute code outside hooked call paths.

12. Defensive Strategies & Detection

APC injection is deliberately quiet — it does not create a remote thread and so does not emit Sysmon EID 8. Detection therefore pivots on the handle-acquisition and memory-staging stages, plus dedicated ETW.

12.1 Sysmon

Event ID | Name | Why It Matters Here
EID 10 | ProcessAccess | Captures the OpenProcess step. GrantedAccess masks covering PROCESS_VM_WRITE (0x0020) against high-value images are the strongest signal. THREAD_SET_CONTEXT is 0x0010 (0x0018 together with THREAD_GET_CONTEXT) and is a thread-handle right, so confirm that your telemetry source actually reports thread-handle opens before relying on it.
EID 8 | CreateRemoteThread | Will not fire for pure APC injection, but does fire for hybrid variants and is useful as a negative signal.
EID 1 | ProcessCreate | Surfaces the parent/child pairs typical of Early-Bird. Creation flags such as CREATE_SUSPENDED are not part of the event, so combine with short process lifetimes.
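
Access masks are bit fields, and the same hex value means different things on a thread handle and on a process handle. The sketch below decodes a GrantedAccess string against both sets of rights; the constants are the standard Win32 values (only a subset is listed), while decode_mask is a hypothetical helper name.

```python
# Subset of the standard Win32 access-right constants.
THREAD_RIGHTS = {
    0x0001: "THREAD_TERMINATE",
    0x0002: "THREAD_SUSPEND_RESUME",
    0x0008: "THREAD_GET_CONTEXT",
    0x0010: "THREAD_SET_CONTEXT",
    0x0020: "THREAD_SET_INFORMATION",
    0x0040: "THREAD_QUERY_INFORMATION",
}
PROCESS_RIGHTS = {
    0x0001: "PROCESS_TERMINATE",
    0x0002: "PROCESS_CREATE_THREAD",
    0x0008: "PROCESS_VM_OPERATION",
    0x0010: "PROCESS_VM_READ",
    0x0020: "PROCESS_VM_WRITE",
    0x0040: "PROCESS_DUP_HANDLE",
    0x0400: "PROCESS_QUERY_INFORMATION",
}


def decode_mask(granted_access, rights):
    """Return the names of all rights in `rights` whose bit is set in the mask."""
    mask = int(granted_access, 16)  # accepts strings such as "0x0018"
    return [name for bit, name in sorted(rights.items()) if mask & bit]


as_thread = decode_mask("0x0018", THREAD_RIGHTS)
as_process = decode_mask("0x0018", PROCESS_RIGHTS)
vm_write = decode_mask("0x0020", PROCESS_RIGHTS)
print(as_thread, as_process, vm_write)
```

Decoding 0x0018 both ways makes the ambiguity visible: on a thread handle it is THREAD_GET_CONTEXT plus THREAD_SET_CONTEXT, while on a process handle it is PROCESS_VM_OPERATION plus PROCESS_VM_READ. Knowing which object type an event describes is therefore a precondition for interpreting its mask.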

12.2 ETW — Microsoft-Windows-Threat-Intelligence

The Threat Intelligence ETW provider exposes a dedicated APC-injection sensor:

  • THREATINT_QUEUEUSERAPC_REMOTE_KERNEL_CALLER: logged by EtwTiLogInsertQueueUserApc / EtwTiLogQueueApcThread, invoked from inside KeInsertQueueApc. Introduced in Windows 10 version 1809 (RS5).

Consumption requires a signed ELAM driver; the provider is reserved for AntiMalware-protected processes. In practice you receive this telemetry through your EDR vendor’s sensor.

12.3 Audit Policy

  • Enable Object Access → Audit Kernel Object (together with Audit Handle Manipulation) → Security log EIDs 4656 / 4663 on handle requests. Filter for Object Type = Thread with access masks including THREAD_SET_CONTEXT. These events are generated only for objects whose SACL requests auditing, so validate coverage on a test host.
  • Enable Detailed Tracking → Audit Process Creation → EID 4688 with full command-line logging. Pair with CREATE_SUSPENDED heuristics where parent process behaviour permits inference.

12.4 Sigma Detection (Conceptual)

title: Suspicious Cross-Process Handle Acquisition Consistent With APC Injection
id: 00000000-0000-0000-0000-000000000000
status: experimental
logsource:
  product: windows
  service: sysmon
detection:
  selection_target:
    EventID: 10
    TargetImage|endswith:
      - '\lsass.exe'
      - '\svchost.exe'
      - '\explorer.exe'
      - '\winlogon.exe'
  selection_access:
    # GrantedAccess is a single hex string per event, so match whole masks
    # rather than AND-ing substrings. Values are illustrative and assume
    # Sysmon renders the mask without zero padding; tune to your telemetry.
    GrantedAccess:
      - '0x1fffff'  # PROCESS_ALL_ACCESS
      - '0x428'     # VM_OPERATION | VM_WRITE | QUERY_INFORMATION (the §7 example)
      - '0x28'      # VM_OPERATION | VM_WRITE
  condition: selection_target and selection_access
falsepositives:
  - Endpoint security products and legitimate debuggers
level: high

12.5 Behavioural Heuristics

The fingerprint that hunts well: VirtualAllocEx (RWX) → WriteProcessMemory → NtQueueApcThread issued by the same source process within a short window. Even when individual calls are noisy, the ordering is rare in benign software.
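
A minimal sketch of that ordering check over an already-collected event stream follows. The event tuple shape (timestamp, source PID, target PID, API name), the 5-second window, and the function name are all assumptions for illustration; real telemetry would come from an EDR or ETW consumer with its own schema.

```python
from collections import defaultdict

SEQUENCE = ("VirtualAllocEx", "WriteProcessMemory", "NtQueueApcThread")


def find_apc_sequences(events, window_seconds=5.0):
    """Return (source_pid, target_pid, start_ts) for each in-order match."""
    progress = defaultdict(list)  # (src, dst) -> timestamps of matched steps
    hits = []
    for ts, src, dst, api in sorted(events):
        steps = progress[(src, dst)]
        # Drop a stale partial match once it falls outside the window.
        if steps and ts - steps[0] > window_seconds:
            steps.clear()
        # Advance only when the next expected API in the sequence appears.
        if api == SEQUENCE[len(steps)]:
            steps.append(ts)
            if len(steps) == len(SEQUENCE):
                hits.append((src, dst, steps[0]))
                steps.clear()
    return hits


events = [
    (10.0, 4321, 700, "VirtualAllocEx"),
    (10.2, 4321, 700, "WriteProcessMemory"),
    (10.3, 4321, 700, "NtQueueApcThread"),
    (20.0, 5555, 800, "VirtualAllocEx"),
    (40.0, 5555, 800, "WriteProcessMemory"),  # outside the window
    (40.1, 5555, 800, "NtQueueApcThread"),
]
hits = find_apc_sequences(events)
print(hits)
```

Keying the state on the (source, target) pair is the design choice that keeps unrelated processes from completing each other's sequences; a production rule would also need to handle interleaved or repeated steps, which this sketch ignores.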

12.6 PowerShell — Hunt for Suspicious ProcessAccess Masks

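Get-WinEvent -LogName 'Microsoft-Windows-Sysmon/Operational' -FilterXPath '*[System[EventID=10]]' |
  ForEach-Object {
      # Read EventData by field name; positional indexes differ between Sysmon schema versions.
      $evt = $_
      $d = @{}
      foreach ($n in ([xml]$evt.ToXml()).Event.EventData.Data) { $d[$n.Name] = $n.'#text' }
      [pscustomobject]@{
          TimeCreated = $evt.TimeCreated
          Source      = $d['SourceImage']
          Target      = $d['TargetImage']
          Access      = $d['GrantedAccess']
      }
  } |
  Where-Object {
      # 0x20 = PROCESS_VM_WRITE; PROCESS_ALL_ACCESS (0x1fffff) includes it.
      ([Convert]::ToInt64($_.Access, 16) -band 0x20) -ne 0 -and
      $_.Target -match '\\(lsass|svchost|winlogon)\.exe$'
  }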

12.7 Hardening

Mitigation | Description
Protected Process Light (PPL) | LSASS running as PPL (RunAsPPL) blocks OpenThread(THREAD_SET_CONTEXT) from callers that are not themselves protected at a sufficient level.
Credential Guard | Moves LSASS secrets into a VSM-isolated process, so injecting into lsass.exe no longer exposes them.
HVCI / Code Integrity | Prevents unsigned kernel drivers from calling KeInsertQueueApc against arbitrary threads.
ASR rule 9e6c4e1f-7d60-472f-ba1a-a39ef669e4b2 | Blocks credential theft from LSASS; complements but does not directly block APC injection.
Minimize alertable waits in sensitive code | Avoid SleepEx(n, TRUE) and other alertable waits in privileged service threads unless required.
ETW-TI via EDR | Deploy AV/EDR with an ELAM driver to consume Microsoft-Windows-Threat-Intelligence events in real time.

[Figure: graph diagram mapping four detection controls (Sysmon EID 10, ETW-TI, Audit EID 4656, behavioural sequencing) plus hardening measures against the APC injection threat]
Because APC injection skips CreateRemoteThread, detection pivots to handle-acquisition telemetry and dedicated ETW-TI sensors rather than Sysmon EID 8.

13. Tools for APC Analysis

Tool | Description | Link
WinDbg | Walk KTHREAD.ApcState, dump KAPC entries via !list, inspect UserApcPending. | microsoft.com
Process Hacker | Per-thread inspection, including private RX allocations and thread call stacks indicative of injected code. | processhacker.sourceforge.io
Sysmon | EID 10 / 8 / 1 telemetry for the handle-open and process-creation halves of the chain. | sysinternals.com
Sysinternals handle.exe | Enumerate handles a suspect process holds (look for foreign Thread / Process handles). | sysinternals.com
Volatility 3 | Memory forensics: walk thread APC queues post-incident; identify injected RX regions. | volatilityfoundation.org
ETW Explorer / SilkETW | Inspect or subscribe to ETW providers (ETW-TI requires signed ELAM). | github.com
x64dbg | User-mode dynamic analysis of QueueUserAPC / RtlDispatchAPC call chains. | x64dbg.com

14. MITRE ATT&CK Mapping

Technique | MITRE ID | Detection
Process Injection | T1055 | Behavioural sequence: cross-process handle with VM-write rights followed by APC queuing.
Process Injection: Asynchronous Procedure Call | T1055.004 | Sysmon EID 10 with THREAD_SET_CONTEXT; ETW-TI THREATINT_QUEUEUSERAPC_REMOTE_KERNEL_CALLER.
Thread Execution Hijacking | T1055.003 | Early-Bird variant: CREATE_SUSPENDED process + THREAD_SET_CONTEXT handle + early-window APC.

T1055.004 is the primary mapping for this tutorial. The Early-Bird variant (§8) overlaps with T1055.003 because the suspended-thread + redirection structure is the same primitive — defenders should detect both.


Summary

  • APCs are a legitimate kernel facility for thread-targeted asynchronous work, and that property is exactly what makes them a first-class injection primitive.
  • The dispatch chain is QueueUserAPC → NtQueueApcThread → KiInsertQueueApc → KiDeliverApc → ntdll!RtlDispatchAPC → caller routine; every layer is observable.
  • User APCs require an alertable wait; Early-Bird sidesteps this via CREATE_SUSPENDED, and Special User APCs (NtQueueApcThreadEx2 + QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC) eliminate the requirement entirely.
  • APC injection deliberately evades Sysmon EID 8, so detection pivots on EID 10 handle-access masks against high-value targets (PROCESS_VM_WRITE is 0x0020; THREAD_SET_CONTEXT is 0x0010, or 0x0018 combined with THREAD_GET_CONTEXT), plus Microsoft-Windows-Threat-Intelligence ETW (EtwTiLogInsertQueueUserApc).
  • Map to T1055.004 for classic / special-user APC, and additionally to T1055.003 for the Early-Bird suspended-thread variant; harden with PPL, Credential Guard, HVCI, and ETW-TI-consuming EDR.

System Calls and SSDT: How User Mode Reaches the Kernel

Objective: Understand how Windows user-mode code transitions to ring 0 via the SYSCALL instruction, how the System Service Descriptor Table (SSDT) dispatches those calls, and why SSDT hooking, direct syscalls, and modern kernel hardening (PatchGuard, HVCI, MWTI ETW) are central to both offensive tradecraft and defensive telemetry.


1. Why System Calls Exist

User-mode code runs at CPL 3 (ring 3). The kernel runs at CPL 0 (ring 0). Privileged operations — opening another process, mapping physical pages, accessing the file system, talking to drivers — require ring 0. The CPU enforces this with segment descriptors and page-table permissions; a direct CALL into kernel memory from user mode faults immediately.

The bridge is a controlled transition: the user-mode side specifies what it wants by number, the CPU switches to ring 0 at a fixed, kernel-controlled entry point, and the kernel validates and dispatches. That number is the System Service Number (SSN), and the dispatch table is the SSDT.

This design has two consequences that drive everything in this post:

  • The kernel entry point is fixed and well-known, so an attacker who can write to ring 0 memory (a kernel rootkit) can redirect every syscall by patching one table.
  • The user-mode side of the syscall (the stub in ntdll.dll) is not privileged, so an EDR can hook it — and a red teamer can bypass that hook by issuing the SYSCALL instruction themselves.

2. The Mechanics of SYSCALL on x64

SYSCALL is a dedicated x86-64 instruction designed for fast ring-3 → ring-0 transitions. It does not use the legacy interrupt gate (int 2Eh); it reads MSRs and jumps.

MSR | Address | Role
IA32_LSTAR | 0xC0000082 | Kernel RIP to jump to on SYSCALL from 64-bit user mode. Holds KiSystemCall64 (or KiSystemCall64Shadow with KPTI).
IA32_STAR | 0xC0000081 | Encodes the kernel and user CS/SS selectors for SYSCALL/SYSRET.
IA32_FMASK | 0xC0000084 | RFLAGS mask: bits cleared on entry (notably IF, masking interrupts during the prologue).

The x64 Windows syscall ABI:

  • EAX holds the SSN (the index into KiServiceTable).
  • R10 holds the first argument. The user-mode stub copies RCX into R10 because SYSCALL itself clobbers RCX with the return RIP.
  • RDX, R8, R9, then stack — match the standard x64 calling convention for the remaining arguments.

A minimal user-mode stub, exactly as ntdll lays it out:

; NtFooBar — illustrative ntdll-style syscall stub (x64)
NtFooBar:
    mov   r10, rcx          ; SYSCALL clobbers RCX; preserve arg0 in R10
    mov   eax, 0x????       ; SSN — VERSION-SPECIFIC, resolve at runtime
    syscall                 ; ring-3 -> ring-0 via LSTAR
    ret                     ; SYSRET returns here

The 32-bit predecessor was SYSENTER (with the entry point stored in IA32_SYSENTER_EIP). 64-bit Windows does not rely on it: even 32-bit processes reach the kernel through the Wow64 layer, which switches to 64-bit mode and issues SYSCALL (§7).


[Figure: flow diagram of the sequence from user-mode code through the ntdll SYSCALL stub, the MSR-driven CPU transition, the KiSystemCall64 kernel entry point, and SSDT dispatch to the final Nt* function]
A single SYSCALL instruction bridges ring 3 and ring 0, with EAX carrying the SSN that indexes KiServiceTable for dispatch.

3. KiSystemCall64: The Kernel Entry Point

When the CPU executes SYSCALL from user mode:

  1. It loads RIP from IA32_LSTAR (→ KiSystemCall64).
  2. It loads CS/SS from IA32_STAR (kernel selectors).
  3. It saves the old user RIP in RCX and old RFLAGS in R11.
  4. It clears RFLAGS bits per IA32_FMASK.
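
The four register effects above can be modelled in a few lines. This is a toy sketch, not an emulator: the register dictionary, the LSTAR address, and an FMASK that contains only IF are all hypothetical values chosen to make the effects visible, and selector loading from IA32_STAR is left out.

```python
IF_FLAG = 1 << 9  # RFLAGS.IF (interrupt enable)


def execute_syscall(regs, msrs):
    """Toy model of what SYSCALL does to RIP, RCX, R11 and RFLAGS."""
    out = dict(regs)
    out["rcx"] = regs["rip"]                              # return address saved in RCX
    out["r11"] = regs["rflags"]                           # old RFLAGS saved in R11
    out["rip"] = msrs["IA32_LSTAR"]                       # jump to the kernel entry point
    out["rflags"] = regs["rflags"] & ~msrs["IA32_FMASK"]  # clear the masked bits
    return out


# Hypothetical state: `rip` stands for the address of the instruction that
# follows SYSCALL in the user-mode stub.
user_regs = {"rip": 0x7FFE00001014, "rcx": 0x1111, "r11": 0x2222, "rflags": 0x246}
msrs = {"IA32_LSTAR": 0xFFFFF80012345000, "IA32_FMASK": IF_FLAG}

after = execute_syscall(user_regs, msrs)
print({name: hex(value) for name, value in after.items()})
```

The overwrite of RCX is the reason the ntdll stub copies its first argument into R10 beforehand, and the saved RCX/R11 pair is what SYSRET later uses to restore the user-mode RIP and RFLAGS.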

KiSystemCall64 then:

  • Swaps GS via SWAPGS to access the per-CPU KPCR.
  • Switches from the user stack to the kernel stack stored in the KPCR.
  • Builds a KTRAP_FRAME capturing the user context.
  • Indexes KeServiceDescriptorTable (or the Shadow variant for Win32k GUI calls) using EAX.
  • Calls the resolved Nt* function.
  • On return, restores the frame and executes SYSRET to drop back to ring 3.

Selected KTRAP_FRAME fields (see WDK wdm.h for the full layout):

Field | Description
Rip | Saved user-mode instruction pointer (from RCX at entry).
Rsp | Saved user-mode stack pointer.
EFlags | Saved RFLAGS (from R11).
ErrorCode | Processor error code; 0 for syscalls.

With Kernel Page-Table Isolation (KPTI) active, IA32_LSTAR points instead at KiSystemCall64Shadow, a thin trampoline that swaps from the user CR3 (which maps only a minimal kernel trampoline) to the full kernel CR3 before falling through into the normal dispatcher. This is the Meltdown mitigation.


4. The SSDT and KSERVICE_TABLE_DESCRIPTOR

The “SSDT” in casual use refers to two related objects:

Symbol | Description
KeServiceDescriptorTable | KSERVICE_TABLE_DESCRIPTOR covering the core Nt* services in ntoskrnl.exe. Exported by 32-bit kernels only; on x64 it has to be located through symbols or by scanning from KiSystemCall64.
KeServiceDescriptorTableShadow | Not exported. Adds a second entry for win32k!W32pServiceTable, the GUI/USER/GDI syscall surface. Rootkits historically located it by pattern scanning around KeAddSystemServiceTable or via debugger symbols.
KiServiceTable | The actual function-pointer table referenced by the descriptor.
KiArgumentTable | Parallel array of argument byte counts per service.

Approximate layout from public symbols:

typedef struct _KSERVICE_TABLE_DESCRIPTOR {
    PULONG_PTR ServiceTable;   // -> KiServiceTable (encoded offsets on x64)
    PULONG     CounterTable;   // call counters (typically NULL in retail)
    ULONG      TableSize;      // number of services
    PUCHAR     ArgumentTable;  // bytes of stack args per service
} KSERVICE_TABLE_DESCRIPTOR, *PKSERVICE_TABLE_DESCRIPTOR;

The SSN (EAX) is split: the low 12 bits index the table, and bit 12 selects which descriptor — 0 for KeServiceDescriptorTable, 1 for the Win32k shadow table. This is how GUI syscalls (NtUserCreateWindowEx, NtGdiBitBlt, …) coexist with kernel-proper syscalls in the same SSN space.
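
The split described above is two bit operations. In the sketch below, split_ssn is a hypothetical helper name and the sample EAX values are invented for illustration rather than taken from any particular Windows build.

```python
def split_ssn(eax):
    """Split a system service number into (table selector, index)."""
    index = eax & 0xFFF         # low 12 bits: slot within the selected table
    table = (eax >> 12) & 0x1   # bit 12: 0 = core NT table, 1 = Win32k shadow table
    return table, index


core = split_ssn(0x0026)   # hypothetical core-service SSN
gui = split_ssn(0x1005)    # hypothetical Win32k SSN
print(core, gui)
```

Because the selector lives in the number itself, a tool that only looks at the low 12 bits would conflate a GUI service with the core service that happens to share its index.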


[Figure: hierarchy diagram of KeServiceDescriptorTable splitting into the core NT KiServiceTable and the Win32k shadow table, with EAX bit 12 selecting the descriptor and the low 12 bits indexing into it]
EAX bit 12 routes GUI syscalls to the Win32k shadow table while bits 11–0 index the specific service within the selected descriptor.

5. The x64 Encoded-Offset Format

A critical detail anyone writing an SSDT scanner gets wrong the first time: on x64 Windows, KiServiceTable entries are not function pointers. Each entry is a 32-bit value encoding a signed offset from the base of KiServiceTable itself, with the low 4 bits used to communicate the argument-count category to the dispatcher.

The decode is:

// Recover the real Nt* function address from KiServiceTable[i]
ULONG_PTR DecodeSsdtEntry(PULONG ServiceTable, ULONG index)
{
    LONG  encoded = (LONG)ServiceTable[index];     // signed 32-bit
    LONG  offset  = encoded >> 4;                  // arithmetic shift
    return (ULONG_PTR)ServiceTable + offset;       // base + offset
}

The arithmetic right shift matters — it preserves the sign, allowing functions located before KiServiceTable in memory to be addressed. A naive unsigned >> 4 will silently miss those entries and produce a corrupt scanner.
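
The difference between the two shifts is easy to demonstrate offline. The sketch below reimplements the decode in Python, where integers are unbounded and the 32-bit sign has to be restored by hand; the table base and encoded values are hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(value):
    """Reinterpret the low 32 bits of `value` as a signed integer."""
    value &= 0xFFFFFFFF
    return value - 0x1_0000_0000 if value & 0x8000_0000 else value


def decode_entry(table_base, encoded):
    # Arithmetic shift on the signed value keeps negative offsets negative.
    return (table_base + (to_signed32(encoded) >> 4)) & MASK64


def decode_entry_naive(table_base, encoded):
    # Unsigned shift: wrong whenever the target lies below the table base.
    return (table_base + ((encoded & 0xFFFFFFFF) >> 4)) & MASK64


base = 0xFFFFF80010000000         # hypothetical KiServiceTable address
forward = 0x00012345              # offset +0x1234, low nibble 5
backward = 0xFFFFF002             # offset -0x100,  low nibble 2

good = decode_entry(base, backward)
bad = decode_entry_naive(base, backward)
print(hex(decode_entry(base, forward)), hex(good), hex(bad))
```

For the forward entry both decoders agree. For the backward entry the signed decode lands 0x100 bytes below the table, while the naive version points roughly 256 MB above it.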


6. Tracing a Syscall End-to-End: NtOpenProcess

Following an OpenProcess call from a user-mode debugger target:

kernel32!OpenProcess
   └─> kernelbase!OpenProcess
        └─> ntdll!NtOpenProcess         ; the syscall stub
              mov  r10, rcx
              mov  eax, <SSN>           ; version-specific
              syscall
              ret
            ─────────── ring 3 / ring 0 boundary ───────────
            CPU: RIP <- LSTAR (KiSystemCall64[Shadow])
        nt!KiSystemCall64
          ├─ SWAPGS, switch to kernel stack
          ├─ build KTRAP_FRAME
          ├─ idx = EAX & 0xFFF
          ├─ desc = (EAX & 0x1000) ? Shadow : KeServiceDescriptorTable
          ├─ fn  = desc->ServiceTable + (desc->ServiceTable[idx] >> 4)
          └─ call nt!NtOpenProcess
                nt!NtOpenProcess
                  ├─ PsLookupProcessByProcessId (ClientId -> EPROCESS)
                  ├─ SeAccessCheck (DesiredAccess vs token)
                  └─ ObOpenObjectByPointer -> HANDLE
            SYSRET back to user-mode RIP saved in RCX

The SSN for NtOpenProcess changes between Windows builds; never hardcode it. Tooling either resolves it from the on-disk ntdll.dll, parses the in-memory stub, or consults a versioned table such as j00ru’s syscall reference.

A practical SSN extractor parses the Nt* export’s first instructions and reads the MOV EAX, imm32 (B8 xx xx xx xx) byte pattern:

# Parse SSNs from a clean on-disk ntdll.dll (illustrative)
import pefile, struct

pe = pefile.PE(r"C:\Windows\System32\ntdll.dll", fast_load=False)
pe.parse_data_directories()
image = pe.get_memory_mapped_image()

for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
    name = exp.name.decode() if exp.name else ""
    if not name.startswith("Nt"):
        continue
    stub = image[exp.address: exp.address + 24]
    # Classic stub: 4C 8B D1  B8 ss ss 00 00  F6 04 25 ...  0F 05  C3
    if stub[0:3] == b"\x4c\x8b\xd1" and stub[3] == 0xB8:
        ssn = struct.unpack("<I", stub[4:8])[0]
        print(f"{name:40s} SSN=0x{ssn:04x}")

Red-team loaders use the same idea at runtime — sometimes against a fresh copy of ntdll read from disk to defeat in-memory EDR hooks (the “Perun’s Fart” / fresh-copy pattern).
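
The same byte pattern is useful defensively: a stub whose first bytes no longer match the classic prologue has been altered or was never a syscall stub. The sketch below classifies raw stub bytes; classify_stub is a hypothetical helper, the sample bytes are synthetic, and the check assumes the classic stub layout quoted in the extractor above.

```python
CLASSIC_PREFIX = b"\x4c\x8b\xd1\xb8"  # mov r10, rcx ; mov eax, imm32


def classify_stub(stub):
    """Return ('clean', ssn) if the stub starts with the classic prologue,
    otherwise ('modified', None)."""
    if len(stub) >= 8 and stub[:4] == CLASSIC_PREFIX:
        return "clean", int.from_bytes(stub[4:8], "little")
    return "modified", None


# Synthetic samples: an intact stub with a made-up SSN of 0x26, and a stub
# whose first instruction was replaced by a relative jump (opcode E9).
intact = bytes.fromhex("4c8bd1b826000000")
patched = bytes.fromhex("e9aabbccdd000000")

intact_result = classify_stub(intact)
patched_result = classify_stub(patched)
print(intact_result, patched_result)
```

A 'modified' verdict is not proof of malice: EDR hooks produce it by design, and some Nt-prefixed exports are not syscall stubs at all. It is a cue to compare the in-memory bytes against the on-disk image.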


7. Wow64 and Heaven’s Gate

A 32-bit process on 64-bit Windows still ultimately issues a 64-bit SYSCALL, because the only kernel entry the CPU honors from a 64-bit process is KiSystemCall64. The Wow64 layer bridges this:

32-bit app -> wow64cpu!CpupReturnFromSimulatedCode
           -> far jmp 0x33:<addr>          ; CS=0x23 (32-bit) -> CS=0x33 (64-bit)
           -> wow64.dll / 64-bit ntdll
           -> SYSCALL

The 0x33 / 0x23 CS selector switch is the so-called Heaven’s Gate (community label, not an official Microsoft term). Malware abuses it to:

  • Execute 64-bit shellcode from a process that defenders are monitoring as a 32-bit target.
  • Issue syscalls that bypass 32-bit ntdll hooks if the EDR only instruments the Wow64 layer.

Analysts should treat any far jmp, far call, or retf into CS=0x33 that originates outside the Wow64 transition code (wow64cpu.dll) as a strong IOC.
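
A byte-pattern scan is one cheap way to triage 32-bit code for such transitions. The following is a minimal sketch, assuming two well-known encodings: a direct far jump with selector 0x33 (EA, a 4-byte offset, then 33 00) and the push 0x33 / call $+5 / add dword ptr [esp], 5 / retf sequence. The function name is hypothetical, and because this matches raw bytes without disassembly, hits inside data are possible and need manual confirmation.

```python
import re

# jmp far ptr16:32 with selector 0x0033: EA <4-byte offset> 33 00
FAR_JMP_33 = re.compile(rb"\xea.{4}\x33\x00", re.DOTALL)

# push 0x33 ; call $+5 ; add dword ptr [esp], 5 ; retf
PUSH_RETF_33 = bytes.fromhex("6a33e80000000083042405cb")


def find_heavens_gate(code: bytes) -> list:
    """Return sorted (offset, kind) pairs for candidate mode switches."""
    hits = [(m.start(), "far-jmp") for m in FAR_JMP_33.finditer(code)]
    start = 0
    while (pos := code.find(PUSH_RETF_33, start)) != -1:
        hits.append((pos, "push-retf"))
        start = pos + 1
    return sorted(hits)


if __name__ == "__main__":
    sample = (
        b"\x90" * 4
        + b"\xea\x00\x10\x40\x00\x33\x00"  # jmp 0x33:0x00401000
        + b"\x90" * 3
        + PUSH_RETF_33
    )
    for offset, kind in find_heavens_gate(sample):
        print(f"offset {offset:#x}: {kind}")
```

Hits that fall inside the legitimate Wow64 transition code are expected; the interesting ones are in application or private memory.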


8. SSDT Hooking: The Classic Rootkit Technique

On 32-bit Windows, where PatchGuard does not exist, kernel rootkits manipulated KiServiceTable directly:

  1. Locate the descriptor (KeServiceDescriptorTable is exported on 32-bit kernels; the Shadow descriptor was pattern-scanned).
  2. Disable write protection (clear CR0.WP) or remap the page as writable.
  3. Save the original entry for the target SSN (e.g., NtQueryDirectoryFile, NtEnumerateValueKey).
  4. Overwrite the entry with a pointer to attacker code.
  5. The hook calls the original after filtering results — hiding files, registry keys, processes, or network connections.
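
The same steps suggest the classic integrity check: every service pointer should land inside the kernel image. The following is a minimal sketch, assuming a legacy-style table of absolute pointers that has already been read from a memory image; x64 encoded entries would first be decoded as base + (entry >> 4). The function name and all addresses are hypothetical.

```python
def find_hooked_entries(entries, image_base, image_size):
    """Return (ssn, target) for every pointer outside the kernel image range."""
    image_end = image_base + image_size
    return [
        (ssn, target)
        for ssn, target in enumerate(entries)
        if not (image_base <= target < image_end)
    ]


if __name__ == "__main__":
    kernel_base = 0x80400000      # hypothetical kernel image base
    kernel_size = 0x00200000      # hypothetical image size
    table = [
        0x80401000,               # inside the kernel image
        0x8050ABCD,               # inside the kernel image
        0xF8A01234,               # points into some other module: suspicious
    ]
    for ssn, target in find_hooked_entries(table, kernel_base, kernel_size):
        print(f"SSN {ssn:#x} -> {target:#010x} is outside the kernel image")
```

A range check of this kind does not catch inline patches inside the kernel routine itself, so it complements code-section comparison rather than replacing it.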

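An illustrative read-only inspection routine for a test-signed x64 driver (it never writes to the table). KeServiceDescriptorTable is not exported on x64, so the caller supplies the descriptor address, for example one resolved from symbols in a debugger:

#include <ntddk.h>
#include <intrin.h>   // __readmsr

// Not declared in the public WDK headers; layout as commonly documented,
// with field names following this tutorial.
typedef struct _KSERVICE_TABLE_DESCRIPTOR {
    PULONG ServiceTable;    // KiServiceTable (encoded offsets on x64)
    PULONG CounterTable;
    ULONG  TableSize;       // number of services
    PUCHAR ArgumentTable;
} KSERVICE_TABLE_DESCRIPTOR, *PKSERVICE_TABLE_DESCRIPTOR;

VOID DumpSsdtSizeAndSample(PKSERVICE_TABLE_DESCRIPTOR d)
{
    PULONG table = d->ServiceTable;

    DbgPrint("[SSDT] TableSize = %lu\n", d->TableSize);

    for (ULONG i = 0; i < 4 && i < d->TableSize; i++) {
        LONG      enc  = (LONG)table[i];                  // signed 32-bit encoded entry
        ULONG_PTR addr = (ULONG_PTR)table + (enc >> 4);   // arithmetic shift keeps the sign
        DbgPrint("[SSDT] [%lu] encoded=0x%08x -> 0x%p\n", i, enc, (PVOID)addr);
    }
}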

// Reading LSTAR to confirm KiSystemCall64[Shadow]
VOID DumpLstar(VOID)
{
    ULONG64 lstar = __readmsr(0xC0000082);
    DbgPrint("[MSR] IA32_LSTAR = 0x%llx (KiSystemCall64[Shadow])\n", lstar);
}

Live inspection from WinDbg on a kernel-debugged target:

0: kd> dt nt!_KSERVICE_TABLE_DESCRIPTOR nt!KeServiceDescriptorTable
0: kd> dq  nt!KeServiceDescriptorTable L4
0: kd> dd  nt!KiServiceTable L20
0: kd> u   nt!KiServiceTable + (dwo(nt!KiServiceTable) >>> 4) L5
0: kd> rdmsr c0000082

9. PatchGuard (KPP) and Why SSDT Hooking Died

Since the first x64 releases of Windows (Windows XP x64 Edition and Windows Server 2003 SP1 x64), Kernel Patch Protection has periodically validated a set of protected structures, including KiServiceTable, the IDT, the GDT, MSR_LSTAR, kernel image code sections, and several driver objects. On mismatch, KPP issues bugcheck 0x109 (CRITICAL_STRUCTURE_CORRUPTION). The checks run from randomized timers and contexts to resist disablement.

The practical result:

  • SSDT hooking is no longer a viable persistence or hiding primitive on supported 64-bit Windows. Any survival window is short and ends in a BSOD.
  • Modern kernel-mode attackers use driver callbacks (PsSetCreateProcessNotifyRoutine, ObRegisterCallbacks, minifilters) rather than SSDT patching, because those are the supported extension points and are not policed by KPP.
  • With HVCI/Memory Integrity enabled, even loading the malicious driver is gated: kernel pages cannot be both writable and executable, and unsigned kernel code cannot enter ring 0 at all. The hypervisor enforces this at the EPT level — PatchGuard becomes a second line, not the first.

10. Direct and Indirect Syscalls (Modern Red Team TTPs)

Because KPP closed the kernel-side door, evasion moved into user mode. Many EDRs hook the Nt* stubs in ntdll.dll by overwriting the first bytes with a JMP into their inspection DLL. Two techniques bypass that:

  • Direct syscalls. The loader embeds its own mov eax, ssn; syscall; ret stub in attacker memory and calls it instead of ntdll!NtXxx, so the hooked ntdll code is never executed. SSNs are resolved at runtime: “Hell’s Gate” reads them from the in-memory stubs, “Halo’s Gate” infers them from neighboring unhooked stubs when the target stub is patched, and other loaders sort the Zw* exports by address.
  • Indirect syscalls. The mov eax, ssn happens in attacker memory, but the syscall instruction itself is reached by jumping to the syscall byte sequence inside ntdll.dll. The kernel-side return address therefore points back into ntdll, matching what legitimate code looks like in stack-walk telemetry.
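
Address-order resolution is worth understanding on the analysis side too, because it explains why patched stub bytes do not stop SSN recovery. The following is a minimal sketch, assuming that syscall stubs are laid out in ntdll in ascending SSN order (an observed convention, not a documented guarantee) and that Zw* names are used to avoid non-syscall Nt* exports. The function name, export names chosen, and RVAs are synthetic illustration values.

```python
def ssns_by_address_order(exports: dict) -> dict:
    """Map Nt* names to SSNs inferred from the address order of Zw* exports."""
    zw_exports = sorted(
        (rva, name) for name, rva in exports.items() if name.startswith("Zw")
    )
    return {"Nt" + name[2:]: ssn for ssn, (rva, name) in enumerate(zw_exports)}


if __name__ == "__main__":
    # Synthetic export table: name -> RVA (values are made up).
    fake_exports = {
        "ZwClose": 0x1200,
        "ZwOpenProcess": 0x1400,
        "ZwAccessCheck": 0x1000,
        "NtdllDialogWndProc_A": 0x0900,   # not a syscall stub; ignored
        "RtlInitUnicodeString": 0x0500,   # not a syscall stub; ignored
    }
    for name, ssn in ssns_by_address_order(fake_exports).items():
        print(f"{name:20s} SSN={ssn:#06x}")
```

Because the result depends only on export addresses, it is unaffected by hooks that overwrite the stub prologue, which is exactly why stub patching alone is a weak control.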

The detection signal flips between the two:

Technique | What it bypasses | What still sees it
Direct syscall | ntdll user-mode hooks | Stack walk shows syscall from unbacked / private memory.
Indirect syscall | ntdll hooks and naive stack-walk checks | Kernel ETW (Microsoft-Windows-Threat-Intelligence) sees the syscall regardless of where it was issued from.

ETW-TI is the answer to indirect syscalls: its events are logged by the kernel service routines themselves, after the SYSCALL has already landed in KiSystemCall64, so the user-mode evasion is irrelevant.


Figure: Direct versus indirect syscall evasion paths compared against EDR user-mode hooks, Sysmon CallTrace detection, and kernel ETW-TI telemetry. Direct syscalls skip ntdll entirely while indirect syscalls camouflage the return address; ETW-TI catches both because it fires inside the kernel after the ring transition.

11. Common Attacker Techniques

Technique | Description
SSDT hook (legacy) | Overwrite KiServiceTable[SSN] to filter results for hiding rootkit artifacts; killed by PatchGuard on x64.
Shadow SSDT hook | Same against W32pServiceTable to intercept GUI/keyboard/clipboard syscalls.
Direct syscall stub | Embedded mov eax, ssn; syscall in attacker memory to bypass ntdll hooks.
Indirect syscall | Jump to the syscall gadget inside ntdll so call stacks look legitimate.
Hell’s Gate / Halo’s Gate | Runtime SSN resolution by parsing/sorting Nt* exports in mapped ntdll.
Fresh-copy ntdll | Read clean ntdll.dll from disk to re-derive unhooked stubs and SSNs.
Heaven’s Gate | Far jump from 32-bit (CS=0x23) to 64-bit (CS=0x33) to execute 64-bit syscalls from a Wow64 process.
Driver-based hooking | Where HVCI is off, signed-but-vulnerable drivers (“BYOVD”) are used to write to MSRs or protected pages.

12. Defensive Strategies & Detection

The detection model has shifted from “watch the SSDT” (PatchGuard already does that) to “watch how syscalls are issued from user mode, and consume kernel ETW.”

Sysmon

Event ID | Field | Why it matters
1 | ParentImage, CommandLine | Baseline; correlates injection target lineage.
10 | GrantedAccess, CallTrace | The CallTrace field is the primary direct-syscall tell: legitimate stacks contain ntdll.dll; direct syscalls show UNKNOWN(...) or RWX private memory regions.
25 | - | Process image tampering / hollowing.
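
To make the CallTrace signal concrete, the field can be split into frames and checked for unbacked entries. The following is a minimal sketch, assuming the usual Sysmon layout of pipe-separated frames written as module path plus offset, with UNKNOWN(address) for memory that maps to no module. The function names are hypothetical and the sample traces are invented for illustration.

```python
def parse_calltrace(calltrace: str) -> list:
    """Split a Sysmon CallTrace string into frames with a lowercase module name."""
    frames = []
    for raw in calltrace.split("|"):
        raw = raw.strip()
        if not raw:
            continue
        if raw.upper().startswith("UNKNOWN"):
            frames.append({"module": None, "raw": raw})   # unbacked memory
        else:
            path = raw.rsplit("+", 1)[0]                  # drop the +offset suffix
            frames.append({"module": path.rsplit("\\", 1)[-1].lower(), "raw": raw})
    return frames


def assess_calltrace(calltrace: str) -> dict:
    """Summarise which frames are unbacked and whether ntdll is the top frame."""
    frames = parse_calltrace(calltrace)
    return {
        "unbacked_frames": [i for i, f in enumerate(frames) if f["module"] is None],
        "top_is_ntdll": bool(frames) and frames[0]["module"] == "ntdll.dll",
    }


if __name__ == "__main__":
    trace = (
        r"C:\Windows\SYSTEM32\ntdll.dll+9d4c4"
        r"|C:\Windows\System32\KERNELBASE.dll+2c13e"
        r"|UNKNOWN(000001F0A2B30012)"
    )
    print(assess_calltrace(trace))
```

A top frame that is not ntdll.dll is consistent with a direct syscall, while an ntdll.dll top frame followed by an unbacked caller is consistent with an indirect syscall or other code running from private memory; both readings need corroboration, since JIT engines also produce unbacked frames.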

Sigma — direct-syscall NtOpenProcess against LSASS

title: Process Access to LSASS via Direct Syscall (Unbacked Call Stack)
id: 8d0c2a4e-3f5b-4c7a-9b1e-6a2d4f8c0e13
status: experimental
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10
    TargetImage|endswith: '\lsass.exe'
    GrantedAccess:
      - '0x1010'
      - '0x1410'
      - '0x1fffff'
  unbacked:
    CallTrace|contains: 'UNKNOWN'
  filter_legit:
    SourceImage|endswith:
      - '\MsMpEng.exe'
      - '\MsSense.exe'
  condition: selection and unbacked and not filter_legit
level: high
tags:
  - attack.credential_access
  - attack.t1003.001
  - attack.t1106
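
Before deploying a rule like this, it helps to replay its logic against a handful of known-good and known-bad events. The following is a minimal sketch of the rule's condition (selection and unbacked and not filter_legit) as a plain function, assuming events arrive as dictionaries keyed by Sysmon field names and that string matching is case-insensitive, as it is by default in Sigma. The function name and the sample events are hypothetical.

```python
LSASS_ACCESS_MASKS = {"0x1010", "0x1410", "0x1fffff"}
ALLOWED_SOURCES = ("\\msmpeng.exe", "\\mssense.exe")


def matches_rule(event: dict) -> bool:
    """Evaluate: selection and unbacked and not filter_legit."""
    selection = (
        event.get("EventID") == 10
        and event.get("TargetImage", "").lower().endswith("\\lsass.exe")
        and event.get("GrantedAccess", "").lower() in LSASS_ACCESS_MASKS
    )
    unbacked = "unknown" in event.get("CallTrace", "").lower()
    filter_legit = event.get("SourceImage", "").lower().endswith(ALLOWED_SOURCES)
    return selection and unbacked and not filter_legit


if __name__ == "__main__":
    suspicious = {
        "EventID": 10,
        "SourceImage": r"C:\Users\Public\loader.exe",
        "TargetImage": r"C:\Windows\System32\lsass.exe",
        "GrantedAccess": "0x1010",
        "CallTrace": r"C:\Windows\SYSTEM32\ntdll.dll+9d4c4|UNKNOWN(000001F0A2B30012)",
    }
    print(matches_rule(suspicious))
```

Replaying events this way makes it obvious when an allowlist entry or an access mask silently excludes a case the rule was meant to catch.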

ETW Providers Worth Subscribing To

Provider | Use
Microsoft-Windows-Threat-Intelligence | Kernel ETW provider exposing AllocVm, ProtectVm, MapViewOfSection, ReadVm/WriteVm events. Events are logged from inside the kernel service routines, so direct and indirect syscalls are still visible. Consumer must run as PPL.
Microsoft-Windows-Kernel-Process | Process and thread creation, image loads.
Microsoft-Windows-Kernel-Audit-API-Calls | Audits selected Nt API calls (verify against current SDK).

Audit Policy

  • Audit Sensitive Privilege Use — catches SeDebugPrivilege enabling, a near-universal precursor to syscall-based cross-process injection.
  • Audit Process Creation with command-line capture.
  • Audit Handle Manipulation with object SACLs on lsass.exe.

Hardening

  • HVCI / Memory Integrity — single highest-value control. Blocks unsigned and W^X-violating kernel code; defeats BYOVD primitives that try to disable PatchGuard, patch the SSDT, or clear CR0.WP.
  • VBS + Credential Guard — keeps LSASS secrets off the path even if a syscall reaches NtOpenProcess.
  • KVA Shadow (the Windows counterpart of Linux KPTI): Meltdown mitigation; when it is active, KiSystemCall64Shadow is the LSTAR target.
  • Driver Signature Enforcement + Microsoft vulnerable-driver blocklist — limits BYOVD options.
  • EDR ntdll instrumentation — still valuable as a low-cost filter against commodity malware; layer with kernel ETW for the sophisticated cases.

13. Tools for Syscall and SSDT Analysis

Tool | Description | Link
WinDbg | Kernel debugger; resolves nt!KeServiceDescriptorTable, nt!KiServiceTable, reads MSRs via rdmsr. | learn.microsoft.com
Process Hacker | Live handle, thread, and module inspection; surfaces RWX private memory regions. | processhacker.sourceforge.io
Process Monitor | Boot-time and runtime Nt* activity captured via minifilter. | learn.microsoft.com
SysmonView / Sysmon | EID 10 CallTrace, EID 25 telemetry. | learn.microsoft.com
HollowsHunter / pe-sieve | Detects unbacked / hollowed / patched modules; strong correlator for direct-syscall loaders. | github.com/hasherezade
SwishDbgExt | WinDbg extension with SSDT dumping and decode of the encoded-offset format. | github.com
Volatility 3 | Memory forensics; windows.ssdt plugin walks the descriptor and decodes entries. | volatilityfoundation.org
j00ru syscall tables | Authoritative per-version SSN reference. | j00ru.vexillium.org
SilkETW / SealighterTI | User-friendly consumers for ETW providers including Microsoft-Windows-Threat-Intelligence. | github.com

14. MITRE ATT&CK Mapping

Technique | MITRE ID | Detection
Native API | T1106 | EID 10 CallTrace containing UNKNOWN; ETW-TI AllocVm/ProtectVm from unbacked memory.
Process Injection | T1055 | Cross-process NtAllocateVirtualMemory + NtWriteVirtualMemory + NtCreateThreadEx chain via ETW-TI.
DLL Injection | T1055.001 | EID 7/8 plus ETW-TI write/protect events into a remote PID.
PE Injection | T1055.002 | RWX private allocations followed by remote thread creation.
Process Hollowing | T1055.012 | NtUnmapViewOfSection followed by NtWriteVirtualMemory into the primary image base.
Rootkit | T1014 | PatchGuard 0x109 bugchecks; SSDT integrity scans in memory forensics.
Impair Defenses: Disable/Modify Tools | T1562.001 | Driver loads with revoked or vulnerable signatures; HVCI/DSE violations.
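
The T1055 row describes an ordered chain rather than a single event, so detection needs a small amount of state. The following is a minimal sketch of that correlation, assuming events have already been normalised into dictionaries with hypothetical keys ts (seconds), src_pid, dst_pid, and api; real ETW-TI events use their own event names and fields, which would have to be mapped into this shape first.

```python
from collections import defaultdict

CHAIN = ("NtAllocateVirtualMemory", "NtWriteVirtualMemory", "NtCreateThreadEx")


def find_injection_chains(events, window_s=30.0):
    """Return (src_pid, dst_pid, start_ts, end_ts) for each ordered cross-process chain."""
    by_pair = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        # Only cross-process operations using one of the chain APIs are relevant.
        if event["src_pid"] != event["dst_pid"] and event["api"] in CHAIN:
            by_pair[(event["src_pid"], event["dst_pid"])].append(event)

    hits = []
    for (src, dst), pair_events in by_pair.items():
        step, start = 0, None
        for event in pair_events:
            if start is not None and event["ts"] - start > window_s:
                step, start = 0, None            # chain took too long; start over
            if event["api"] == CHAIN[step]:
                if step == 0:
                    start = event["ts"]
                step += 1
                if step == len(CHAIN):
                    hits.append((src, dst, start, event["ts"]))
                    step, start = 0, None
    return hits


if __name__ == "__main__":
    sample = [
        {"ts": 1.0, "src_pid": 100, "dst_pid": 200, "api": "NtAllocateVirtualMemory"},
        {"ts": 2.0, "src_pid": 100, "dst_pid": 200, "api": "NtWriteVirtualMemory"},
        {"ts": 3.0, "src_pid": 100, "dst_pid": 200, "api": "NtCreateThreadEx"},
    ]
    print(find_injection_chains(sample))
```

The time window is a tuning knob: a short window reduces false positives from unrelated activity between the same two processes, at the cost of missing deliberately slow injection.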

Summary

  • Every Windows syscall is a SYSCALL instruction that lands at KiSystemCall64 via MSR_LSTAR and is dispatched through KiServiceTable using the EAX SSN.
  • The SSDT on x64 stores encoded offsets rather than raw pointers (address = base + (entry >> 4)), and bit 12 of EAX selects between the core and Win32k Shadow tables.
  • PatchGuard killed SSDT hooking on x64; modern offense has moved to direct and indirect syscalls in user mode and to BYOVD when ring 0 is required.
  • HVCI/VBS is the strongest defense against the kernel half; kernel ETW (Microsoft-Windows-Threat-Intelligence) is the strongest defense against direct/indirect syscalls because it fires after the transition.
  • Detect with Sysmon EID 10 CallTrace (unbacked memory in the stack), enrich with ETW-TI, and map to MITRE T1106 / T1055 for response.
