Access Tokens and Privileges: The Kernel’s Security Context

Run whoami /priv on an admin shell. You’ll see a column labeled State, and most of the entries — including SeDebugPrivilege and SeImpersonatePrivilege — read Disabled. They aren’t missing. They’re sitting in the token, dormant, waiting for a BOOL flip. That single column is the entire story of most Windows post-exploitation tradecraft in one place: not forging anything, just enabling what was already issued.

Objective: Understand how Windows builds and enforces a per-process security context through the access token, how the Security Reference Monitor uses that token on every object access, and which token operations defenders need to see to catch impersonation, theft, and privilege enablement.


1. Why Tokens Exist

When you authenticate, LSASS (lsass.exe) creates a logon session, derives a primary access token from that session, and hands it to whatever process is being started for you — userinit.exe, then explorer.exe. From that point forward, every kernel object you touch — files, registry keys, named pipes, processes, threads — is evaluated against that token by the Security Reference Monitor (SRM).

The SRM lives in the kernel and does one job: when a thread asks for access to an object, compare the thread’s effective token to the object’s security descriptor and return a yes/no. That comparison happens in SeAccessCheck (kernel) and is surfaced to user mode as AccessCheck. The order matters — Integrity Level check → DACL check → Privilege check.

Without a token, the kernel has no answer to “who is this thread, and what is it allowed to do?” Tokens aren’t a wrapper around credentials. They are the runtime identity.

Flow diagram showing LSASS authentication creating a logon session, deriving a primary token, attaching it to a process, and the Security Reference Monitor performing SeAccessCheck in order: Integrity Level, DACL, Privilege.
From authentication to access decision: the primary token is the runtime identity the SRM consults on every object request.

2. Inside nt!_TOKEN

The kernel object is nt!_TOKEN. It’s undocumented — Microsoft exposes Win32 wrappers, not field layouts — but you can inspect it on your own build:

0: kd> dt nt!_TOKEN

The layout shifts between Windows versions, so never hardcode offsets. The fields that matter conceptually are stable:

Field                Purpose
TokenId              LUID uniquely identifying this token instance
AuthenticationId     LUID of the originating logon session
TokenType            TokenPrimary (1) or TokenImpersonation (2)
ImpersonationLevel   Only meaningful for impersonation tokens
UserAndGroups        Array of SID_AND_ATTRIBUTES: user SID plus group SIDs
Privileges           SEP_TOKEN_PRIVILEGES: three 64-bit privilege bitmasks
IntegrityLevelIndex  Index into UserAndGroups pointing at the mandatory label
LogonSession         Pointer to SEP_LOGON_SESSION_REFERENCES
DefaultDacl          DACL applied to objects this token creates
SessionId            RDP / Terminal Services session ID

The Privileges member is worth dwelling on. SEP_TOKEN_PRIVILEGES carries three 64-bit bitmasks — Present, Enabled, and EnabledByDefault — and that three-state design is the entire reason “privilege escalation” can be a one-API-call affair (covered in §6). This layout is community-observed via WinDbg and ReactOS source; treat it as undocumented and verify on your target build.

Hierarchy diagram of the nt!_TOKEN kernel structure, branching into Identity fields, Type and Impersonation Level, UserAndGroups SID array, SEP_TOKEN_PRIVILEGES with three bitmasks, Integrity Level index, and Logon Session pointer.
The nt!_TOKEN structure: the three-bitmask SEP_TOKEN_PRIVILEGES field (Present, Enabled, EnabledByDefault) is the mechanism behind most privilege-escalation tradecraft.
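
The three-bitmask idea is easier to reason about with a toy model. The sketch below is portable C, not kernel code: priv_masks, priv_enable, priv_remove, and priv_held are hypothetical names, and the assumption that a privilege’s bit index equals its privilege number (10 for SeLoadDriverPrivilege, 20 for SeDebugPrivilege) should be verified against your own build. It only illustrates why enabling is a flag flip and why a privilege that was never Present cannot be switched on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for SEP_TOKEN_PRIVILEGES: one bit per privilege. */
typedef struct {
    uint64_t present;            /* privileges issued to the token         */
    uint64_t enabled;            /* privileges currently active for checks */
    uint64_t enabled_by_default; /* baseline the token started with        */
} priv_masks;

/* Assumption: bit index = privilege number (SeLoadDriver 10, SeDebug 20). */
#define PRIV_LOAD_DRIVER (1ULL << 10)
#define PRIV_DEBUG       (1ULL << 20)

/* Enable whatever is Present; return false if anything requested was not
   (the model's analogue of ERROR_NOT_ALL_ASSIGNED). */
static bool priv_enable(priv_masks *t, uint64_t wanted) {
    assert(t != NULL);
    t->enabled |= wanted & t->present;
    return (wanted & ~t->present) == 0;
}

/* Removal clears all three masks, so a later enable cannot bring it back. */
static void priv_remove(priv_masks *t, uint64_t gone) {
    assert(t != NULL);
    t->present &= ~gone;
    t->enabled &= ~gone;
    t->enabled_by_default &= ~gone;
}

/* A privilege check only ever consults the Enabled mask. */
static bool priv_held(const priv_masks *t, uint64_t needed) {
    assert(t != NULL);
    return (t->enabled & needed) == needed;
}
```

In this model a token created with present = PRIV_DEBUG and enabled = 0 fails priv_held until priv_enable is called, while a request for PRIV_LOAD_DRIVER reports failure and changes nothing.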

3. Primary vs. Impersonation Tokens

Every process has exactly one primary token, assigned when the process is created and, in practice, fixed once it starts running. You don’t swap it on a live process. To run code under a different identity, you start a new process with a different token (CreateProcessAsUser, CreateProcessWithTokenW).

Threads are different. A thread can carry an impersonation token that temporarily overrides the process’s primary token for that thread only. This is how RPC servers, named-pipe servers, and IIS worker threads handle requests on behalf of multiple callers without spawning a process each time. The kernel tracks it per thread in _ETHREAD (the ClientSecurity field on current builds, ImpersonationInfo on older ones; confirm with dt nt!_ETHREAD), and SeAccessCheck prefers the thread token over the process token if one is present.

The distinction matters at detection time too. OpenProcessToken returns the primary token; OpenThreadToken returns the impersonation token, if any. A thread calling OpenThreadToken and getting ERROR_NO_TOKEN is normal — most threads aren’t impersonating. A thread calling it and getting SYSTEM is not.

Graph diagram contrasting a process primary token stored in _EPROCESS with a per-thread impersonation token stored in _ETHREAD, showing the SRM preferring the thread token when present.
The SRM always prefers a thread’s impersonation token over the process primary token, making per-thread identity the key primitive for RPC and pipe servers.

4. Integrity Levels and Mandatory Integrity Control

Mandatory Integrity Control (MIC) added a sideband label to the token and a corresponding mandatory label ACE in object SACLs. Five well-known integrity SIDs cover the practical range:

SID           Level      Typical Use
S-1-16-0      Untrusted  Heavily sandboxed code
S-1-16-4096   Low        Browser renderers, AppContainer
S-1-16-8192   Medium     Default for interactive user processes
S-1-16-12288  High       Elevated (post-UAC) admin processes
S-1-16-16384  System     SYSTEM-account services and kernel components

The label sits in UserAndGroups at index IntegrityLevelIndex, retrievable from user mode via GetTokenInformation(..., TokenIntegrityLevel, ...) into a TOKEN_MANDATORY_LABEL. MIC’s default enforcement rule is simple: a process at a lower integrity level cannot write to or modify a higher-integrity object, even one that belongs to the same user. That rules out DLL injection and token impersonation up the chain, and it is what stops a Medium-IL Word process from injecting into a High-IL elevated PowerShell.

5. Reading a Token from User Mode

The minimum useful query: open the token, ask for the user SID, print it.

#include <windows.h>
#include <sddl.h>     // ConvertSidToStringSidW
#include <stdio.h>

HANDLE hToken = NULL;
if (!OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &hToken)) {
    return GetLastError();
}

// Size probe: this call fails by design and reports the bytes needed in cbUser.
DWORD cbUser = 0;
GetTokenInformation(hToken, TokenUser, NULL, 0, &cbUser);
PTOKEN_USER pUser = (PTOKEN_USER)LocalAlloc(LPTR, cbUser);

if (pUser && GetTokenInformation(hToken, TokenUser, pUser, cbUser, &cbUser)) {
    LPWSTR sidStr = NULL;
    if (ConvertSidToStringSidW(pUser->User.Sid, &sidStr)) {
        wprintf(L"User SID: %ls\n", sidStr);
        LocalFree(sidStr);   // the string is allocated by the API
    }
}

if (pUser) LocalFree(pUser);
CloseHandle(hToken);

The same GetTokenInformation call with TokenGroups returns a TOKEN_GROUPS you can walk to see which groups are SE_GROUP_ENABLED, SE_GROUP_MANDATORY, or SE_GROUP_INTEGRITY (that last flag is how you find the IL label without parsing the index). TokenPrivileges returns a TOKEN_PRIVILEGES and feeds the next section.

For integrity level specifically:

DWORD cb = 0;
GetTokenInformation(hToken, TokenIntegrityLevel, NULL, 0, &cb);   // size probe
PTOKEN_MANDATORY_LABEL pLabel = (PTOKEN_MANDATORY_LABEL)LocalAlloc(LPTR, cb);

if (pLabel && GetTokenInformation(hToken, TokenIntegrityLevel, pLabel, cb, &cb)) {
    // The integrity RID is the last sub-authority of the label SID.
    DWORD rid = *GetSidSubAuthority(
        pLabel->Label.Sid,
        (DWORD)(UCHAR)(*GetSidSubAuthorityCount(pLabel->Label.Sid) - 1));

    // rid == 0x1000 (4096)  -> Low
    // rid == 0x2000 (8192)  -> Medium
    // rid == 0x3000 (12288) -> High
    // rid == 0x4000 (16384) -> System
}
if (pLabel) LocalFree(pLabel);
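
If you are triaging RIDs pulled from logs rather than from a live token, the same mapping can be written as plain C with no Win32 dependency. This is a sketch under assumptions: il_level, il_from_rid, il_name, and mic_allows_write are hypothetical helpers, in-between RIDs are bucketed into the level below them, and mic_allows_write models only the default no-write-up rule described in §4, not the full mandatory policy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { IL_UNTRUSTED, IL_LOW, IL_MEDIUM, IL_HIGH, IL_SYSTEM } il_level;

/* Bucket an integrity RID (last sub-authority of an S-1-16-* SID). */
static il_level il_from_rid(uint32_t rid) {
    if (rid >= 0x4000) return IL_SYSTEM;   /* 16384 and anything above */
    if (rid >= 0x3000) return IL_HIGH;     /* 12288 */
    if (rid >= 0x2000) return IL_MEDIUM;   /* 8192  */
    if (rid >= 0x1000) return IL_LOW;      /* 4096  */
    return IL_UNTRUSTED;
}

static const char *il_name(il_level level) {
    static const char *const names[] = {
        "Untrusted", "Low", "Medium", "High", "System"
    };
    assert(level >= IL_UNTRUSTED && level <= IL_SYSTEM);
    return names[level];
}

/* Default no-write-up rule: the caller must be at or above the object. */
static bool mic_allows_write(il_level caller, il_level object) {
    return caller >= object;
}
```

With these helpers, a Medium caller (0x2000) writing to a High object (0x3000) is refused, while the reverse direction passes the integrity check and moves on to the DACL.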

6. Privileges: Present, Enabled, Removed

Three states describe a privilege inside the token:

  • Present — the privilege exists in the token. Cannot be added at runtime by user mode.
  • Enabled — the privilege is currently active for access checks.
  • Removed — once a privilege is removed via SE_PRIVILEGE_REMOVED, it’s gone for the life of the token.

Leaving removal aside, AdjustTokenPrivileges only moves a privilege between “present and disabled” and “present and enabled.” It cannot grant a privilege the token never had. So when a tool “enables SeDebugPrivilege,” it isn’t gaining authority: that authority was issued at logon and has been waiting in the Present bitmask. The enable is purely a flag flip.

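HANDLE hToken = NULL;
LUID  luid;
TOKEN_PRIVILEGES tp = {0};

if (!OpenProcessToken(GetCurrentProcess(),
                      TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY,
                      &hToken)) {
    return GetLastError();
}

if (!LookupPrivilegeValueW(NULL, SE_DEBUG_NAME, &luid)) {
    DWORD err = GetLastError();   // capture before CloseHandle can overwrite it
    CloseHandle(hToken);
    return err;
}

tp.PrivilegeCount           = 1;
tp.Privileges[0].Luid       = luid;
tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;

// Returns TRUE even if nothing was enabled, so always read GetLastError().
BOOL ok = AdjustTokenPrivileges(hToken, FALSE, &tp, sizeof(tp), NULL, NULL);

if (!ok || GetLastError() == ERROR_NOT_ALL_ASSIGNED) {
    // Call failed, or the privilege wasn't Present in the token -> not enabled.
}
CloseHandle(hToken);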

That ERROR_NOT_ALL_ASSIGNED check is the gotcha most first-timers miss: AdjustTokenPrivileges returns TRUE even when the privilege isn’t in Present. The real outcome is only visible through GetLastError. I’ve burned a solid afternoon staring at a “successful” call that did nothing because the calling process was unelevated and SeDebugPrivilege was never issued in the first place.

The privileges worth keeping at the top of a defender’s list:

Privilege                               Why It Matters
SeDebugPrivilege                        Open any process, including LSASS, for read/write
SeImpersonatePrivilege                  Precondition for the Potato family of escalations
SeAssignPrimaryTokenPrivilege           Replace a process’s primary token
SeTcbPrivilege                          “Act as part of the OS”: essentially unrestricted
SeLoadDriverPrivilege                   Load arbitrary kernel drivers → BYOVD
SeBackupPrivilege / SeRestorePrivilege  Read/write any file regardless of DACL
SeTakeOwnershipPrivilege                Seize ownership of any object
SeCreateTokenPrivilege                  Forge tokens directly; held only by SYSTEM

7. Impersonation in Depth

SECURITY_IMPERSONATION_LEVEL defines how far the impersonating thread can act on behalf of the original principal:

Level                   Meaning
SecurityAnonymous       Server cannot identify or impersonate the client
SecurityIdentification  Server can identify but not act as the client
SecurityImpersonation   Server can act as the client on the local machine
SecurityDelegation      Server can act as the client on local and remote systems

The canonical sequence for a service impersonating a caller:

HANDLE hClient = NULL;

// hSourceToken: a token handle opened with TOKEN_DUPLICATE (obtained elsewhere).
if (!DuplicateTokenEx(hSourceToken,
                      TOKEN_ALL_ACCESS,
                      NULL,
                      SecurityImpersonation,
                      TokenImpersonation,
                      &hClient)) {
    return GetLastError();
}

if (SetThreadToken(NULL, hClient)) {   // current thread now runs as the client
    // ... perform the work that requires the client's identity ...
    RevertToSelf();                    // back to the process's primary token
}
CloseHandle(hClient);

SECURITY_QUALITY_OF_SERVICE controls whether impersonation tracks the source statically or dynamically, and whether only the enabled privileges follow (EffectiveOnly). That last flag is one of the more interesting defensive levers — a service calling impersonation with EffectiveOnly = TRUE strips dormant privileges out of the impersonation context entirely.

8. Duplication, LogonUser, and Process Creation Under a Token

Three primitives cover most of the “run something as someone else” surface:

  • DuplicateTokenEx — clone an existing token, optionally upgrading from impersonation to primary type. Requires TOKEN_DUPLICATE on the source.
  • LogonUser — authenticate a username/password and receive a fresh primary token tied to a new logon session.
  • CreateProcessWithTokenW — start a new process whose primary token is the one you pass in. Requires SeImpersonatePrivilege on the caller.

The MITRE taxonomy splits the abuse cleanly along these primitives:

  • T1134.001 — Token Impersonation/Theft. OpenProcessToken against a higher-privileged process, DuplicateTokenEx, then ImpersonateLoggedOnUser or SetThreadToken. No credentials needed; you steal what’s already running.
  • T1134.002 — Create Process with Token. Same theft, but you go straight to CreateProcessWithTokenW to start a new process under the stolen identity rather than impersonating on a thread.
  • T1134.003 — Make and Impersonate Token. LogonUser with credentials in hand, then SetThreadToken. Quieter than theft because the resulting logon looks legitimate — but it generates a 4624 you can see.

Flow diagram mapping token abuse primitives: OpenProcessToken feeding DuplicateTokenEx which branches to thread impersonation (T1134.001) or CreateProcessWithTokenW (T1134.002), and LogonUser feeding SetThreadToken (T1134.003).
The three MITRE T1134 sub-techniques map directly onto three token API primitives — theft via duplication, new process under stolen token, or fresh token from explicit credentials.

9. _EPROCESS.Token and Kernel-Mode Abuse

The kernel’s view of a process’s primary token is the Token field in _EPROCESS, an EX_FAST_REF — a pointer with reference-count bits packed into the low bits. A kernel exploit with arbitrary write can overwrite that field with a pointer to the SYSTEM process’s token, instantly upgrading the attacker’s process to SYSTEM without touching any user-mode API.

Walking it in WinDbg looks like this:

0: kd> !process 0 0 explorer.exe
PROCESS ffffba0c1a5f6080 ...
0: kd> dt nt!_EPROCESS ffffba0c1a5f6080 Token
   +0x4b8 Token : _EX_FAST_REF
0: kd> dt nt!_TOKEN (poi(ffffba0c1a5f6080+0x4b8) & ~0xf)

The offset will not be 0x4b8 on your build. Use dt to find it on the system you’re analyzing.
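
The & ~0xf in that last command is the whole trick to reading an EX_FAST_REF. A minimal portable sketch of the decode, assuming the x64 convention of four reference-count bits (the count differs on 32-bit builds); fast_ref_decode and FAST_REF_BITS are hypothetical names and the pointer value in the usage note is made up:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: x64 packs the cached reference count into the low 4 bits. */
#define FAST_REF_BITS 0xFULL

/* Split a raw EX_FAST_REF value into the object pointer and the ref bits. */
static void fast_ref_decode(uint64_t raw, uint64_t *object, unsigned *refs) {
    assert(object != NULL && refs != NULL);
    *object = raw & ~FAST_REF_BITS;             /* 16-byte-aligned _TOKEN address */
    *refs   = (unsigned)(raw & FAST_REF_BITS);  /* packed reference count         */
}
```

Feeding it a raw value such as 0xffffa50b12345a7b yields the token address 0xffffa50b12345a70 and a count of 0xb, which is the same arithmetic the dt command performs inline.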

For defenders, the operational takeaway is that kernel-mode token swapping leaves no user-mode footprint — no AdjustTokenPrivileges, no OpenProcessToken, no 4703. The detection has to shift earlier: catch the driver load (SeLoadDriverPrivilege use, signed-driver loader events) or the exploit’s user-mode loader, because by the time the swap happens your audit pipeline is blind to it.


10. Detection and Defense

Token abuse leaves observable traces across the Security log, Sysmon, and ETW. Pick the events that match the primitive you’re hunting.

Windows Security Audit Events

Event ID  Name                                      What It Tells You
4624      Successful logon                          New logon session and primary token; check LogonType
4648      Logon with explicit credentials           runas, CreateProcessWithLogonW, lateral movement
4672      Special privileges assigned to new logon  Sensitive privileges granted at session start
4673      Privileged service called                 Use of sensitive privilege
4688      New process created                       Includes TokenElevationType (1/2/3)
4703      User right adjusted                       AdjustTokenPrivileges calls: the core privilege-enable signal

4672 is high-value: it fires once per privileged logon and lists the sensitive privileges assigned. Filter out the well-known principals (LOCAL SYSTEM, NETWORK SERVICE, LOCAL SERVICE) and expected admins. What’s left is worth a look — that’s where Mimikatz-style pass-the-hash and elevation activity surfaces.

Sysmon

  • EID 1 (Process Create): the IntegrityLevel and User fields directly show the process’s effective token. A child of a Medium-IL process suddenly running at System integrity is a hard signal.
  • EID 10 (ProcessAccess): OpenProcess against LSASS or other high-value targets. Watch GrantedAccess masks like 0x1400 (PROCESS_QUERY_INFORMATION | PROCESS_QUERY_LIMITED_INFORMATION) and 0x40 (PROCESS_DUP_HANDLE).
  • EID 8 (CreateRemoteThread): cross-process injection that frequently follows token theft.
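
GrantedAccess values are much easier to triage once they are expanded into right names. The helper below is an illustrative sketch, not part of Sysmon: describe_access and the PROCESS_RIGHTS table are hypothetical, and the bit values are the process-specific rights from the Windows SDK headers. Standard and generic rights (SYNCHRONIZE, WRITE_DAC, and so on) are deliberately left out and returned as leftover bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct right { uint32_t bit; const char *name; };

/* Process-specific access rights, in ascending bit order. */
static const struct right PROCESS_RIGHTS[] = {
    {0x0001, "PROCESS_TERMINATE"},
    {0x0002, "PROCESS_CREATE_THREAD"},
    {0x0008, "PROCESS_VM_OPERATION"},
    {0x0010, "PROCESS_VM_READ"},
    {0x0020, "PROCESS_VM_WRITE"},
    {0x0040, "PROCESS_DUP_HANDLE"},
    {0x0080, "PROCESS_CREATE_PROCESS"},
    {0x0100, "PROCESS_SET_QUOTA"},
    {0x0200, "PROCESS_SET_INFORMATION"},
    {0x0400, "PROCESS_QUERY_INFORMATION"},
    {0x0800, "PROCESS_SUSPEND_RESUME"},
    {0x1000, "PROCESS_QUERY_LIMITED_INFORMATION"},
};

/* Writes a '|'-separated list of right names into out (truncating if the
   buffer is too small) and returns any bits that were not recognized. */
static uint32_t describe_access(uint32_t mask, char *out, size_t cap) {
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    size_t n = sizeof PROCESS_RIGHTS / sizeof PROCESS_RIGHTS[0];
    for (size_t i = 0; i < n; i++) {
        if (!(mask & PROCESS_RIGHTS[i].bit)) continue;
        size_t used = strlen(out);
        snprintf(out + used, cap - used, "%s%s",
                 used ? "|" : "", PROCESS_RIGHTS[i].name);
        mask &= ~PROCESS_RIGHTS[i].bit;
    }
    return mask;
}
```

Run against 0x1400 it produces PROCESS_QUERY_INFORMATION|PROCESS_QUERY_LIMITED_INFORMATION with no leftover bits, matching the mask called out in the EID 10 bullet.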

Sigma Sketch: Privilege Enable on a Sensitive Right

title: Sensitive Privilege Adjusted via AdjustTokenPrivileges
logsource:
  product: windows
  service: security
detection:
  selection:
    EventID: 4703
    EnabledPrivilegeList|contains:
      - 'SeDebugPrivilege'
      - 'SeImpersonatePrivilege'
      - 'SeTcbPrivilege'
      - 'SeLoadDriverPrivilege'
  filter_known:
    SubjectUserSid:
      - 'S-1-5-18'   # LOCAL SYSTEM
      - 'S-1-5-19'   # LOCAL SERVICE
      - 'S-1-5-20'   # NETWORK SERVICE
  condition: selection and not filter_known
level: high

To produce 4703, the Audit Token Right Adjusted subcategory has to be enabled — it isn’t by default on most builds. Same goes for Audit Sensitive Privilege Use for 4673/4674, and command-line logging in 4688 (Group Policy: System → Audit Process Creation → Include command line).

ETW Providers

Provider                               What It Carries
Microsoft-Windows-Security-Auditing    All audit events above
Microsoft-Windows-Kernel-Process       Process/thread lifecycle including token assignment
Microsoft-Windows-Threat-Intelligence  High-fidelity process-access telemetry; PPL consumer only (Defender/EDR)

Hardening

  • SeCreateTokenPrivilege → SYSTEM only. Nothing else needs it.
  • SeAssignPrimaryTokenPrivilege → local/network service accounts only. Audit anything else holding it.
  • Strip SeImpersonatePrivilege from service accounts that don’t host RPC or named-pipe endpoints. Its presence is the precondition for the Potato family.
  • PPL for critical services: blocks OpenProcess with token-access rights from unprotected callers.
  • Credential Guard: isolates logon-session secrets in VSM.

SIDs and Security Descriptors: Identity in Windows Security

A thread opens a handle to a file. Before a single byte is read, the kernel has already answered a question nobody typed: is the caller’s identity allowed to do this? That answer lives at the intersection of two structures — the SID that names who you are, and the security descriptor that says who gets in. Get the relationship between them wrong and you ship a world-writable service. Understand it, and most “weird permission” incidents stop being mysterious.

Objective: Understand how Windows represents identity with Security Identifiers, how Security Descriptors bind owners, DACLs, and SACLs to every securable object, and how attackers abuse — and defenders detect — manipulation of both.


1. Identity Before Access

Windows authenticates security principals — anything the OS can prove an identity for: users, groups, computers, and service accounts. Authentication is the LSA’s job; the SAM (local) or the domain’s NTDS.dit (Active Directory) stores the account records. But authentication only proves who you are. Authorization — what you may touch — is a separate decision made against a different value: the SID.

A SID is the canonical, machine-readable name for a principal. Display names change. SAM account names get reused. SIDs do not. Once the system mints a SID at account-creation time, that value is never reused to identify another principal, even after the account is deleted. Every authorization check in the OS compares SIDs, never names.


2. Anatomy of a SID

A SID is a variable-length binary structure, defined as SID in winnt.h. Three logical parts: a revision, the issuing authority, and a chain of sub-authorities ending in a Relative Identifier (RID).

Field                Type                      Meaning
Revision             BYTE                      SID structure version (always 1)
SubAuthorityCount    BYTE                      Number of sub-authority values (max 15)
IdentifierAuthority  SID_IDENTIFIER_AUTHORITY  6-byte top-level authority that issued the SID
SubAuthority[]       DWORD[]                   Sub-authority values; the last element is the RID

The string notation everyone recognizes is just those fields, hyphenated. Take S-1-5-21-<d1>-<d2>-<d3>-513:

  • S-1: a revision-1 SID.
  • 5: SECURITY_NT_AUTHORITY, marking it a Windows NT SID.
  • 21: SECURITY_NT_NON_UNIQUE, signaling that a domain identifier follows.
  • <d1>-<d2>-<d3>: three 32-bit values randomly generated to uniquely identify the domain.
  • 513: the RID; here, the well-known RID for Domain Users.
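
To make the mapping from the binary fields to the string concrete, here is a small portable sketch that formats a raw SID buffer without any Win32 calls. It assumes the usual on-disk layout (revision byte, count byte, 6-byte big-endian authority, little-endian 32-bit sub-authorities) and always prints the authority in decimal, which is a simplification. sid_to_string is a hypothetical helper, not a Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Formats a binary SID as "S-R-A-S1-S2-...". Returns 0 on success, -1 if the
   buffer is malformed or the output does not fit. */
static int sid_to_string(const uint8_t *sid, size_t len, char *out, size_t cap) {
    assert(sid != NULL && out != NULL);
    if (len < 8) return -1;                       /* header is 8 bytes */
    unsigned count = sid[1];                      /* SubAuthorityCount */
    if (count > 15 || len < 8 + 4u * count) return -1;

    uint64_t auth = 0;                            /* 6 bytes, big-endian */
    for (int i = 2; i < 8; i++) auth = (auth << 8) | sid[i];

    int n = snprintf(out, cap, "S-%u-%llu",
                     (unsigned)sid[0], (unsigned long long)auth);
    if (n < 0 || (size_t)n >= cap) return -1;
    size_t used = (size_t)n;

    for (unsigned i = 0; i < count; i++) {        /* 4 bytes each, little-endian */
        const uint8_t *p = sid + 8 + 4u * i;
        uint32_t sub = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
        n = snprintf(out + used, cap - used, "-%lu", (unsigned long)sub);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }
    return 0;
}
```

The 16 bytes 01 02 00 00 00 00 00 05 20 00 00 00 20 02 00 00 come out as S-1-5-32-544: revision 1, two sub-authorities, authority 5, then 32 and 544.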

You rarely build SIDs by hand. You parse them. Here’s the field-level walk in C — note that the documented accessors (GetSidSubAuthority, GetSidIdentifierAuthority) return pointers into the structure, which trips up everyone the first time:

#include <windows.h>
#include <sddl.h>
#include <stdio.h>

void PrintSid(PSID pSid) {
    if (!IsValidSid(pSid)) return;

    PSID_IDENTIFIER_AUTHORITY pAuth = GetSidIdentifierAuthority(pSid);
    DWORD subCount = *GetSidSubAuthorityCount(pSid);

    printf("Authority: %lu\n", (DWORD)pAuth->Value[5]); // low byte of the 6-byte big-endian authority (5 = NT)
    for (DWORD i = 0; i < subCount; i++)
        printf("  SubAuthority[%lu] = %lu\n", i, *GetSidSubAuthority(pSid, i));

    LPSTR str = NULL;
    if (ConvertSidToStringSidA(pSid, &str)) {       // -> "S-1-5-..."
        printf("String SID: %s\n", str);
        LocalFree(str);
    }
}

To go the other direction — constructing a known SID — use AllocateAndInitializeSid, which takes an authority plus up to eight sub-authorities. Building the SYSTEM SID (S-1-5-18) and comparing it with EqualSid is the idiomatic way to check “am I running as LocalSystem?”:

SID_IDENTIFIER_AUTHORITY ntAuth = SECURITY_NT_AUTHORITY; // {0,0,0,0,0,5}
PSID pSystem = NULL;

if (AllocateAndInitializeSid(&ntAuth, 1,
        SECURITY_LOCAL_SYSTEM_RID,   // 18
        0, 0, 0, 0, 0, 0, 0, &pSystem)) {
    // EqualSid(tokenSid, pSystem) -> TRUE means LocalSystem
    FreeSid(pSystem);                // never free this with LocalFree
}

3. Well-Known SIDs and Built-in Principals

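Some SIDs are identical on every Windows install. Their display names are not: “Administrators” and “Everyone” are localized, so hard-coding names is a bug waiting to happen across locales and versions. Match on the SID, and use the documented constants (for example the WELL_KNOWN_SID_TYPE values accepted by CreateWellKnownSid) where you can. Memorize the ones below anyway, because you’ll read them in logs daily.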

SID           Principal
S-1-0-0       Null SID (a group with no members)
S-1-1-0       Everyone
S-1-5-18      Local System
S-1-5-19      Local Service
S-1-5-20      Network Service
S-1-5-32-544  Builtin\Administrators
S-1-16-12288  High mandatory integrity level

Built-in accounts also carry well-known RIDs appended to the domain or machine SID: 500 is Administrator, 501 is Guest, 512 is Domain Admins. An attacker enumerating a domain looks for RID 500 and 512 specifically — the display name can be renamed, the RID cannot. Capability SIDs the OS recognizes are cached under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SecurityManager\CapabilityClasses\AllCachedCapabilities.


4. SIDs at Runtime: The Access Token

When a user signs in, LSA builds an access token for the session. That token is the runtime bag of identity: the user’s SID, the SIDs of every group the user belongs to, the privileges granted, and a mandatory integrity level SID (the S-1-16-* family). Every process started in that logon context inherits a copy. When code makes an access check, the kernel compares the SIDs in the token against the SIDs in the object’s DACL.

One detail that becomes an attack surface later: an account can carry extra SIDs in its Active Directory sIDHistory attribute. That attribute exists for legitimate domain migration — copy the old SID into sIDHistory so a migrated user keeps access to resources permissioned to the old account without re-ACLing everything. The catch is that all values in sIDHistory are injected into the access token at logon, exactly as if they were primary group memberships.


Flowchart showing how LSA mints an access token at logon, the token is inherited by processes, and the Security Reference Monitor compares token SIDs against an object DACL to produce a granted access mask
Every handle open flows through SeAccessCheck, which compares the caller’s token SIDs against the target object’s DACL top-to-bottom before returning a granted-access mask.

5. The Security Descriptor: Structure and Fields

Every object the Object Manager creates has a security descriptor. The structure is SECURITY_DESCRIPTOR, reproduced here verbatim from winnt.h:

typedef struct _SECURITY_DESCRIPTOR {
  BYTE                        Revision;
  BYTE                        Sbz1;
  SECURITY_DESCRIPTOR_CONTROL Control;
  PSID                        Owner;
  PSID                        Group;
  PACL                        Sacl;
  PACL                        Dacl;
} SECURITY_DESCRIPTOR, *PISECURITY_DESCRIPTOR;

Field by field: Revision is always 1; Sbz1 is reserved and must be zero; Control is a flag bitmask; Owner and Group point to SIDs; Dacl and Sacl point to access-control lists. The internal layout differs between absolute form (the struct holds pointers to separately allocated SIDs and ACLs) and self-relative form (everything packed into one contiguous blob with offsets, marked by SE_SELF_RELATIVE). Because that format varies, never poke fields directly — drive it through the API.

The Control field qualifies how the rest of the descriptor is interpreted:

Flag                Meaning
SE_DACL_PRESENT     The descriptor has a DACL (the pointer may still be NULL)
SE_SACL_PRESENT     The descriptor has a SACL
SE_DACL_PROTECTED   DACL is shielded from inherited ACEs
SE_SACL_PROTECTED   SACL is shielded from inherited ACEs
SE_OWNER_DEFAULTED  Owner was assigned by a default mechanism
SE_SELF_RELATIVE    Descriptor is in packed, self-relative form

Here is the single most important gotcha in this entire topic, and it has burned production systems repeatedly. There is a difference between no DACL, an empty DACL, and a NULL DACL:

SECURITY_DESCRIPTOR sd;
InitializeSecurityDescriptor(&sd, SECURITY_DESCRIPTOR_REVISION);

// NULL DACL: present == TRUE, pointer == NULL  -> GRANTS EVERYONE FULL ACCESS
SetSecurityDescriptorDacl(&sd, TRUE, NULL, FALSE);

// Empty DACL: present == TRUE, non-NULL ACL with zero ACEs -> DENIES EVERYONE
// (initialize an ACL with InitializeAcl and add no ACEs, then pass it here)

If SE_DACL_PRESENT is not set, or it is set with a NULL DACL pointer, the object allows full access to everyone. Developers reach for SetSecurityDescriptorDacl(&sd, TRUE, NULL, FALSE) thinking “no restrictions, default behavior” and ship a world-writable named pipe or service. An empty DACL — present, non-NULL, zero ACEs — does the opposite and denies everyone. One null pointer is the difference.


Hierarchy diagram of the SECURITY_DESCRIPTOR structure showing Owner SID, Group SID, DACL containing allow and deny ACEs, and SACL containing audit ACEs as child nodes
A security descriptor owns four pointers: two SIDs declaring ownership, a DACL controlling access, and a SACL controlling auditing — each ACE carries its own SID and access mask.

6. DACLs and ACEs: How Access Is Decided

A DACL is an ordered list of Access Control Entries. Each ACE has an ACE_HEADER (AceType, AceFlags, AceSize), an ACCESS_MASK of rights, and a trailing SID the entry applies to.

ACE Type            Used In  Effect
ACCESS_ALLOWED_ACE  DACL     Grants rights in its mask to the SID
ACCESS_DENIED_ACE   DACL     Denies rights in its mask to the SID
SYSTEM_AUDIT_ACE    SACL     Logs access matching its mask

Evaluation order matters: the kernel walks ACEs top to bottom and stops as soon as the requested access is fully granted or any of it is denied. Well-formed (canonical) DACLs place deny ACEs ahead of allow ACEs precisely so a deny is seen first. An ACL has no hard ACE-count limit, but the whole ACL must stay under 64 KB.
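
A toy version of that walk makes the NULL-versus-empty distinction and the ordering rule concrete. This is a deliberately simplified model, not SeAccessCheck: it ignores owner rights, privileges, integrity labels, generic-right mapping, and deny-only groups, and it compares SIDs as strings. The ace and dacl types, access_check, and the SIDs used in the usage note are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { ACE_ALLOW, ACE_DENY } ace_type;
typedef struct { ace_type type; uint32_t mask; const char *sid; } ace;
typedef struct { size_t count; const ace *aces; } dacl;

static bool sid_in_token(const char *sid, const char *const *token, size_t n) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(sid, token[i]) == 0) return true;
    return false;
}

/* d == NULL models a NULL DACL; count == 0 models an empty DACL. */
static bool access_check(const dacl *d, const char *const *token_sids,
                         size_t nsids, uint32_t desired) {
    assert(token_sids != NULL || nsids == 0);
    if (d == NULL) return true;               /* NULL DACL: everything granted */
    uint32_t remaining = desired;             /* rights not yet granted */
    for (size_t i = 0; i < d->count && remaining != 0; i++) {
        const ace *a = &d->aces[i];
        if (!sid_in_token(a->sid, token_sids, nsids)) continue;
        if (a->type == ACE_DENY) {
            if (a->mask & remaining) return false;   /* deny hits a pending right */
        } else {
            remaining &= ~a->mask;                   /* allow clears what it grants */
        }
    }
    return remaining == 0;                    /* empty DACL: nothing ever clears */
}
```

Given a token holding a user SID and a group SID, a DACL of deny-write for the group followed by allow-read-write for the user refuses a write request. Swap the two ACEs and the same request succeeds, because the allow entry satisfies it before the deny entry is reached, which is why canonical ordering puts denies first.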

Reading a real object’s DACL means pulling the descriptor and iterating ACEs by index with GetAce:

#include <windows.h>
#include <aclapi.h>   // GetNamedSecurityInfoW

PSECURITY_DESCRIPTOR pSD = NULL;
PSID  pOwner = NULL;
PACL  pDacl  = NULL;   // points into pSD; do not free it separately

DWORD rc = GetNamedSecurityInfoW(
    L"C:\\Windows\\System32\\config\\SAM", SE_FILE_OBJECT,
    OWNER_SECURITY_INFORMATION | DACL_SECURITY_INFORMATION,
    &pOwner, NULL, &pDacl, NULL, &pSD);

if (rc == ERROR_SUCCESS) {
    // pDacl == NULL here means a NULL DACL (everyone allowed), not an error.
    for (WORD i = 0; pDacl && i < pDacl->AceCount; i++) {
        PACE_HEADER hdr = NULL;
        if (GetAce(pDacl, i, (LPVOID*)&hdr)) {
            // hdr->AceType : ACCESS_ALLOWED_ACE_TYPE / ACCESS_DENIED_ACE_TYPE
            // hdr->AceFlags: CONTAINER_INHERIT_ACE | OBJECT_INHERIT_ACE | ...
        }
    }
    LocalFree(pSD);    // releases the owner SID and DACL along with it
}

7. SACLs: Auditing Through the System ACL

The SACL uses the same ACL container but holds SYSTEM_AUDIT_ACE entries instead. Its access mask doesn’t grant or deny anything — it defines which access attempts generate audit records in the Windows Security Event Log. Reading or writing any object’s SACL requires the SeSecurityPrivilege right, which only Administrators normally hold. That privilege boundary is exactly why SACL tampering is a high-value detection target: the act of stripping audit ACEs is itself privileged.


8. SDDL: Security Descriptors as Text

A binary descriptor is awful to log, diff, or paste into a config file, so Windows defines the Security Descriptor Definition Language — a string form. The grammar is O: owner, G: group, D: DACL, S: SACL, each followed by flags and parenthesized ACEs:

O:BAG:SYD:(A;;FA;;;SY)(A;;FA;;;BA)(A;;0x1200a9;;;BU)S:(AU;SAFA;FA;;;WD)

Take the first ACE, (A;;FA;;;SY). It reads as: Allow, no inherit flags, FA (full file access), to SY (SYSTEM). The semicolon-delimited slots are, in order, ACE type, ACE flags, rights, object GUID, inherited object GUID, and trustee. Round-trip a descriptor with ConvertSecurityDescriptorToStringSecurityDescriptor and ConvertStringSecurityDescriptorToSecurityDescriptor. In practice you’ll read SDDL far more often through PowerShell:

$acl = Get-Acl C:\Windows\System32\config\SAM
$acl.Owner            # owner principal
$acl.Sddl             # full SDDL string
$acl.Access | Format-Table IdentityReference, FileSystemRights, AccessControlType

icacls <path> gives the same data in a terser shorthand; Get-Acl is friendlier when you want the SDDL string itself for a baseline diff.
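
When you diff SDDL baselines in your own tooling, the first step is usually splitting one parenthesized ACE into its six semicolon-delimited slots. The sketch below does only that, in portable C. sddl_ace and split_sddl_ace are hypothetical names, each field is capped at 63 characters as an arbitrary choice, and conditional or resource-attribute ACEs (which carry a seventh slot) are rejected rather than parsed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ACE_FIELDS 6   /* type; flags; rights; object GUID; inherit GUID; trustee */

typedef struct { char field[ACE_FIELDS][64]; } sddl_ace;

/* Splits "(A;;FA;;;SY)" into six NUL-terminated fields, keeping empty ones.
   Returns 0 on success, -1 on malformed input. */
static int split_sddl_ace(const char *s, sddl_ace *out) {
    assert(s != NULL && out != NULL);
    size_t len = strlen(s);
    if (len < 2 || s[0] != '(' || s[len - 1] != ')') return -1;
    memset(out, 0, sizeof *out);

    int f = 0;          /* current field index */
    size_t pos = 0;     /* write position inside the current field */
    for (size_t i = 1; i + 1 < len; i++) {
        if (s[i] == ';') {
            if (++f >= ACE_FIELDS) return -1;      /* too many slots */
            pos = 0;
        } else {
            if (pos + 1 >= sizeof out->field[0]) return -1;  /* field too long */
            out->field[f][pos++] = s[i];
        }
    }
    return f == ACE_FIELDS - 1 ? 0 : -1;           /* need exactly 5 separators */
}
```

strtok is avoided on purpose: it collapses consecutive delimiters, which would silently drop the empty flag and GUID slots that most file ACEs have.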


9. Inheritance and the Kernel Check

Child objects don’t usually carry hand-written ACLs. They inherit them. An ACE’s flags decide propagation: OBJECT_INHERIT_ACE (OI) pushes it onto leaf objects like files, CONTAINER_INHERIT_ACE (CI) onto sub-containers like folders or registry subkeys, and INHERIT_ONLY_ACE (IO) makes an ACE apply only to children and not the object carrying it. SE_DACL_PROTECTED blocks inheritance entirely — that’s what “disable inheritance” does in Explorer.

The decision itself happens in the kernel. Each OBJECT_HEADER carries a SecurityDescriptor field. At handle-creation time the Object Manager hands the token, the requested access, and the descriptor to the Security Reference Monitor (nt!SeAccessCheck), which walks the DACL and returns a granted-access mask. You can see the whole chain live in WinDbg:

kd> !process 0 0 lsass.exe
kd> !object <Object address>
kd> dt nt!_OBJECT_HEADER <header address> SecurityDescriptor
kd> !sd <SecurityDescriptor address & ~0xf>   ; mask low bits, they're flags
kd> !token                                     ; the token the check runs against

Files, registry keys, processes, threads, named pipes, services, jobs — anything named and securable runs through this same path.


10. Common Attacker Techniques

SIDs and SDs aren’t just plumbing — they’re a manipulation target for evasion and escalation. The primitives below all leave traces (covered next), which is the point of teaching them.

Technique                       Description
NULL DACL planting              Set a present-but-NULL DACL on a service, registry key, or pipe to make it world-writable
DACL tampering for persistence  Add an explicit ACCESS_ALLOWED_ACE granting the attacker’s SID FullControl on a sensitive object
Owner abuse                     Taking ownership of an object implicitly grants WRITE_DAC, letting an attacker rewrite the DACL afterward
SID-History injection           Write a privileged SID (e.g. a Domain Admins RID) into a controlled account’s sIDHistory so it lands in the token
SACL stripping                  Remove audit ACEs from lsass.exe, SAM, or ntds.dit to suppress access logging before credential theft
Permission group discovery      Enumerate group SIDs and ACL members to plan lateral movement

A populated sIDHistory on a non-migrated account is the canonical hunting signal for the injection case:

Get-ADUser -Filter * -Properties sIDHistory |
    Where-Object { $_.sIDHistory } |
    Select-Object Name, @{ n='sIDHistory'; e={ $_.sIDHistory -join ', ' } }

In a domain with no active migration, any result here deserves investigation — especially a sIDHistory value ending in RID 512 or 519.
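
If you export those sIDHistory values and triage them outside PowerShell, the RID test is a few lines of C. This is a sketch under assumptions: sid_rid and is_privileged_history_sid are hypothetical helpers, only domain SIDs (the S-1-5-21- prefix) are considered, and the flagged set is limited to the RIDs this article names (500, 512, 519).

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Extracts the trailing RID from a string SID. Returns false if malformed. */
static bool sid_rid(const char *sid, unsigned long *rid) {
    assert(sid != NULL && rid != NULL);
    const char *dash = strrchr(sid, '-');
    if (dash == NULL || !isdigit((unsigned char)dash[1])) return false;
    char *end = NULL;
    unsigned long v = strtoul(dash + 1, &end, 10);
    if (*end != '\0') return false;      /* trailing junk after the number */
    *rid = v;
    return true;
}

/* True for domain SIDs ending in Administrator (500), Domain Admins (512),
   or Enterprise Admins (519). */
static bool is_privileged_history_sid(const char *sid) {
    unsigned long rid = 0;
    if (strncmp(sid, "S-1-5-21-", 9) != 0) return false;
    if (!sid_rid(sid, &rid)) return false;
    return rid == 500 || rid == 512 || rid == 519;
}
```

A value like S-1-5-21-1111-2222-3333-512 is flagged, while an ordinary user RID such as 1104 or a builtin SID such as S-1-5-32-544 is not.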


Graph diagram mapping four attacker techniques — SID-History Injection, NULL DACL Planting, DACL Tampering, and SACL Stripping — to their respective impacts: privileged token, world-writable object, persistent access, and audit blindspot
Each abuse primitive targets a distinct part of the SID/security-descriptor model and produces a different attacker capability, from silent credential theft to persistent object access.

11. Detection, Hunting, and Hardening

DACL and SACL changes are logged by Windows itself, not Sysmon — you must enable the right Advanced Audit Policy subcategories first (Object Access → Audit File System / Audit Registry, and Policy Change → Audit Audit Policy Change).

Event ID | Trigger | Hunt On
4670 | Object permissions changed (DACL/Owner) | ObjectName, OldSd, NewSd, SubjectUserSid
4907 | Object auditing (SACL) settings changed | Blank NewSd = SACL stripped
4715 | Audit policy on an object changed | OriginalSecurityDescriptor, NewSecurityDescriptor
4719 | System audit policy changed | SubjectUserSid, AuditPolicyChanges
4663 | Object access attempt | Sudden gaps after a 4907 on LSASS = stripping
4728/4732/4756 | Member added to privileged group | Correlate with SID manipulation
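
For the 4907 case, "blank" in practice means the NewSd string has no S: component, or an S: component with flags but no ACEs. The sketch below shows one way to test that on the raw SDDL text in portable C. The function name is hypothetical and the scan is deliberately simple; on Windows, a production parser would more likely convert the string with the system's SDDL conversion API rather than scan characters.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if the SDDL string has an S: (SACL) component with at least one ACE. */
int sddl_sacl_has_aces(const char *sddl)
{
    assert(sddl != NULL);
    int depth = 0;                               /* parenthesis depth: ACEs live at depth 1 */
    for (size_t i = 0; sddl[i] != '\0'; ++i) {
        char c = sddl[i];
        if (c == '(') { ++depth; continue; }
        if (c == ')') { if (depth > 0) --depth; continue; }

        /* A top-level ':' preceded by 'S' is the SACL tag. */
        if (depth == 0 && c == ':' && i > 0 && sddl[i - 1] == 'S') {
            /* Skip SACL flags such as "AI" until an ACE or the next component tag. */
            for (size_t j = i + 1; sddl[j] != '\0'; ++j) {
                if (sddl[j] == '(') return 1;    /* first ACE found        */
                if (sddl[j] == ':') return 0;    /* next component reached */
            }
            return 0;                            /* "S:" with nothing after it */
        }
    }
    return 0;                                    /* no SACL component at all */
}
```

Comparing sddl_sacl_has_aces(OldSd) against sddl_sacl_has_aces(NewSd) turns "SACL stripped" into a simple true-to-false transition.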

The highest-fidelity signal is a 4907 that blanks the SACL on lsass.exe, ntds.dit, or the SAM hive — that’s pre-credential-dump preparation. Pair it with Sysmon Event ID 10 (process access to LSASS) and Event ID 1 watching for icacls.exe, cacls.exe, sc.exe sdset, and Set-Acl command lines. A Sigma sketch for DACL tampering on sensitive objects:

title: Suspicious DACL Modification on Sensitive Object
logsource:
  product: windows
  service: security
detection:
  selection:
    EventID: 4670
    ObjectName|contains:
      - '\lsass.exe'
      - '\ntds.dit'
      - '\SAM'
  condition: selection
fields:
  - SubjectUserSid
  - ObjectName
  - OldSd
  - NewSd
level: high

Hardening, in rough priority order:

  • Hunt NULL DACLs. Use AccessChk to enumerate world-writable services, keys, and files; fix them.
  • Protect the LSASS SACL and alert on any 4907 that empties it.
  • Enable SID Filtering on every trust to neutralize cross-domain sIDHistory abuse, and audit sIDHistory on a schedule.
  • Restrict SeSecurityPrivilege to Administrators and watch for its use.
  • Prefer explicit DENY over absent ALLOW, and put privileged accounts in Protected Users.

MITRE ATT&CK Mapping

Technique | MITRE ID | Detection
Access Token Manipulation | T1134 | Token/SID anomalies in logon events
SID-History Injection | T1134.005 | Non-empty sIDHistory on non-migrated accounts
File/Directory Permissions Modification | T1222.001 | 4670; icacls/cacls/Set-Acl command lines in 4688
Impair Defenses: Disable/Modify Tools | T1562.001 | 4907 blanking a SACL; 4663 gaps
Permission Groups Discovery | T1069.001 / .002 | Bulk SID/group enumeration

12. Tools

Tool | Description | Link
AccessChk | Dumps effective permissions and finds NULL/weak DACLs | learn.microsoft.com
icacls | Built-in ACL viewer/editor with SDDL shorthand | (built-in)
Get-Acl / Set-Acl | PowerShell SD read/write, exposes .Sddl | (built-in)
WinDbg | Kernel-side !sd, !token, OBJECT_HEADER inspection | learn.microsoft.com
Process Hacker | GUI view of token SIDs and object security | processhacker.sourceforge.io
WinObj | Browse Object Manager namespace and per-object security | learn.microsoft.com

Summary

  • A SID is the immutable, never-reused name Windows checks for every authorization decision — display names are cosmetic, SIDs are ground truth.
  • The access token carries the user SID plus all group SIDs (including any from sIDHistory), and the kernel compares those against an object’s DACL via nt!SeAccessCheck.
  • The SECURITY_DESCRIPTOR binds owner, group, DACL, and SACL; a present-but-NULL DACL silently grants everyone full access, while an empty DACL denies everyone.
  • SID-History injection (T1134.005) and SACL stripping (T1562.001) are the two abuse primitives worth hunting hardest — watch 4670, 4907, and non-empty sIDHistory.
  • Enable Object Access and Policy Change auditing, restrict SeSecurityPrivilege, enable SID Filtering on trusts, and baseline SDDL on sensitive objects so a tampered DACL stands out.


Fibers: User-Mode Cooperative Threads

Objective: Understand the internals of Windows fibers — how they relate to the TEB, the undocumented FIBER structure, Fiber Local Storage, and the cooperative context switch performed entirely in user mode — so defenders can recognize and detect adversarial use of fiber APIs for stealthy in-process execution.


1. Cooperative vs. Preemptive Scheduling

A thread is the Windows kernel’s unit of execution. The scheduler picks ready threads, slices CPU time, and preempts them at quantum boundaries — all driven from ntoskrnl.exe. A fiber is different: it is a unit of execution that the kernel does not know about. Fibers run inside threads, and the application — not the OS — chooses when one fiber yields and another runs.

Two consequences follow immediately:

  • A fiber switch never crosses the user/kernel boundary. No syscall is issued. SwitchToFiber lives in KernelBase.dll and returns without touching ntoskrnl.
  • From the kernel’s perspective, all activity performed by a fiber is attributed to the thread that runs it. Accessing TLS from a fiber accesses the thread’s TLS, not a per-fiber slot.

This is the root of both the elegance and the security relevance of fibers: they are coroutines built directly into the Win32 API, with stack pivots and register saves the kernel cannot see.
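
The cooperative model is easier to see in a runnable form. The sketch below is a portable analogue, not the Win32 fiber API: it uses POSIX ucontext (getcontext, makecontext, swapcontext) and assumes a glibc-style system. All names are hypothetical. One caveat is that swapcontext on Linux also saves and restores the signal mask, which involves the kernel, so it is a model of the control flow only, not of the "no syscall" property.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <ucontext.h>

static ucontext_t g_main_ctx;     /* plays the role of the main fiber   */
static ucontext_t g_worker_ctx;   /* plays the role of a worker fiber   */
static char       g_trace[8];     /* records who ran, in order          */
static size_t     g_len;

static void mark(char c)
{
    assert(g_len + 1 < sizeof g_trace);
    g_trace[g_len++] = c;
}

static void worker(void)
{
    mark('w');                                  /* first run                 */
    swapcontext(&g_worker_ctx, &g_main_ctx);    /* cooperative yield         */
    mark('W');                                  /* resumed where it left off */
    /* Returning resumes uc_link here; a Win32 fiber routine must not return. */
}

/* Runs main -> worker -> main -> worker -> main and returns the trace. */
const char *coop_switch_trace(void)
{
    static char stack[64 * 1024];               /* the worker's private stack */

    memset(g_trace, 0, sizeof g_trace);
    g_len = 0;

    int rc = getcontext(&g_worker_ctx);
    assert(rc == 0);
    (void)rc;
    g_worker_ctx.uc_stack.ss_sp   = stack;
    g_worker_ctx.uc_stack.ss_size = sizeof stack;
    g_worker_ctx.uc_link          = &g_main_ctx;
    makecontext(&g_worker_ctx, worker, 0);

    mark('m');
    swapcontext(&g_main_ctx, &g_worker_ctx);    /* run worker until it yields */
    mark('M');
    swapcontext(&g_main_ctx, &g_worker_ctx);    /* resume worker              */
    mark('m');
    return g_trace;
}
```

The trace alternates strictly between the two contexts because nothing preempts either side; each one runs until it explicitly hands control back.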


2. The Fiber Execution Model

A fiber consists of three things: a stack, a saved CPU context (registers, instruction pointer, SEH frame), and a start routine that receives an opaque parameter. A thread becomes “fiber-aware” by calling ConvertThreadToFiber; from then on it remains a fiber host until it calls ConvertFiberToThread.

Rule | Behavior
Must convert first | You cannot call SwitchToFiber from a thread until ConvertThreadToFiber runs.
Fiber function returning | If a fiber’s start routine returns, the host thread calls ExitThread and terminates.
Self-delete | If the currently running fiber calls DeleteFiber on itself, the host thread exits.
Cross-thread delete | Deleting a fiber that is the selected fiber of another thread will likely crash that thread, because its stack has just been freed.
Cross-thread switch | SwitchToFiber accepts a fiber created by a different thread; the caller becomes the new host.

These rules are load-bearing — most fiber bugs (and several known abuse primitives) come from violating them.


3. TEB Layout and the FIBER Structure

The Thread Environment Block (TEB) tracks the per-thread fiber state. Three fields matter:

Field | Type | Role
NtTib.FiberData | PVOID | Pointer to the current fiber’s FIBER structure
HasFiberData | USHORT : 1 | Bitfield set by ConvertThreadToFiberEx; indicates the thread hosts fibers
FlsData | PVOID | Pointer to the FLS slot array for the current fiber

ConvertThreadToFiberEx calls NtCurrentTeb(), checks Teb->HasFiberData, and if the thread is already a fiber returns with ERROR_ALREADY_FIBER. Otherwise it allocates a FIBER structure on the process heap via RtlAllocateHeap and stores its address in NtTib.FiberData.

The FIBER struct itself is not officially documented. The shape below is reconstructed from ReactOS sources and public symbols and is subject to change across Windows versions:

// Reconstructed from public symbols / ReactOS — illustrative only.
typedef struct _FIBER {
    PVOID    FiberData;          // lpParameter passed at creation
    PVOID    ExceptionList;      // Top of SEH chain (NT_TIB.ExceptionList)
    PVOID    StackBase;          // High end of the fiber stack
    PVOID    StackLimit;         // Low end (guard page)
    PVOID    DeallocationStack;  // Original VirtualAlloc base
    CONTEXT  FiberContext;       // Saved CPU state: RIP, RSP, RBP, RBX, ...
    ULONG    FiberFlags;         // FIBER_FLAG_FLOAT_SWITCH, etc.
    PVOID    ActivationContext;  // Per-fiber activation context stack
    PVOID    FlsSlots;           // Per-fiber FLS slot array
} FIBER, *PFIBER;

You must never read or write this structure directly. The Win32 fiber functions manage its contents; treating the returned LPVOID as opaque is part of the contract.


4. The Core Fiber API

The full surface is small. The fiber functions declared in winbase.h and fibersapi.h boil down to these:

Function | Purpose
ConvertThreadToFiber | Promote the calling thread into a fiber; required first
ConvertThreadToFiberEx | As above; accepts FIBER_FLAG_FLOAT_SWITCH
CreateFiber | Allocate stack + FIBER struct; record entry point and parameter
CreateFiberEx | As above; accepts dwStackCommitSize and flags
SwitchToFiber | Cooperative context switch to the supplied fiber
DeleteFiber | Free the fiber’s stack, context, and FIBER data
ConvertFiberToThread | Demote back to a plain thread; required to avoid leaks
GetCurrentFiber | Returns the current FIBER address (intrinsic, no CALL)
GetFiberData | Returns the lpParameter value (intrinsic, no CALL)

The exact CreateFiber signature, per MSDN:

LPVOID CreateFiber(
    SIZE_T                dwStackSize,    // 0 = default, grows up to 1 MB
    LPFIBER_START_ROUTINE lpStartAddress, // void StartRoutine(LPVOID lpParameter)
    LPVOID                lpParameter     // passed to the fiber function
);

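GetCurrentFiber and GetFiberData are not exported functions: the SDK headers define them inline on top of compiler intrinsics (__readgsqword on x64, __readfsdword on x86). GetCurrentFiber compiles to a single gs:[0x20] (x64) or fs:[0x10] (x86) read of NtTib.FiberData, and GetFiberData dereferences that pointer once more to fetch the stored lpParameter. Neither produces an import thunk or a CALL instruction, which has direct consequences for IAT-based detection.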


5. Fiber Lifecycle: A Minimal Example

This walks the canonical create → switch → yield → delete sequence. Note how g_mainFiber is the fiber identity of the original thread, returned by ConvertThreadToFiber.

#include <windows.h>
#include <stdio.h>

LPVOID g_mainFiber  = NULL;
LPVOID g_workFiber  = NULL;

VOID CALLBACK WorkerFiberProc(LPVOID lpParam) {
    printf("[worker] running on fiber %p, param=%p\n",
           GetCurrentFiber(), lpParam);

    // Cooperative yield — control returns to the main fiber.
    SwitchToFiber(g_mainFiber);

    printf("[worker] resumed; returning will ExitThread()\n");
    SwitchToFiber(g_mainFiber);   // never let the routine return
}

int main(void) {
    // Promote thread; TEB->HasFiberData becomes 1.
    g_mainFiber = ConvertThreadToFiber(NULL);

    // 64 KiB stack; entry = WorkerFiberProc; param = 0xDEADBEEF.
    g_workFiber = CreateFiber(0x10000, WorkerFiberProc, (LPVOID)0xDEADBEEF);

    SwitchToFiber(g_workFiber);   // first run of worker
    printf("[main] back from worker\n");
    SwitchToFiber(g_workFiber);   // resume worker

    DeleteFiber(g_workFiber);     // safe: not the running fiber
    ConvertFiberToThread();       // demote; release fiber bookkeeping
    return 0;
}

Forgetting ConvertFiberToThread leaks the main fiber’s FIBER allocation on the process heap. Forgetting to yield back before the worker returns terminates the host thread via ExitThread.


6. Context Switching Internals

SwitchToFiber is the heart of the API. Conceptually, it performs:

  1. Save the current CPU state (RBX, RBP, RDI, RSI, R12-R15, RSP, RIP on x64) into the current fiber’s FiberContext.
  2. Save the SEH chain head (NtTib.ExceptionList) and stack bounds (StackBase, StackLimit) into the current FIBER.
  3. If FIBER_FLAG_FLOAT_SWITCH is set, save the XMM/MMX/x87 state.
  4. Update NtTib.FiberData to point at the target FIBER.
  5. Restore the target fiber’s stack bounds, SEH chain, FLS pointer, and CPU registers.
  6. Return to the saved instruction pointer of the target — execution resumes there on the target’s stack.

Critically, this is a pure user-mode operation. No syscall, no int 2e, no ETW event from Microsoft-Windows-Kernel-Process. The host thread’s kernel-visible state (KTHREAD, ETHREAD) is unchanged; only RIP/RSP move from the kernel’s view.

; Conceptual sketch of the SwitchToFiber x64 save/restore path (not real disassembly)
mov     rax, gs:[0x20]          ; rax = current FIBER (NtTib.FiberData)
mov     [rax + FiberContextOff + Rsp], rsp
mov     [rax + FiberContextOff + Rip], <return addr>
mov     gs:[0x20], rcx          ; NtTib.FiberData = target
; ... restore target ...
mov     rsp, [rcx + FiberContextOff + Rsp]
jmp     qword [rcx + FiberContextOff + Rip]

Flow diagram showing the six steps of SwitchToFiber: saving registers, saving SEH and stack bounds, updating NtTib.FiberData, restoring target registers, and jumping to the target fiber's saved RIP — all in user mode with no syscall
SwitchToFiber completes an entire stack-and-register swap inside KernelBase.dll without issuing a single syscall or generating a kernel ETW event.

7. Fiber Local Storage (FLS)

TLS is per-thread. During a fiber switch the TEB’s TLS array is not swapped, so two fibers sharing a thread share TLS — a classic source of corruption when porting thread-based libraries to fibers. FLS solves this: it is per-fiber, and SwitchToFiber updates TEB->FlsData to the incoming fiber’s slot array.

Function | Purpose
FlsAlloc(PFLS_CALLBACK_FUNCTION) | Allocate an FLS index; optional destructor callback
FlsSetValue(DWORD, PVOID) | Store a per-fiber value at the given index
FlsGetValue(DWORD) | Read the current fiber’s value at the given index
FlsFree(DWORD) | Release the index; callbacks fire for live fibers

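On the Windows versions this layout describes, the destructor callback pointers are kept process-wide in PEB->FlsCallback; like the FIBER structure, this bookkeeping is undocumented and can move between builds. The callbacks fire on fiber deletion and thread exit and, as covered below, they are a known abuse target.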

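#include <windows.h>
#include <stdio.h>

LPVOID g_mainFiber;   // fiber identity of the host thread, set in wmain
DWORD  g_flsIndex;    // FLS index shared by all fibers; values are per-fiber

// Destructor callback registered through FlsAlloc.
VOID WINAPI OnFlsDestroy(PVOID p) {
    HeapFree(GetProcessHeap(), 0, p);
}

VOID CALLBACK FiberA(LPVOID _) {
    char *buf = (char*)HeapAlloc(GetProcessHeap(), 0, 32);
    lstrcpyA(buf, "fiber-A-private");
    FlsSetValue(g_flsIndex, buf);   // stored in this fiber's slot only
    SwitchToFiber(g_mainFiber);
    printf("[A] still mine: %s\n", (char*)FlsGetValue(g_flsIndex));
    SwitchToFiber(g_mainFiber);     // never return from a fiber routine
}

int wmain(void) {
    g_mainFiber = ConvertThreadToFiber(NULL);
    g_flsIndex  = FlsAlloc(OnFlsDestroy);
    if (g_flsIndex == FLS_OUT_OF_INDEXES) return 1;
    // ... create FiberA, FiberB, switch between them ...
    // Each fiber sees its own FlsGetValue(g_flsIndex) result.
    return 0;
}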
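
The TLS-versus-FLS distinction can also be shown without any Windows headers. The portable C sketch below is a toy model with hypothetical types and names: a "thread" owns one TLS array and a pointer to the current fiber's FLS array, and a switch moves only that pointer. It is meant to illustrate the sharing behaviour described above, not the real TEB layout.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 4

typedef struct { void *fls[TOY_SLOTS]; } ToyFiber;                      /* per-fiber slots  */
typedef struct { void *tls[TOY_SLOTS]; ToyFiber *current; } ToyThread;  /* per-thread state */

/* Models the relevant part of a fiber switch: only the FLS pointer moves. */
void toy_switch(ToyThread *t, ToyFiber *target)
{
    assert(t != NULL && target != NULL);
    t->current = target;                /* the TLS array is deliberately untouched */
}

void  toy_tls_set(ToyThread *t, size_t i, void *v) { assert(i < TOY_SLOTS); t->tls[i] = v; }
void *toy_tls_get(const ToyThread *t, size_t i)    { assert(i < TOY_SLOTS); return t->tls[i]; }

void toy_fls_set(ToyThread *t, size_t i, void *v)
{
    assert(i < TOY_SLOTS && t->current != NULL);
    t->current->fls[i] = v;
}

void *toy_fls_get(const ToyThread *t, size_t i)
{
    assert(i < TOY_SLOTS && t->current != NULL);
    return t->current->fls[i];
}

/* Fibers A and B both write slot 0, then A reads back.
   Bit 0 set = A's FLS value survived; bit 1 set = A's TLS value survived. */
int toy_slot_demo(void)
{
    static char a_val, b_val;
    ToyFiber  a = { { 0 } }, b = { { 0 } };
    ToyThread t = { { 0 }, NULL };

    toy_switch(&t, &a);
    toy_tls_set(&t, 0, &a_val);
    toy_fls_set(&t, 0, &a_val);

    toy_switch(&t, &b);
    toy_tls_set(&t, 0, &b_val);         /* overwrites A's TLS value */
    toy_fls_set(&t, 0, &b_val);         /* lands in B's own slot    */

    toy_switch(&t, &a);
    int result = 0;
    if (toy_fls_get(&t, 0) == &a_val) result |= 1;
    if (toy_tls_get(&t, 0) == &a_val) result |= 2;
    return result;
}
```

toy_slot_demo returns 1 in this model: fiber A gets its FLS value back but finds fiber B's value in the shared TLS slot, which is the corruption pattern thread-based libraries hit when moved onto fibers.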

Hierarchy diagram showing how PEB holds FlsCallback destructor pointers, TEB holds NtTib.FiberData pointing to the FIBER structure and FlsData pointing to the per-fiber FLS slot array, with the destructor relationship between PEB FlsCallback and the slot array
FLS slot arrays are swapped per-fiber on every SwitchToFiber call, while PEB→FlsCallback holds process-wide destructor pointers that fire on fiber deletion — a known adversarial overwrite target.

8. Building a Round-Robin Cooperative Scheduler

Fibers shine when modeling cooperative pipelines: parsers, generators, state machines. A trivial scheduler is a dispatcher fiber that round-robins through worker fibers, each of which yields back via SwitchToFiber(g_mainFiber).

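#include <windows.h>
#include <stdio.h>

#define N 3
LPVOID g_workers[N];
LPVOID g_mainFiber;

VOID CALLBACK Worker(LPVOID id) {
    for (int i = 0; i < 4; ++i) {
        // Cast explicitly so %llu is correct on both x86 and x64 builds.
        printf("[worker %llu] step %d\n", (unsigned long long)(ULONG_PTR)id, i);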
        SwitchToFiber(g_mainFiber);   // yield
    }
    // Final yield — never return from a fiber routine.
    SwitchToFiber(g_mainFiber);
}

int main(void) {
    g_mainFiber = ConvertThreadToFiber(NULL);
    for (ULONG_PTR i = 0; i < N; ++i)
        g_workers[i] = CreateFiber(0, Worker, (LPVOID)i);

    for (int round = 0; round < 4; ++round)
        for (int i = 0; i < N; ++i)
            SwitchToFiber(g_workers[i]);

    for (int i = 0; i < N; ++i) DeleteFiber(g_workers[i]);
    ConvertFiberToThread();
    return 0;
}

This is the same pattern Microsoft SQL Server used for its historical “lightweight pooling” / fiber mode — one OS thread, many SQL user contexts.


9. Legitimate Use Cases and Pitfalls

Use Case | Reason
Coroutines / generators | Native stack switching with no setjmp tricks
Porting cooperative legacy code | UNIX swapcontext-style schedulers map cleanly
Database engines | SQL Server fiber mode for high-concurrency workloads
Game engines / scripting hosts | Per-script execution context with explicit yield

Pitfalls are sharp:

  • COM apartments have thread affinity, not fiber affinity. Initializing COM on one fiber and using it from another corrupts COM bookkeeping.
  • CRT and many MS libraries stash state in TLS. Switching fibers leaves that state behind, producing subtle corruption.
  • Critical sections record the thread as the owner — a different fiber on the same thread re-enters without blocking.
  • Exception handling depends on state that must move with the stack: the SEH chain head on x86 and the TEB stack bounds (StackBase/StackLimit) that the unwinder validates on x64. SwitchToFiber swaps both; a hand-rolled stack switch that skips them breaks __try/__except dispatch.

10. Common Attacker Techniques

Fibers are attractive to adversaries because the entire execution primitive lives in user mode — no NtCreateThread, no CreateRemoteThread, no kernel ETW event for the act of switching execution. The patterns below are documented in public threat-research literature; described conceptually here for detection engineers.

Technique | Description
In-process shellcode via SwitchToFiber | Allocate PAGE_EXECUTE_READWRITE memory, copy a payload, call ConvertThreadToFiber then CreateFiber with the payload as lpStartAddress, then SwitchToFiber; execution begins with no new thread
Fiber-based ROP staging | A fiber’s saved CONTEXT includes RIP and RSP; manipulating a FIBER struct’s context fields lets an attacker pivot the stack on SwitchToFiber
PEB->FlsCallback overwrite | Overwrite an entry in the process-wide FLS callback array; on the next FlsFree or fiber/thread teardown the attacker-controlled pointer is invoked with attacker-controlled data
TLS evasion via FLS | Hide per-task state in FLS slots that defensive tooling enumerating TLS will miss
API hiding via intrinsics | GetCurrentFiber/GetFiberData produce no IAT entry; static analysis missing gs:[0x20] reads will not see fiber-aware code

The base ATT&CK parent for fiber-based in-process execution is T1055 Process Injection; MITRE has not assigned a fiber-specific sub-technique, so the closest analogue is T1055.004 (APC) which shares the “queue execution to a thread’s user-mode context” model.


11. Defensive Strategies & Detection

There is no kernel event for SwitchToFiber. Detection must focus on the setup that precedes fiber-based execution (RWX allocation, suspicious entry points) and on memory forensics of fiber state at rest.

Sysmon coverage for the surrounding behavior:

Event ID | Signal
1 | Process Create: establish baseline lineage
8 | CreateRemoteThread: co-occurs with cross-process fiber staging
10 | ProcessAccess: reflective loaders reading remote memory before fiber dispatch
17/18 | Named-pipe create/connect: common multi-stage loader IPC
25 | ProcessTampering: image-region tampering in a fiber host

ETW providers worth subscribing:

  • Microsoft-Windows-Threat-Intelligence — flags VirtualAlloc/VirtualProtect with PAGE_EXECUTE_*, the precursor to fiber shellcode staging.
  • Microsoft-Windows-Kernel-Process — does not see fiber switches but covers process/thread lifecycle.
  • A user-mode consumer hooking NtAllocateVirtualMemory + NtProtectVirtualMemory gives the strongest pre-execution signal.

Memory forensics indicators:

  • Walk TEB.NtTib.FiberData on every thread. Threads with HasFiberData == 1 in processes that have no business using fibers are immediately interesting.
  • Use Volatility malfind to surface private, executable, non-image-backed pages — the target of a fiber-staged payload.
  • Dump PEB->FlsCallback and verify every entry resolves to an expected module’s .text section.
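
The callback check in the last bullet reduces to a range test once the callback pointers and module .text ranges have been extracted from a dump. The portable C sketch below assumes that extraction has already happened elsewhere; the TextRange type, the function names, and the addresses in the demo are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uintptr_t start;   /* first byte of a module's .text section */
    uintptr_t end;     /* one past the last byte                 */
} TextRange;

/* Returns 1 if p falls inside any known .text range. */
int ptr_in_text(uintptr_t p, const TextRange *ranges, size_t n)
{
    assert(ranges != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        if (p >= ranges[i].start && p < ranges[i].end)
            return 1;
    return 0;
}

/* Counts non-NULL callback pointers that resolve outside every known .text range. */
size_t count_suspicious_callbacks(const uintptr_t *cbs, size_t ncb,
                                  const TextRange *ranges, size_t nr)
{
    assert(cbs != NULL || ncb == 0);
    size_t hits = 0;
    for (size_t i = 0; i < ncb; ++i) {
        if (cbs[i] == 0)
            continue;                       /* unused slot */
        if (!ptr_in_text(cbs[i], ranges, nr))
            ++hits;
    }
    return hits;
}

/* Usage with made-up addresses: one legitimate callback, one on a private page. */
size_t demo_callback_audit(void)
{
    static const TextRange text[] = {
        { 0x10001000u, 0x10080000u },       /* hypothetical module A .text */
        { 0x20001000u, 0x20040000u },       /* hypothetical module B .text */
    };
    static const uintptr_t cbs[] = { 0u, 0x10002340u, 0x00a40000u };
    return count_suspicious_callbacks(cbs, 3, text, 2);
}
```

Any nonzero count is worth pivoting on: a destructor pointer into private, non-image memory has few benign explanations.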

Sigma sketch for the cross-process precursor to fiber-based payload staging:

title: Suspicious ProcessAccess Preceding User-Mode Fiber Execution
id: 8f5c1d6e-3c7b-4b1f-9e1e-7e3e6e2b0a1f
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10
    GrantedAccess:
      - '0x1fffff'   # PROCESS_ALL_ACCESS
      - '0x1f0fff'
    TargetImage|endswith:
      - '\explorer.exe'
      - '\svchost.exe'
  filter_legit:
    SourceImage|endswith:
      - '\MsMpEng.exe'
      - '\SenseIR.exe'
  condition: selection and not filter_legit
level: high
tags:
  - attack.t1055
  - attack.t1106
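
The rule above matches two literal GrantedAccess values. A consumer that post-processes Sysmon events could instead test the bits that matter for writing into another process. The portable C sketch below assumes GrantedAccess arrives as a hex string such as 0x1fffff; the function name is hypothetical, and the two constants are the standard PROCESS_VM_OPERATION (0x0008) and PROCESS_VM_WRITE (0x0020) access rights.

```c
#include <assert.h>
#include <stdlib.h>

#define ACC_PROCESS_VM_OPERATION 0x0008ul
#define ACC_PROCESS_VM_WRITE     0x0020ul

/* Returns 1 if the hex GrantedAccess string includes both rights needed
   to change and write memory in the target process, else 0. */
int granted_access_allows_write(const char *hex)
{
    assert(hex != NULL);
    char *end = NULL;
    unsigned long mask = strtoul(hex, &end, 16);    /* accepts an optional 0x prefix */
    if (end == hex || *end != '\0')
        return 0;                                   /* not a clean hex value */

    const unsigned long needed = ACC_PROCESS_VM_OPERATION | ACC_PROCESS_VM_WRITE;
    return (mask & needed) == needed;
}
```

Testing bits rather than exact values also catches handles opened with a trimmed-down mask chosen to avoid the well-known 0x1fffff signature.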

Hardening:

  • SetProcessMitigationPolicy with ProcessDynamicCodePolicy (Arbitrary Code Guard) blocks creation of new executable pages, defeating fiber shellcode staging.
  • Control Flow Guard restricts indirect-call targets, narrowing SwitchToFiber and FLS-callback abuse to valid entry points.
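  • HVCI / memory integrity, together with the vulnerable-driver blocklist, keeps unsigned and known-vulnerable drivers from loading, closing off kernel-side tampering with a process’s FIBER structures.
  • WDAC / AppLocker policies restrict which binaries and scripts may run, which raises the cost of delivering the loader behind any in-process execution primitive. They do not govern PAGE_EXECUTE_* allocations; that is ACG’s job.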

Graph diagram mapping fiber abuse detection signals: RWX allocation feeding ETW Threat-Intelligence provider and Sysmon events, memory forensics walking PEB FlsCallback for non-text-section pointers, and ACG/CFG/HVCI as hardening mitigations
Because SwitchToFiber produces no kernel telemetry, defenders must pivot to pre-execution signals like RWX allocations, memory forensics on FiberData and FlsCallback, and ACG to deny executable page creation entirely.

12. Tools for Fiber Analysis

Tool | Description | Link
WinDbg | Dump TEB, walk NtTib.FiberData, inspect FIBER.FiberContext | microsoft.com
Process Hacker | Enumerate threads, inspect TEB, examine private RWX regions | processhacker.sf.io
Process Monitor | Capture file, registry, process/thread, and image-load activity around fiber dispatch (it does not trace VirtualAlloc/VirtualProtect calls) | sysinternals.com
Volatility 3 | windows.malfind, TEB plugins, FLS callback inspection | volatilityfoundation.org
pykd / WinDbg JS | Scripted walks of FIBER chains across all threads | githomelab.ru/pykd
x64dbg | User-mode debugging of fiber-aware binaries; trace gs:[0x20] reads | x64dbg.com
Ghidra | Static analysis; recognize GetCurrentFiber intrinsic pattern | ghidra-sre.org
Sysmon | Surrounding telemetry (Events 1, 8, 10, 25) | sysinternals.com

A minimal WinDbg recipe to surface fiber-hosting threads in a captured process:

0:000> !teb
TEB at 000000abcd123000
    ...
    NtTib.FiberData:  0000020fabcde000
    ...
0:000> dt ntdll!_TEB @$teb HasFiberData
0:000> dq 0000020fabcde000 L40   ; raw FIBER bytes — layout version-dependent

13. MITRE ATT&CK Mapping

Technique | MITRE ID | Detection
Process Injection | T1055 | Memory scan for private RWX regions; ETW TI on NtAllocateVirtualMemory
Process Injection: Asynchronous Procedure Call | T1055.004 | Closest published sub-technique to fiber-based in-process execution
Native API | T1106 | API-call auditing of CreateFiber/SwitchToFiber/FlsAlloc
Reflective Code Loading | T1620 | Image-load anomalies; fiber entry point in non-image-backed memory
Impair Defenses: Disable or Modify Tools | T1562.001 | ETW/AMSI hook integrity checks; user-mode hook auditing

MITRE ATT&CK does not currently list a “Fiber Injection” sub-technique (current as of v16.1). Vendor research treats fiber-based execution as a variant of T1055; map accordingly.


Summary

  • A fiber is a user-mode cooperative thread invisible to the kernel scheduler — SwitchToFiber performs a stack and register swap entirely in KernelBase.dll with no syscall.
  • The TEB exposes the fiber state via NtTib.FiberData, HasFiberData, and FlsData; the FIBER structure itself is undocumented and version-dependent.
  • TLS is per-thread and is not swapped on a fiber switch; FLS is per-fiber and is swapped, with destructor callbacks tracked in PEB->FlsCallback.
  • Adversaries abuse fibers for in-process shellcode execution, ROP staging via the saved CONTEXT, and code execution via PEB->FlsCallback overwrites — none of which trigger thread-creation telemetry.
  • Detect via pre-execution signals (ETW TI on RWX allocations, Sysmon Event IDs 8/10/25), memory forensics on private executable regions and FlsCallback integrity, and hardening with ACG, CFG, and HVCI.


Jobs and Silos: Process Grouping and Resource Limits

Objective: Understand how the Windows kernel uses Job Objects and Silo Objects to group processes, enforce CPU/memory/network limits, and provide the namespace isolation that underpins Windows containers — and how defenders detect and harden against their abuse.


1. What Is a Job Object?

A job object lets a group of processes be managed as a single unit. It is a namable, securable, sharable kernel object that controls attributes of every process associated with it; operations on the job — limits, termination, accounting — apply to all member processes at once.

In the kernel the object is the undocumented executive type EJOB, allocated from kernel pool. Each EPROCESS carries a Job pointer linking the process to its owning job. User mode never touches EJOB directly; it operates through a handle returned by CreateJobObject.

Before Windows 8 / Windows Server 2012, a process could belong to one job and jobs could not be nested. Windows 8 introduced nested jobs, allowing a process to participate in a hierarchy where the effective limit is the most restrictive ancestor.

Object Type | Description
EJOB | Kernel job object; groups processes, holds limits and accounting
EPROCESS.Job | Per-process pointer to its owning job
Named job | Job published under \Sessions\<N>\BaseNamedObjects\, openable by name
Anonymous job | Handle-only job, no namespace entry, shared by duplication/inheritance

Hierarchy diagram showing a user-mode handle referencing the kernel EJOB object, which links to three EPROCESS member processes via Job pointers
A single EJOB kernel object anchors all member processes; user mode accesses it only through an opaque handle.

2. Core Job Object APIs

The job lifecycle is driven by a small, stable Win32 surface.

Function | Purpose
CreateJobObject | Create, or open if named, a job object
OpenJobObject | Open an existing named job
AssignProcessToJobObject | Add a process to a job
SetInformationJobObject | Apply limits and policy to the job
QueryInformationJobObject | Read limits, accounting, and peak usage
TerminateJobObject | Kill every process in the job
IsProcessInJob | Test whether a process already belongs to a job

HANDLE CreateJobObject(LPSECURITY_ATTRIBUTES lpJobAttributes, LPCWSTR lpName);
BOOL   AssignProcessToJobObject(HANDLE hJob, HANDLE hProcess);
BOOL   SetInformationJobObject(HANDLE hJob, JOBOBJECTINFOCLASS JobObjectInformationClass,
                               LPVOID lpJobObjectInformation, DWORD cbJobObjectInformationLength);
BOOL   QueryInformationJobObject(HANDLE hJob, JOBOBJECTINFOCLASS JobObjectInformationClass,
                                 LPVOID lpJobObjectInformation, DWORD cbJobObjectInformationLength,
                                 LPDWORD lpReturnLength);
BOOL   TerminateJobObject(HANDLE hJob, UINT uExitCode);

3. Basic Limits: CPU, Memory, and Process Count

JOBOBJECT_BASIC_LIMIT_INFORMATION carries the foundational controls.

typedef struct _JOBOBJECT_BASIC_LIMIT_INFORMATION {
  LARGE_INTEGER PerProcessUserTimeLimit;
  LARGE_INTEGER PerJobUserTimeLimit;
  DWORD         LimitFlags;
  SIZE_T        MinimumWorkingSetSize;
  SIZE_T        MaximumWorkingSetSize;
  DWORD         ActiveProcessLimit;
  ULONG_PTR     Affinity;
  DWORD         PriorityClass;
  DWORD         SchedulingClass;
} JOBOBJECT_BASIC_LIMIT_INFORMATION;

The LimitFlags bitmask selects which fields the kernel enforces.

Limit Flag | Description
JOB_OBJECT_LIMIT_PROCESS_TIME | Per-process user-mode CPU cap (100 ns ticks); process killed when exceeded
JOB_OBJECT_LIMIT_JOB_TIME | Job-wide CPU time cap
JOB_OBJECT_LIMIT_WORKINGSET | Min/max working set per process
JOB_OBJECT_LIMIT_ACTIVE_PROCESS | Caps active process count; over-limit assignment terminates the process
JOB_OBJECT_LIMIT_AFFINITY | Forces a processor affinity mask
JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE | Kills all processes when the last job handle closes
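
The two time-limit fields are expressed in 100 ns ticks, which is an easy unit to get wrong by a factor of ten. As a small portable sketch (the helper name is hypothetical), the conversion from milliseconds to the value you would store in PerProcessUserTimeLimit.QuadPart or PerJobUserTimeLimit.QuadPart looks like this:

```c
#include <assert.h>
#include <stdint.h>

#define TICKS_PER_MS 10000LL   /* 1 ms = 10,000 intervals of 100 ns */

/* Converts a CPU-time budget in milliseconds to 100 ns ticks. */
int64_t job_time_limit_from_ms(int64_t ms)
{
    assert(ms >= 0 && ms <= INT64_MAX / TICKS_PER_MS);   /* reject overflow */
    return ms * TICKS_PER_MS;
}
```

A 1.5 second budget is therefore 15,000,000 ticks.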

JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE is the cornerstone of any sandbox: if the controlling process dies, the entire tree is reaped, leaving no orphaned children.

#include <windows.h>

int main(void) {
    // W-suffixed APIs so the wide string literals compile with or without UNICODE.
    HANDLE hJob = CreateJobObjectW(NULL, L"Sandbox_Demo");   // named for observability
    if (!hJob) return (int)GetLastError();

    JOBOBJECT_EXTENDED_LIMIT_INFORMATION eli = { 0 };
    eli.BasicLimitInformation.LimitFlags =
        JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE |   // tear down tree on handle loss
        JOB_OBJECT_LIMIT_ACTIVE_PROCESS;       // bound process count
    eli.BasicLimitInformation.ActiveProcessLimit = 4;
    SetInformationJobObject(hJob, JobObjectExtendedLimitInformation, &eli, sizeof(eli));

    STARTUPINFOW si = { sizeof(si) };
    PROCESS_INFORMATION pi = { 0 };
    // Create suspended so we can assign before any code runs
    if (!CreateProcessW(L"C:\\Windows\\System32\\notepad.exe", NULL, NULL, NULL,
                        FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi)) {
        DWORD err = GetLastError();            // capture before CloseHandle can change it
        CloseHandle(hJob);
        return (int)err;
    }

    AssignProcessToJobObject(hJob, pi.hProcess);
    ResumeThread(pi.hThread);

    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    CloseHandle(hJob);   // KILL_ON_JOB_CLOSE terminates notepad here
    return 0;
}

4. Extended and Rate Limits

JOBOBJECT_EXTENDED_LIMIT_INFORMATION embeds the basic structure as BasicLimitInformation and adds memory governance: ProcessMemoryLimit (per-process commit, needs JOB_OBJECT_LIMIT_PROCESS_MEMORY), JobMemoryLimit (job-wide commit, needs JOB_OBJECT_LIMIT_JOB_MEMORY), and the continuously tracked PeakProcessMemoryUsed / PeakJobMemoryUsed. The two memory limits are independent — a 100 MB job-wide cap can coexist with a 10 MB per-process cap.

JOBOBJECT_EXTENDED_LIMIT_INFORMATION eli = { 0 };
eli.BasicLimitInformation.LimitFlags =
    JOB_OBJECT_LIMIT_PROCESS_MEMORY | JOB_OBJECT_LIMIT_JOB_MEMORY;
eli.ProcessMemoryLimit = 10  * 1024 * 1024;   // 10 MB per process
eli.JobMemoryLimit     = 100 * 1024 * 1024;   // 100 MB job-wide (independent)
SetInformationJobObject(hJob, JobObjectExtendedLimitInformation, &eli, sizeof(eli));

DWORD ret = 0;
QueryInformationJobObject(hJob, JobObjectExtendedLimitInformation, &eli, sizeof(eli), &ret);
printf("PeakJobMemoryUsed: %zu bytes\n", eli.PeakJobMemoryUsed);

CPU throttling uses JOBOBJECT_CPU_RATE_CONTROL_INFORMATION.

typedef struct _JOBOBJECT_CPU_RATE_CONTROL_INFORMATION {
  DWORD ControlFlags;
  union {
    DWORD CpuRate;
    DWORD Weight;
    struct { WORD MinRate; WORD MaxRate; } DUMMYSTRUCTNAME;
  } DUMMYUNIONNAME;
} JOBOBJECT_CPU_RATE_CONTROL_INFORMATION;

Control Flag | Value | Behaviour
JOB_OBJECT_CPU_RATE_CONTROL_ENABLE | 0x1 | Enables CPU rate control
JOB_OBJECT_CPU_RATE_CONTROL_WEIGHT_BASED | 0x2 | Rate derived from relative weight vs. other jobs
JOB_OBJECT_CPU_RATE_CONTROL_HARD_CAP | 0x4 | Hard cap; no job threads run after the budget is spent until next interval
JOB_OBJECT_CPU_RATE_CONTROL_NOTIFY | 0x8 | Notifies when the rate limit is exceeded

JOBOBJECT_CPU_RATE_CONTROL_INFORMATION cpu = { 0 };
cpu.ControlFlags = JOB_OBJECT_CPU_RATE_CONTROL_ENABLE |
                   JOB_OBJECT_CPU_RATE_CONTROL_HARD_CAP;
cpu.CpuRate = 2000;   // 20.00% of total CPU cycles per scheduling interval (units of 1/100 percent)

// Windows containers (non-Hyper-V) use weight-based control instead:
// cpu.ControlFlags = JOB_OBJECT_CPU_RATE_CONTROL_ENABLE |
//                    JOB_OBJECT_CPU_RATE_CONTROL_WEIGHT_BASED;
// cpu.Weight = 5;    // relative scheduling weight

SetInformationJobObject(hJob, JobObjectCpuRateControlInformation, &cpu, sizeof(cpu));
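
CpuRate is expressed as cycles per 10,000 cycles, so a percentage has to be multiplied by 100, and zero is not a valid rate. A small portable helper makes the unit explicit; the function name is hypothetical, and returning 0 to signal "do not pass this to SetInformationJobObject" is a convention chosen for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a percentage (0 < percent <= 100) to CpuRate units (1..10000).
   Returns 0 for input outside that range, including NaN. */
uint32_t cpu_rate_from_percent(double percent)
{
    if (!(percent > 0.0) || percent > 100.0)
        return 0;

    uint32_t rate = (uint32_t)(percent * 100.0 + 0.5);   /* round to nearest unit */
    if (rate < 1)
        rate = 1;                                        /* smallest valid rate   */
    assert(rate <= 10000);
    return rate;
}
```

With this helper, cpu_rate_from_percent(20.0) yields the 2000 used in the snippet above.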

Network bandwidth is bounded with JOBOBJECT_NET_RATE_CONTROL_INFORMATION, which sets MaxBandwidth (outgoing bytes), a DscpTag, and ControlFlags for scheduling policy.


5. Notification Limits and I/O Completion Ports

Not every limit should kill. JOBOBJECT_NOTIFICATION_LIMIT_INFORMATION defines soft limits that alert without termination, covering IoReadBytesLimit, IoWriteBytesLimit, per-job user time, and job memory. To receive these alerts, associate an I/O completion port via JOBOBJECT_ASSOCIATE_COMPLETION_PORT.

Completion Message | Meaning
JOB_OBJECT_MSG_NEW_PROCESS | A process was added to the job
JOB_OBJECT_MSG_EXIT_PROCESS | A member process exited
JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO | Job is now empty
JOB_OBJECT_MSG_JOB_MEMORY_LIMIT | Job-wide commit limit was hit

HANDLE hPort = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);

JOBOBJECT_ASSOCIATE_COMPLETION_PORT acp = { 0 };
acp.CompletionKey  = hJob;     // echoed back as the key
acp.CompletionPort = hPort;
SetInformationJobObject(hJob, JobObjectAssociateCompletionPortInformation, &acp, sizeof(acp));

DWORD msg; ULONG_PTR key; LPOVERLAPPED ov;
while (GetQueuedCompletionStatus(hPort, &msg, &key, &ov, INFINITE)) {
    switch (msg) {
        case JOB_OBJECT_MSG_NEW_PROCESS:         /* child started   */ break;
        case JOB_OBJECT_MSG_JOB_MEMORY_LIMIT:    /* commit cap hit   */ break;
        case JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO: return 0;  // job empty
    }
}
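
The loop above only dispatches on the message type. As a sketch of what a monitor might do with the stream, the following Python model replays (message, value) pairs, where value stands for the process ID that the job object passes in the lpOverlapped slot for these messages. The function name track and the sample stream are hypothetical; the numeric message values are the JOB_OBJECT_MSG_* constants.

```python
# Numeric values of the JOB_OBJECT_MSG_* constants used here.
JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO = 4
JOB_OBJECT_MSG_NEW_PROCESS = 6
JOB_OBJECT_MSG_EXIT_PROCESS = 7
JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS = 8
JOB_OBJECT_MSG_JOB_MEMORY_LIMIT = 10


def track(messages):
    """Replay (msg, pid) pairs; return (live_pids, alerts)."""
    live, alerts = set(), []
    for msg, pid in messages:
        if msg == JOB_OBJECT_MSG_NEW_PROCESS:
            live.add(pid)
        elif msg in (JOB_OBJECT_MSG_EXIT_PROCESS,
                     JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS):
            live.discard(pid)
        elif msg == JOB_OBJECT_MSG_JOB_MEMORY_LIMIT:
            alerts.append(("job-memory-limit", pid))
        elif msg == JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO:
            break  # job is empty; stop listening
    return live, alerts


# Hypothetical stream: two children start, one hits the commit cap, both exit.
stream = [(6, 100), (6, 200), (10, 200), (7, 100), (7, 200), (4, 0)]
live, alerts = track(stream)
print(live, alerts)
```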

6. Nested Jobs

On Windows 8 and later, assigning an already-jobbed process to a second job nests it. The kernel computes the effective limit as the minimum of the chain — a child job can only tighten, never loosen, an ancestor’s constraint.

// Parent job: 200 MB job-wide commit
HANDLE hParent = CreateJobObject(NULL, NULL);
JOBOBJECT_EXTENDED_LIMIT_INFORMATION p = { 0 };
p.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_JOB_MEMORY;
p.JobMemoryLimit = 200 * 1024 * 1024;
SetInformationJobObject(hParent, JobObjectExtendedLimitInformation, &p, sizeof(p));
AssignProcessToJobObject(hParent, hProc);

// Child job nested under parent: 100 MB
HANDLE hChild = CreateJobObject(NULL, NULL);
JOBOBJECT_EXTENDED_LIMIT_INFORMATION c = { 0 };
c.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_JOB_MEMORY;
c.JobMemoryLimit = 100 * 1024 * 1024;
SetInformationJobObject(hChild, JobObjectExtendedLimitInformation, &c, sizeof(c));
AssignProcessToJobObject(hChild, hProc);   // Win8+ nests automatically

// Effective limit on hProc = min(200 MB, 100 MB) = 100 MB
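
The "minimum of the chain" rule can be stated as a small Python model. This is a sketch of the arithmetic only, not of the kernel's implementation; effective_limit is a hypothetical helper, and None stands for a job in the chain that sets no memory limit.

```python
MB = 1024 * 1024


def effective_limit(chain):
    """chain: job memory limits in bytes, outermost ancestor first.

    None means that job sets no limit. Returns the tightest limit,
    or None when no job in the chain constrains memory.
    """
    limits = [limit for limit in chain if limit is not None]
    return min(limits) if limits else None


# Parent 200 MB, child 100 MB: the child tightens the constraint.
print(effective_limit([200 * MB, 100 * MB]) // MB)
# Parent 100 MB, child 200 MB: the child cannot loosen it.
print(effective_limit([100 * MB, 200 * MB]) // MB)
```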

For pre-Windows 8 compatibility, test membership first: on those systems jobs cannot nest, so assigning a process that already belongs to a job fails (typically with ERROR_ACCESS_DENIED).

BOOL inJob = FALSE;
if (IsProcessInJob(hProc, NULL, &inJob) && inJob) {   // NULL JobHandle = "any job"
    // Windows 7: cannot reassign (no nesting). Windows 8+: assignment nests.
}
if (!AssignProcessToJobObject(hJob, hProc)) {
    DWORD err = GetLastError();   // e.g. ERROR_ACCESS_DENIED when nesting is unavailable
}

Hierarchy diagram illustrating how the kernel computes the effective limit as the minimum across a nested job chain before applying it to a member process
Nested jobs only tighten constraints — the kernel enforces the most restrictive ancestor limit at every level.

7. Inspecting Jobs at Runtime

Process Explorer and Process Hacker display a process’s job membership and its limits on a dedicated Job tab. WinObj reveals named job objects in the Object Manager namespace. In kernel debugging, walk and dump jobs directly.

0: kd> !process 0 0 notepad.exe          ; find the EPROCESS
0: kd> dt nt!_EPROCESS Job <EPROCESS>    ; read the Job pointer
0: kd> !job <EJOB-address>               ; dump limits and member list
0: kd> dt nt!_EJOB JobFlags              ; locate the silo/flags field

These are observation tools, not attack tooling — they let an analyst confirm exactly which processes share a job and what limits are in force.


8. Silos: From Jobs to Containers

Jobs alone do not isolate the namespace — they constrain resources but not what a process can name or see. Microsoft solved this with silos, effectively “super jobs.” A silo is a job object with the Silo flag set in the EJOB.JobFlags field.

There are two silo types:

Silo Type | Use | Privilege
Application silo | Desktop Bridge / MSIX app isolation | Standard
Server silo | Windows (Docker) container support | Administrator

When a silo is created, the kernel builds it its own root directory object, distinct from the host root — giving the silo a private object namespace. A server silo further owns an _ESERVERSILO_GLOBALS structure holding container-specific state, and is backed by a virtual disk, a registry hive, and a virtual network adapter.

Kernel Function | Purpose
PsCreateSilo / PsCreateServerSilo | Create silo / server silo objects
PsAttachSiloToCurrentThread / PsDetachSiloFromCurrentThread | Bind/unbind a thread to a silo context
PsGetThreadServerSilo | Return the server silo a thread runs in
PsIsCurrentThreadInServerSilo | Boolean gate used to restrict syscalls inside a container

; For understanding only — JobFlags layout is build-specific and undocumented.
0: kd> dt nt!_EJOB JobFlags
   +0x0?? JobFlags : Uint4B    ; a bit in this field marks the job as a silo

The _EJOB, _ESERVERSILO_GLOBALS, and JobFlags offsets are undocumented and shift between OS builds. Validate them against your target build with WinDbg dt before treating any offset as authoritative.


Hierarchy diagram showing the progression from a plain Job Object to a Silo with a private namespace, and further to a Server Silo owning container-specific state including registry hive and virtual network adapter
Silos extend job objects with namespace isolation; server silos layer on full container state to back Windows Server containers.

9. Windows Containers and the Host Compute Service

Windows Server containers are built on server silos. The Host Compute Service (HCS) orchestrates their lifecycle, wiring up the silo’s job-object resource controls, registry hive virtualization, and filesystem isolation. The filesystem layer is enforced by wcifs.sys, the Windows Container Isolation Filter Driver, which projects the container’s view over the host volume.

Mode | Boundary | Notes
--isolation=process | Server silo, shared host kernel | Lighter, but escapes reach the host kernel
--isolation=hyperv | Utility VM + inner job object | VM enforces limits even if the inner job is escaped

Process isolation shares the host kernel, which makes server-silo escape research directly relevant to defenders. Hyper-V isolation applies controls at both the VM and the inner container job object — a job escape still cannot exceed VM-level limits.


Flow diagram showing the Host Compute Service orchestrating a Server Silo, which interacts with the wcifs.sys isolation filter driver, with an optional Hyper-V VM layer applying additional limits
The HCS wires together the server silo, wcifs.sys filesystem filter, and optional Hyper-V VM boundary to form a complete Windows container stack.

10. Common Attacker Techniques

Technique | Description
Sandbox-aware keying | Payload detects a constrained job (low ActiveProcessLimit, tight memory cap) and alters behaviour to evade analysis
Debugger / UI blocking | Setting JOB_OBJECT_UILIMIT_HANDLES or JOB_OBJECT_UILIMIT_EXITWINDOWS to deny security-tool UI/handle access within the job
Breakaway abuse | Using JOB_OBJECT_LIMIT_BREAKAWAY_OK so child processes escape a controlling job’s limits and accounting
Child-tree concealment | Wrapping persistent processes in a job to manage and hide their descendant trees
Container / silo escape | Breaking out of a server silo’s namespace root to reach the host OS

Adversaries also call the job APIs directly (CreateJobObject, AssignProcessToJobObject, SetInformationJobObject, or their Nt* counterparts in ntdll) to construct their own sandboxes around tooling, or to apply quotas that frustrate dynamic analysis.


11. Defensive Strategies & Detection

There is no dedicated Sysmon event for CreateJobObject or AssignProcessToJobObject as of Sysmon v15 — job manipulation is caught indirectly via process access, process creation, and ETW.

Sysmon Event ID | Relevance
1 (Process Create) | Children spawned under sandboxed jobs; correlate unusual ParentImage / IntegrityLevel
10 (Process Access) | OpenProcess with PROCESS_SET_QUOTA (0x100) or PROCESS_ALL_ACCESS (0x1fffff) preceding job assignment
17 / 18 (Pipe Created/Connected) | Named pipes visible across a silo namespace boundary during lateral movement

ETW Provider | What It Logs
Microsoft-Windows-Kernel-Process | Process/thread lifecycle; job assignments surface as ProcessSetJobObjectInformation events
Microsoft-Windows-Security-Auditing | Process creation (Event 4688 with command-line auditing)
Microsoft-Windows-Containers-CCG | Container credential guard events in server silos
Microsoft-Windows-Hyper-V-Compute | HCS / silo creation and teardown

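Enable Audit Process Creation (auditpol /set /subcategory:"Process Creation" /success:enable) to produce Event 4688; the full command line is recorded only when the "Include command line in process creation events" policy is also enabled. Enable Audit Object Access, and set a SACL on the objects of interest, to capture named job-object handle creation as Events 4656 / 4663.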

title: Suspicious Process Access Preceding Job Quota Assignment
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10                 # Sysmon ProcessAccess
    GrantedAccess:
      - '0x1fffff'              # PROCESS_ALL_ACCESS
      - '0x101'                 # PROCESS_SET_QUOTA (0x100) | PROCESS_TERMINATE (0x1), the rights job assignment needs
    TargetImage|endswith: '\lsass.exe'
  condition: selection
level: high
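
Sysmon reports GrantedAccess as a hex string, so substring matching can misfire. The Python sketch below decodes the mask into named process access rights and tests the two bits that AssignProcessToJobObject requires on the process handle (PROCESS_SET_QUOTA and PROCESS_TERMINATE). The helper names decode and can_assign_to_job are hypothetical; the bit values are the standard process access rights.

```python
# Standard process access rights, in ascending bit order.
PROCESS_RIGHTS = {
    0x0001: "PROCESS_TERMINATE",
    0x0002: "PROCESS_CREATE_THREAD",
    0x0008: "PROCESS_VM_OPERATION",
    0x0010: "PROCESS_VM_READ",
    0x0020: "PROCESS_VM_WRITE",
    0x0040: "PROCESS_DUP_HANDLE",
    0x0080: "PROCESS_CREATE_PROCESS",
    0x0100: "PROCESS_SET_QUOTA",
    0x0200: "PROCESS_SET_INFORMATION",
    0x0400: "PROCESS_QUERY_INFORMATION",
    0x0800: "PROCESS_SUSPEND_RESUME",
    0x1000: "PROCESS_QUERY_LIMITED_INFORMATION",
}
JOB_ASSIGN_RIGHTS = 0x0100 | 0x0001  # SET_QUOTA | TERMINATE


def decode(granted):
    """Turn a Sysmon GrantedAccess string such as '0x101' into right names."""
    mask = int(granted, 16)
    return [name for bit, name in PROCESS_RIGHTS.items() if mask & bit]


def can_assign_to_job(granted):
    """True when the handle carries both rights job assignment requires."""
    return int(granted, 16) & JOB_ASSIGN_RIGHTS == JOB_ASSIGN_RIGHTS


print(decode("0x101"))
print(can_assign_to_job("0x1fffff"), can_assign_to_job("0x1000"))
```

A bitwise test distinguishes 0x1000 (query-limited only) from 0x100, which a plain substring match on the string would not.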

Hardening guidance:

  • Apply JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE in every sandbox so process trees are reaped on handle loss.
  • Deny JOB_OBJECT_LIMIT_BREAKAWAY_OK unless explicitly required — it is a direct escape vector.
  • Combine job limits with Integrity Levels and AppContainer; jobs do not restrict file or registry access.
  • For hostile workloads prefer Hyper-V isolation — controls apply to both the VM and the inner job object.
  • Monitor wcifs.sys activity in server-silo environments; it enforces filesystem isolation and is a known escape surface.
  • Audit named job creation under \Sessions\<N>\BaseNamedObjects\ with WinObj and Sysmon object/pipe events as a proxy.

12. MITRE ATT&CK Mapping

Technique | MITRE ID | Detection
Native API | T1106 | ETW Kernel-Process job-assignment events; underpins all job/silo API use
Process Injection | T1055 | Sysmon Event ID 10; handle access to constrained process groups
Impair Defenses: Disable/Modify Tools | T1562.001 | UI-limit flags blocking security tooling; behavioural EDR telemetry
Escape to Host | T1611 | wcifs.sys and Hyper-V-Compute ETW; primary silo/container-escape mapping
Create or Modify System Process | T1543 | Sysmon Event ID 1; persistent processes wrapped in jobs
Execution Guardrails | T1480 | Behavioural analysis of sandbox-aware payloads keyed to job limits

Verify current technique versions and sub-techniques at https://attack.mitre.org before publication.


13. Tools for Job and Silo Analysis

Tool | Description | Link
Process Explorer | View per-process job membership and limits | sysinternals
Process Hacker | Inspect job tab, member processes, and quotas | processhacker.sourceforge.io
WinObj | Browse named job objects and silo namespace roots | sysinternals
WinDbg | !job, dt nt!_EJOB, _ESERVERSILO_GLOBALS inspection | microsoft.com
Process Monitor | Observe wcifs.sys and registry-hive container activity | sysinternals
ETW (logman / wevtutil) | Capture Kernel-Process and Hyper-V-Compute events | microsoft.com

Summary

  • Job objects group processes into a single managed unit with enforceable CPU, memory, network, and process-count limits, all anchored on the kernel EJOB object.
  • Limits are applied through SetInformationJobObject using JOBOBJECT_BASIC, EXTENDED, CPU_RATE, NET_RATE, and NOTIFICATION structures; nesting (Windows 8+) tightens to the most restrictive ancestor.
  • Silos extend jobs via the JobFlags silo bit, adding a private object-namespace root; server silos (_ESERVERSILO_GLOBALS) back Windows containers and share the host kernel.
  • Abuse spans sandbox-aware keying, BREAKAWAY_OK escapes, UI-limit tool blocking, and server-silo container escape (T1611).
  • Detect via Sysmon Event ID 1/10, Kernel-Process and Hyper-V-Compute ETW, Event 4688 auditing, and prefer Hyper-V isolation plus KILL_ON_JOB_CLOSE for containment.


Windows Scheduler Internals: Priority Levels, Quantum, and Thread Selection

Objective: Understand how the Windows kernel selects, preempts, and rotates threads — the 32-level priority model, dispatcher ready queues, quantum accounting, boost/decay logic, and the multiprocessor dispatch path — so defenders can baseline normal scheduling behavior and detect attacker manipulation of priority and affinity.


1. The Scheduling Contract: Threads, Not Processes

Windows schedules threads, not processes. Every executable unit of work is represented by a KTHREAD (the Thread Control Block embedded in ETHREAD.Tcb), and the scheduler operates exclusively against that structure. A process supplies the address space, the base priority class, the quantum reset value, and the affinity mask — but it never itself runs on a CPU.

Scheduling is preemptive and priority-based with round-robin at the highest priority. Two rules dominate:

  • The thread with the highest priority in the Ready state always wins.
  • If a running thread has a lower priority than a newly Ready thread, the running thread is immediately preempted at the next dispatch point.

Quantum only matters as a tiebreaker between threads of the same highest priority — it does not arbitrate across priority levels.


2. The 32-Level Priority Model and Priority Classes

Priorities range from 0 (zero-page thread only) to 31 (highest real-time). The space splits into two bands with very different semantics.

Range | Type | Description
0 | Zero-page thread | Reserved for the memory zero-page thread
1–15 | Dynamic (variable) | Normal user-mode threads; subject to boost/decay
16–31 | Real-time | Fixed priorities; no boost, no decay; drivers and RT tasks

Win32 exposes two functions to set scheduling parameters: SetPriorityClass on the process and SetThreadPriority on the thread. The two combine to produce the thread’s base priority in the kernel.

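SetPriorityClass constant | Class | Base priority range
IDLE_PRIORITY_CLASS | Idle | 2–6
BELOW_NORMAL_PRIORITY_CLASS | Below Normal | 4–8
NORMAL_PRIORITY_CLASS | Normal | 6–10
ABOVE_NORMAL_PRIORITY_CLASS | Above Normal | 8–12
HIGH_PRIORITY_CLASS | High | 11–15
REALTIME_PRIORITY_CLASS | Real-time | 16–31

The ranges cover THREAD_PRIORITY_LOWEST through THREAD_PRIORITY_HIGHEST. THREAD_PRIORITY_IDLE and THREAD_PRIORITY_TIME_CRITICAL saturate instead: they map to 1 and 15 in the dynamic classes, and to 16 and 31 in the real-time class.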

Crossing into the real-time band (>=16) requires the SeIncreaseBasePriorityPrivilege privilege. NT-native equivalents are NtSetInformationThread (information class ThreadBasePriority = 3) and ZwSetInformationProcess.

// Raise this process and its current thread to real-time scheduling.
// Without SeIncreaseBasePriorityPrivilege the class request falls back to HIGH_PRIORITY_CLASS.
SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS);
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
// Base priority now sits at 31 — preempts essentially everything in user mode.
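
How the process class and the thread's relative priority combine can be sketched as a lookup. The class base values and the saturation rule below follow Microsoft's published scheduling-priorities table; the function name base_priority and the string keys are hypothetical, chosen for illustration.

```python
# Base priority contributed by each process priority class.
CLASS_BASE = {
    "IDLE": 4, "BELOW_NORMAL": 6, "NORMAL": 8,
    "ABOVE_NORMAL": 10, "HIGH": 13, "REALTIME": 24,
}
THREAD_PRIORITY_IDLE = -15
THREAD_PRIORITY_TIME_CRITICAL = 15


def base_priority(priority_class, relative):
    """Combine a priority class with a SetThreadPriority value."""
    realtime = priority_class == "REALTIME"
    if relative == THREAD_PRIORITY_IDLE:
        return 16 if realtime else 1       # saturates at the band floor
    if relative == THREAD_PRIORITY_TIME_CRITICAL:
        return 31 if realtime else 15      # saturates at the band ceiling
    if not -2 <= relative <= 2:
        raise ValueError("expected LOWEST (-2) through HIGHEST (+2)")
    return CLASS_BASE[priority_class] + relative


print(base_priority("REALTIME", THREAD_PRIORITY_TIME_CRITICAL))
print(base_priority("NORMAL", 0))
```

The first call reproduces the C snippet above: real-time class plus time-critical yields 31.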

Hierarchy diagram of the Windows 32-level thread priority model split into real-time band (16–31) and dynamic band (1–15) with priority 0 reserved for the zero-page thread
Windows priorities split at level 16 — crossing into the real-time band requires SeIncreaseBasePriorityPrivilege, and those threads are never boosted or decayed.

3. Key Kernel Structures

Three structures carry the scheduler’s state: _KTHREAD per thread, _KPROCESS per process, and _KPRCB per logical processor.

_KTHREAD (Thread Control Block)

// Abridged view; field order and offsets are build-specific (verify with dt nt!_KTHREAD).
typedef struct _KTHREAD {
    DISPATCHER_HEADER  Header;          // +0x000 dispatcher object header
    // ...
    ULONGLONG          QuantumTarget;   // +0x020 quantum expiration target
    PVOID              InitialStack;    // +0x028 top of kernel stack
    // ...
    volatile UCHAR     State;           // Ready/Running/Waiting/...
    BOOLEAN            Preempted;
    UCHAR              DeferredProcessor;
    SCHAR              Priority;        // current (dynamic) priority
    ULONG              WaitTime;
    LIST_ENTRY         WaitListEntry;
    SINGLE_LIST_ENTRY  SwapListEntry;
    KSPIN_LOCK         ThreadLock;
} KTHREAD, *PKTHREAD;

The embedded DISPATCHER_HEADER is the same header found at the top of every waitable kernel object and is what ties the thread into wait queues.

_KPRCB (Kernel Processor Control Block)

Each logical processor has a KPCR; inside it sits a KPRCB carrying that CPU’s scheduling state.

typedef struct _KPRCB {
    // ...
    PKTHREAD    CurrentThread;       // executing thread on this CPU
    PKTHREAD    NextThread;          // pending preemption candidate
    PKTHREAD    IdleThread;          // per-CPU idle thread
    LIST_ENTRY  DispatcherReadyListHead[32]; // per-priority ready queues ("ReadyListHead" in the text)
    ULONG       ReadySummary;        // bitmask of non-empty ready queues
    // ...
} KPRCB, *PKPRCB;

_KPROCESS (Process Control Block)

Embedded as EPROCESS.Pcb, it provides the per-process scheduling defaults:

Field | Purpose
BasePriority | Process base priority; seeds new threads
QuantumReset | Quantum value assigned to new threads
ThreadListHead | Doubly-linked list of all _KTHREADs in the process
ReadyListHead | Ready-but-swapped-out threads

4. Dispatcher Ready Queues and ReadySummary

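The Dispatcher Ready Queue is the per-CPU array of 32 list heads in the KPRCB, one LIST_ENTRY per priority level. Public symbols name the field DispatcherReadyListHead; this article shortens it to ReadyListHead. Each non-empty entry is a FIFO of KTHREAD structures in the Ready state.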

To avoid scanning all 32 queues, the kernel maintains a 32-bit ReadySummary bitmask: bit n is set when ReadyListHead[n] is non-empty. The dispatcher then selects the next thread in O(1):

// Conceptual scheduler inner loop (pseudo-code; not a real symbol).
ULONG mask = Prcb->ReadySummary;
if (mask) {
    ULONG idx;
    _BitScanReverse(&idx, mask);              // highest set bit = top priority
    PKTHREAD next = CONTAINING_RECORD(
        RemoveHeadList(&Prcb->ReadyListHead[idx]),
        KTHREAD, WaitListEntry);
    if (IsListEmpty(&Prcb->ReadyListHead[idx]))
        Prcb->ReadySummary &= ~(1u << idx);
    return next;
}
return Prcb->IdleThread;
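
The pseudo-code above can be exercised as a runnable model. This Python sketch is a simulation under simplifying assumptions (one CPU, no affinity, no shared queues); the class name ReadyQueues and the thread labels are hypothetical. int.bit_length() plays the role of the highest-set-bit scan.

```python
from collections import deque


class ReadyQueues:
    """Toy model of 32 per-priority FIFOs plus a summary bitmask."""

    def __init__(self):
        self.queues = [deque() for _ in range(32)]
        self.summary = 0  # bit n set <=> queues[n] is non-empty

    def enqueue(self, thread, priority):
        self.queues[priority].append(thread)   # FIFO tail
        self.summary |= 1 << priority

    def select_next(self, idle="idle"):
        if not self.summary:
            return idle                         # nothing ready
        idx = self.summary.bit_length() - 1     # highest set bit
        thread = self.queues[idx].popleft()     # FIFO head
        if not self.queues[idx]:
            self.summary &= ~(1 << idx)         # queue drained
        return thread


rq = ReadyQueues()
rq.enqueue("a", 8)
rq.enqueue("b", 15)
rq.enqueue("c", 15)
print([rq.select_next() for _ in range(4)])
```

Threads b and c share priority 15 and come out in arrival order before a, which is the round-robin-within-a-level behaviour the section describes.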

5. Quantum Mechanics

A quantum is the slice of CPU time a thread is allowed to consume before the scheduler considers rotating it. WMI exposes two relevant properties: QuantumLength (clock ticks per quantum) and QuantumType (fixed vs. variable). Windows client SKUs default to variable quantum, giving the foreground process longer slices; server SKUs default to fixed long quantum to favor batch throughput.

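Internally, quantum is tracked in units of 3 per clock tick. With default settings a client thread receives 6 units (2 ticks), stretched to as much as 18 units (6 ticks) for foreground threads, while a server thread receives 36 units (12 ticks). KTHREAD.QuantumTarget holds the cycle target; on each clock tick, the kernel charges the running thread's consumed cycles against that target and, on expiry, transfers control to KiQuantumEnd().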

The foreground boost is governed by the registry value:

HKLM\SYSTEM\CurrentControlSet\Control\PriorityControl\Win32PrioritySeparation

The lowest six bits encode foreground-vs-background quantum behavior; bits 0–1 specifically choose the foreground boost level (0 none, 1 medium, 2 high). The kernel mirrors this into the global PsPrioritySeparation.
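
A decoder makes the bit layout concrete. This Python sketch assumes the commonly described layout (bits 4–5 select short or long quanta, bits 2–3 select variable or fixed, bits 0–1 select the foreground separation) and the quantum-unit table published in Windows internals references; treat both as assumptions to verify on your build. The function name decode_priority_separation is hypothetical.

```python
# Quantum units for (length, type), indexed by foreground separation 0..2.
QUANTUM_UNITS = {
    ("short", "variable"): (6, 12, 18),
    ("long", "variable"): (12, 24, 36),
    ("short", "fixed"): (18, 18, 18),
    ("long", "fixed"): (36, 36, 36),
}


def decode_priority_separation(value, server=False):
    """Decode the low six bits of Win32PrioritySeparation."""
    # A field value of 0 or 3 means "use the SKU default".
    length = {1: "long", 2: "short"}.get(
        (value >> 4) & 3, "long" if server else "short")
    qtype = {1: "variable", 2: "fixed"}.get(
        (value >> 2) & 3, "fixed" if server else "variable")
    separation = min(value & 3, 2)  # 3 is treated like 2
    units = QUANTUM_UNITS[(length, qtype)]
    return {
        "length": length,
        "type": qtype,
        "separation": separation,
        "background_units": units[0],
        "foreground_units": units[separation],
    }


print(decode_priority_separation(0x26))  # "Programs" setting
print(decode_priority_separation(0x18))  # "Background services" setting
```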

Internal scheduler routines you will see in symbols:

Function | Purpose
KiQuantumEnd | Invoked at clock interrupt when quantum expires
KiSelectNextThread | Selects next Ready thread for the current CPU
KiDeferredReadyThread | Places a thread in DeferredReady before final dispatch
KxQueueReadyThread | Inserts a thread into the per-CPU ready queue
KiReadyThread | Transitions a thread to the Ready state
KiSwapThread / KiSwapContext | Performs the actual context switch

6. Thread Selection: The Dispatch Path

A typical preemption follows this path:

  1. Clock interrupt fires on the local CPU.
  2. KiQuantumEnd() checks the thread's consumed cycles against KTHREAD.QuantumTarget; if the quantum is exhausted, the thread is moved out of Running.
  3. KiSelectNextThread() consults KPRCB.ReadySummary to find the highest non-empty queue.
  4. The chosen thread is removed from ReadyListHead[idx] and routed through KiDeferredReadyThread().
  5. KxQueueReadyThread() places the preempted thread back into ReadyListHead[oldPrio] (FIFO tail) so round-robin holds within its level.
  6. KiSwapThread() → KiSwapContext() saves outgoing register state, loads the incoming thread’s stack and registers, and returns into the new thread.

If a wake event makes a higher-priority thread Ready while another thread is Running, the dispatcher instead writes the candidate into KPRCB.NextThread, raises an IPI on the target CPU, and the preemption fires on return-from-interrupt — without waiting for quantum expiry.


Flow diagram showing the Windows thread dispatch sequence from clock interrupt through KiQuantumEnd, KiSelectNextThread scanning ReadySummary, dequeueing from ReadyListHead, and KiSwapContext to the new running thread
The O(1) dispatch path uses a highest-set-bit scan on KPRCB.ReadySummary to find the next ready thread without iterating all 32 queues.

7. Priority Boosts and Decay

Dynamic-band threads (1–15) do not stay at their base priority. The kernel temporarily boosts them in response to events and decays the boost as they consume CPU.

Event | Boost
I/O completion (keyboard / mouse) | +6
I/O completion (network) | +2
I/O completion (disk) | +1
Foreground window activation | controlled by PsPrioritySeparation
Wait satisfied on executive event | +1
Starvation avoidance (Balance Set Manager) | up to 15 for one quantum
Decay (CPU-bound thread at quantum end) | −1 toward base

The Balance Set Manager (KeBalanceSetManager) periodically scans ready queues and elevates threads that have been Ready but never running for ~4 seconds to priority 15 for a single quantum — preventing indefinite starvation by higher-priority work. Real-time threads (16–31) are never boosted or decayed; their priority is exactly what was set.
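
The boost and decay rules can be traced with a short simulation. This Python sketch assumes the simplified model stated in this section: a boost is added to the base priority, capped at 15, and each expired quantum removes one level until the base is reached. The function names boosted and decay_sequence are hypothetical.

```python
def boosted(base, boost):
    """Priority right after a boost; real-time threads are left alone."""
    if base >= 16:
        return base
    return min(15, base + boost)   # dynamic band tops out at 15


def decay_sequence(base, boost):
    """Priorities seen at each quantum end until the boost is gone."""
    priority = boosted(base, boost)
    seen = [priority]
    while priority > base:
        priority -= 1              # one level per expired quantum
        seen.append(priority)
    return seen


# Keyboard input (+6) wakes a normal-class thread with base priority 8.
print(decay_sequence(8, 6))
# The same boost on a high-class thread is clipped at 15.
print(decay_sequence(13, 6))
```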


8. Multiprocessor Scheduling, Affinity, and NUMA

Each CPU has its own ready queues, so dispatch decisions are mostly local. To preserve cache and NUMA locality, the scheduler picks an ideal processor per thread and prefers to dispatch on that CPU, falling back to other CPUs in the thread’s affinity mask when the ideal is busy.

// Pin a worker thread to CPU 2, with CPU 2 as its ideal processor.
DWORD_PTR mask = (DWORD_PTR)1 << 2;
SetThreadAffinityMask(hThread, mask);
SetThreadIdealProcessor(hThread, 2);

For >64 logical processors, threads belong to processor groups, set via SetThreadGroupAffinity. Kernel-mode equivalents are KeSetSystemAffinityThread and KeSetIdealProcessorThread. Misconfigured affinity is a real performance and detection hazard — a thread pinned off-node walks remote memory and pollutes another CPU’s cache.


9. Thread States: The Full State Machine

The KTHREAD_STATE enum tracks every transition. The values you will see in KTHREAD.State:

State | Meaning
Initialized | Thread structure created, not yet schedulable
Ready | Schedulable; sitting on ReadyListHead[priority]
Standby | Selected as KPRCB.NextThread, about to run
Running | Currently executing on a CPU
Waiting | Blocked on a dispatcher object
Transition | Wait satisfied, but kernel stack is paged out
DeferredReady | Will be made Ready on a specific CPU
Terminated | Final state before structure teardown

A normal cycle looks like Initialized → Ready → Standby → Running → Waiting → Ready …. KPRCB.NextThread is non-NULL exactly while a target CPU has a Standby thread queued.
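
When triaging thread dumps it helps to check whether an observed sequence of states is plausible. The Python sketch below encodes the states with the numeric values commonly shown for KTHREAD_STATE in public symbols, plus a deliberately simplified transition map; both the map (ALLOWED) and the helper is_valid_path are assumptions for illustration, not a statement of every path the kernel permits.

```python
from enum import IntEnum


class KThreadState(IntEnum):
    INITIALIZED = 0
    READY = 1
    RUNNING = 2
    STANDBY = 3
    TERMINATED = 4
    WAITING = 5
    TRANSITION = 6
    DEFERRED_READY = 7


S = KThreadState
# Simplified transition map (an assumption; verify against your build).
ALLOWED = {
    S.INITIALIZED: {S.READY, S.DEFERRED_READY},
    S.DEFERRED_READY: {S.READY, S.STANDBY},
    S.READY: {S.STANDBY},
    S.STANDBY: {S.RUNNING, S.READY},
    S.RUNNING: {S.READY, S.WAITING, S.TERMINATED},
    S.WAITING: {S.READY, S.DEFERRED_READY, S.TRANSITION},
    S.TRANSITION: {S.READY, S.DEFERRED_READY},
    S.TERMINATED: set(),
}


def is_valid_path(states):
    """True when every consecutive pair is an allowed transition."""
    return all(b in ALLOWED[a] for a, b in zip(states, states[1:]))


cycle = [S.INITIALIZED, S.READY, S.STANDBY, S.RUNNING, S.WAITING, S.READY]
print(is_valid_path(cycle))
print(is_valid_path([S.WAITING, S.RUNNING]))  # skips Ready and Standby
```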


Graph diagram of the Windows KTHREAD state machine showing transitions between Initialized, Ready, Standby, Running, Waiting, and Terminated states
A thread passes through Standby — held in KPRCB.NextThread — immediately before swapping onto the CPU, making Standby a precise indicator of imminent context switch.

10. Observing the Scheduler with WinDbg and ETW

Live kernel inspection in WinDbg:

0: kd> !pcr                          ; current processor control region
0: kd> !prcb                         ; current processor control block
0: kd> dt nt!_KPRCB CurrentThread NextThread ReadySummary @$prcb
0: kd> dt nt!_KTHREAD Priority QuantumTarget State Preempted @$thread
0: kd> !ready                        ; all ready threads, sorted by priority
0: kd> !thread <addr> 1f             ; full thread state including stack

The ReadyListHead walk per-priority:

0: kd> dx -r1 ((nt!_KPRCB*)@$prcb)->DispatcherReadyListHead
0: kd> dx -r2 Debugger.Utility.Collections.FromListEntry(((nt!_KPRCB*)@$prcb)->DispatcherReadyListHead[15], "nt!_KTHREAD", "WaitListEntry")

For live system-wide capture, use ETW:

xperf -on PROC_THREAD+LOADER+CSWITCH+DISPATCHER -stackwalk CSwitch
xperf -d sched.etl

The primary providers carrying scheduler telemetry:

Provider | GUID | Key events
Microsoft-Windows-Kernel-Process | {22FB2CD6-0E7B-422B-A0C7-2FAD1FD0E716} | Process/thread start and stop, image load, priority changes
NT Kernel Logger, Thread event class | {3D6FA8D1-FE05-11D0-9DDA-00C04FD7BA7C} | CSwitch (opcode 36), ReadyThread (opcode 50), thread create/terminate
NT Kernel Logger (session) | {9E814AAD-3204-11D2-9A82-006008A86939} | Enables the CSWITCH and DISPATCHER groups

A user-mode helper to enumerate per-thread priority without OpenProcess:

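import ctypes
from ctypes import wintypes

ntdll = ctypes.WinDLL("ntdll")
SystemProcessInformation = 5            # walks _SYSTEM_PROCESS_INFORMATION entries
STATUS_INFO_LENGTH_MISMATCH = 0xC0000004

size = 1024 * 1024
while True:
    buf = ctypes.create_string_buffer(size)
    ret_len = wintypes.ULONG()
    status = ntdll.NtQuerySystemInformation(
        SystemProcessInformation, buf, size, ctypes.byref(ret_len)) & 0xFFFFFFFF
    if status != STATUS_INFO_LENGTH_MISMATCH:
        break
    size = ret_len.value + 0x10000      # grow and retry; the process list may change
if status != 0:
    raise OSError(f"NtQuerySystemInformation failed: 0x{status:08X}")
# Each entry trails an array of SYSTEM_THREAD_INFORMATION with Priority/BasePriority.
# parse _SYSTEM_PROCESS_INFORMATION + _SYSTEM_THREAD_INFORMATION here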

11. Common Attacker Techniques

Scheduler manipulation is rarely a standalone objective — it is a force multiplier for injection, evasion, and defense impairment.

Technique | Description
Thread execution hijacking | OpenThread → SuspendThread → VirtualAllocEx + WriteProcessMemory → SetThreadContext → ResumeThread. Post-resume, attacker controls priority and CPU affinity of the hijacked thread.
Real-time priority abuse | Set malicious thread to THREAD_PRIORITY_TIME_CRITICAL under REALTIME_PRIORITY_CLASS (priority 31) to dominate the CPU and starve EDR scanners. Requires SeIncreaseBasePriorityPrivilege.
EDR/AV starvation | Open handles to defender process threads with THREAD_SET_INFORMATION and downgrade them via SetThreadPriority(hThread, THREAD_PRIORITY_IDLE) to delay real-time detection.
Affinity pinning for evasion | Pin malicious threads to a CPU not covered by an EDR’s per-CPU sampling profiler, or off-NUMA-node, to skew profilers and ETW stack walks.
Win32PrioritySeparation tampering | Modify the registry value to alter foreground boost behavior, hurting interactive defensive tooling.
Quantum throttling via Job Objects | Apply JobObjectCpuRateControlInformation with JOB_OBJECT_CPU_RATE_CONTROL_ENABLE to constrain a defender process’s CPU budget.

Conceptual illustration of attacker thread priority manipulation showing a high-priority red thread overwhelming lower-priority blue threads on a CPU grid
Elevating a malicious thread to real-time priority can starve EDR sensor threads, delay telemetry, and create execution windows for in-memory payloads.

12. Defensive Strategies & Detection

Scheduler-level abuse is observable through ETW context-switch streams, sensitive-privilege auditing, registry auditing, and process-access telemetry. Sysmon alone is insufficient — pair it with kernel ETW.

Sysmon Event ID | Name | Relevance
1 | Process Create | Captures image, command line, and parent lineage; priority class is not recorded
8 | CreateRemoteThread | Cross-process thread creation; often precedes priority manipulation
10 | ProcessAccess | OpenProcess against the target; Sysmon does not log OpenThread, so THREAD_SET_INFORMATION requests need object-access auditing or EDR telemetry
13 | RegistryValueSet | Modification of Win32PrioritySeparation and other PriorityControl values

Critical Windows audit events:

  • 4673: A privileged service was called. Logged when SeIncreaseBasePriorityPrivilege (required for real-time priority) is exercised, provided Audit Non Sensitive Privilege Use is enabled.
  • 4656 / 4663: Handle/Object Access. Catches handles opened to thread objects with THREAD_SET_INFORMATION when a SACL is set on the object.
  • 4657: Registry value modified. Catches Win32PrioritySeparation changes.
  • 4688: Process creation (with command-line auditing enabled).

Conceptual Sigma rule for unexpected real-time priority use:

title: Sensitive Privilege Use - SeIncreaseBasePriorityPrivilege from Non-System Process
logsource:
  product: windows
  service: security
detection:
  selection:
    EventID: 4673
    PrivilegeList|contains: 'SeIncreaseBasePriorityPrivilege'
  filter_system:
    SubjectUserSid:
      - 'S-1-5-18'   # LocalSystem
      - 'S-1-5-19'   # LocalService
      - 'S-1-5-20'   # NetworkService
  condition: selection and not filter_system
level: high
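
To test the rule's logic against exported events before deploying it, the selection and filter can be mirrored in a few lines of Python. This is a sketch over plain dictionaries; the function name matches and the sample events are hypothetical, while the field names are the ones the rule itself uses.

```python
SYSTEM_SIDS = {
    "S-1-5-18",  # LocalSystem
    "S-1-5-19",  # LocalService
    "S-1-5-20",  # NetworkService
}


def matches(event):
    """Mirror of the rule: selection and not filter_system."""
    selection = (
        event.get("EventID") == 4673
        and "SeIncreaseBasePriorityPrivilege" in event.get("PrivilegeList", "")
    )
    filter_system = event.get("SubjectUserSid") in SYSTEM_SIDS
    return selection and not filter_system


# Hypothetical events for illustration only.
user_event = {
    "EventID": 4673,
    "PrivilegeList": "SeIncreaseBasePriorityPrivilege",
    "SubjectUserSid": "S-1-5-21-1111-2222-3333-1001",
}
system_event = dict(user_event, SubjectUserSid="S-1-5-18")
print(matches(user_event), matches(system_event))
```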

Hardening checklist:

  1. Restrict SeIncreaseBasePriorityPrivilege via Group Policy → User Rights Assignment to only the accounts that require it.
  2. Audit Win32PrioritySeparation with Sysmon Event ID 13 or registry SACL → Event ID 4657.
  3. Baseline CSwitch priority distributions via ETW; alert on sustained user-mode threads scheduled at priority ≥ 16 outside an allowlist.
  4. Deploy EDR that registers PsSetCreateThreadNotifyRoutine and ObRegisterCallbacks to observe thread creation, handle stripping, and priority changes in kernel.
  5. Enclose untrusted code in Job Objects with JobObjectCpuRateControlInformation and basic UI restrictions to prevent it from starving other processes.

13. Tools for Scheduler Analysis

Tool | Description | Link
WinDbg (kernel) | !ready, !thread, !pcr, !prcb, dt nt!_KTHREAD/_KPRCB for live scheduler inspection | learn.microsoft.com
Windows Performance Recorder / xperf | Captures CSwitch, ReadyThread, DISPATCHER ETW events with stack walks | learn.microsoft.com
Windows Performance Analyzer | Visualizes CPU usage, context switches, and per-thread priority timelines | learn.microsoft.com
Process Hacker / System Informer | Live per-thread state, base priority, dynamic priority, ideal CPU, affinity mask | systeminformer.sourceforge.io
Process Explorer | Per-thread CPU, priority class, kernel/user stacks | sysinternals.com
Process Monitor | Captures Process Create, registry writes (Win32PrioritySeparation) | sysinternals.com
Sysmon | Events 1, 8, 10, 13 for thread creation, cross-process access, registry edits | sysinternals.com
Volatility 3 | Offline thread enumeration (windows.threads) and priority analysis from memory dumps | volatilityfoundation.org

14. MITRE ATT&CK Mapping

Technique | MITRE ID | Detection
Process Injection | T1055 | Sysmon 10 (ProcessAccess), ETW thread create with foreign-process parentage
Thread Execution Hijacking | T1055.003 | Sysmon 10 on the target process; thread-handle requests for THREAD_SET_INFORMATION / THREAD_SET_CONTEXT and SuspendThread/ResumeThread pairs in EDR telemetry
Scheduled Task / Job | T1053 | Audit 4698 for task creation; monitor Job Object CPU-rate limits applied to defensive processes
Impair Defenses: Disable or Modify Tools | T1562.001 | Sysmon 10 against security-relevant processes such as lsass.exe or MsMpEng.exe; priority drops via Microsoft-Windows-Kernel-Process priority-change events

Note: ATT&CK does not currently track “Thread Priority Manipulation” as a standalone technique. Treat priority abuse as a sub-mechanism of T1055.003 and T1562.001.


15. Summary

  • Windows is a preemptive, priority-based thread scheduler with 32 levels and per-CPU ready queues — priority always wins, quantum only rotates equal-priority threads.
  • The dispatcher uses KPRCB.ReadySummary plus ReadyListHead[32] to pick the next thread in O(1) via highest-set-bit scan.
  • Quantum is tracked in 3-unit-per-tick increments on KTHREAD.QuantumTarget, with foreground boost governed by Win32PrioritySeparation / PsPrioritySeparation.
  • Dynamic threads (1–15) are subject to I/O, foreground, and starvation boosts plus decay; real-time threads (16–31) are not.
  • Attackers abuse the scheduler via thread hijacking, real-time priority escalation, EDR starvation, and affinity pinning — detect via ETW CSwitch, Sysmon 8/10/13, and Event ID 4673 for SeIncreaseBasePriorityPrivilege.


APCs: Asynchronous Procedure Calls and Thread Hijacking Surface

Objective: Understand the Windows Asynchronous Procedure Call mechanism from the kernel up — the KAPC / KAPC_STATE structures, the dispatch path through KiInsertQueueApc and KiDeliverApc, the alertable-wait requirement, and the three abuse variants (classic, early-bird, special user APC) used for thread hijacking and process injection — and detect them with Sysmon, ETW-TI, and audit policy.


1. APC Fundamentals — What the OS Actually Uses APCs For

An Asynchronous Procedure Call is a function that executes asynchronously in the context of a specific thread. When the kernel queues an APC, it raises a software interrupt and arranges for the routine to run the next time that thread is dispatched. Every thread has its own APC queue — APCs are inherently thread-targeted, which is exactly why offensive tooling loves them.

The OS itself relies on APCs for normal work:

  • I/O completion: ReadFileEx, WriteFileEx, and SetWaitableTimer deliver their completion callback via a user-mode APC queued back to the issuing thread.
  • File-system filter callbacks: normal kernel APCs are widely used by file systems and minifilters.
  • Wait abortion: queuing a user APC against a thread in an alertable wait satisfies the wait with STATUS_USER_APC.

Understanding APCs means understanding three things in sequence: who can queue them, when they fire, and what the thread looks like at the moment they fire.


2. The Three Flavours of APCs

APCs differ by IRQL and by who is allowed to queue them. The kernel maintains distinct semantics for each.

Type | IRQL | Notes
Special Kernel APC | APC_LEVEL | Runs in kernel mode at IRQL APC_LEVEL; preempts user-mode code and kernel-mode code executing at PASSIVE_LEVEL. Used by the OS for operations such as I/O request completion.
Normal Kernel APC | PASSIVE_LEVEL | Runs in kernel mode at PASSIVE_LEVEL; preempts all user-mode code, including user APCs. Generally used by file systems and file-system filter drivers.
User-mode APC | PASSIVE_LEVEL | Generated by an application. The target thread must be in an alertable state for a user-mode APC to run.

Unlike deferred procedure calls (DPCs), which run in arbitrary thread context, an APC always executes inside a specific thread’s context — that property is what makes APCs both useful for I/O completion and dangerous as an injection primitive.


Hierarchy diagram showing the three APC types: Kernel-Mode, User-Mode, and Special User APC, with their respective queuing APIs and alertable-wait requirements
The three APC flavours differ by privilege level, delivery trigger, and the Win32/native APIs used to queue them.

3. Kernel Structures: KAPC, KAPC_STATE, KTHREAD

A queued APC is represented in the kernel by a KAPC object. The thread tracks its pending APCs via a KAPC_STATE embedded in KTHREAD.

// Conceptual layout — field names are illustrative; confirm against the
// target Windows build with `dt nt!_KAPC` / `dt nt!_KAPC_STATE` in WinDbg.

typedef struct _KAPC {
    UCHAR              Type;
    UCHAR              SpareByte0;
    UCHAR              Size;
    UCHAR              SpareByte1;
    ULONG              SpareLong0;
    struct _KTHREAD   *Thread;
    LIST_ENTRY         ApcListEntry;
    PKKERNEL_ROUTINE   KernelRoutine;
    PKRUNDOWN_ROUTINE  RundownRoutine;
    PKNORMAL_ROUTINE   NormalRoutine;
    PVOID              NormalContext;
    PVOID              SystemArgument1;
    PVOID              SystemArgument2;
    CCHAR              ApcStateIndex;
    KPROCESSOR_MODE    ApcMode;
    BOOLEAN            Inserted;
} KAPC, *PKAPC;

typedef struct _KAPC_STATE {
    LIST_ENTRY         ApcListHead[2];   // [0] = kernel APCs, [1] = user APCs
    struct _KPROCESS  *Process;
    BOOLEAN            KernelApcInProgress;
    BOOLEAN            KernelApcPending;
    BOOLEAN            UserApcPending;
    // SpecialUserApcPending was added later for RS5+ Special User APCs.
} KAPC_STATE, *PKAPC_STATE;

Key fields the dispatcher and attackers both care about:

  • KAPC.NormalRoutine — the function the thread will eventually execute.
  • KAPC.NormalContext, SystemArgument1, SystemArgument2 — arguments passed to NormalRoutine.
  • KAPC.ApcMode (KernelMode vs UserMode) controls which queue and which delivery path the APC takes.
  • KAPC_STATE.ApcListHead[2] — two doubly-linked lists; index 0 holds kernel-mode APCs, index 1 holds user-mode APCs.
  • KAPC_STATE.UserApcPending — set to TRUE when a user APC is queued and the thread is in an alertable wait; this is the signal that breaks the wait with STATUS_USER_APC.

4. The Alertable Wait Requirement

A user-mode APC does not fire whenever the kernel wants — it fires only when the target thread is willing to be interrupted. A thread enters an alertable state by calling one of:

  • SleepEx()
  • SignalObjectAndWait()
  • MsgWaitForMultipleObjectsEx()
  • WaitForMultipleObjectsEx()
  • WaitForSingleObjectEx()

with the bAlertable parameter set to TRUE (MsgWaitForMultipleObjectsEx takes the MWMO_ALERTABLE flag instead). Additionally, ReadFileEx, WriteFileEx, and SetWaitableTimer use APCs as their completion-notification mechanism, so threads driving overlapped I/O routinely sit in alertable waits.

This alertable-state requirement is the single most important property to understand offensively and defensively:

  • Offensively, it dictates target selection. Long-lived service threads in svchost.exe or explorer.exe that pump I/O are reliable targets; threads that never enter an alertable wait will never run a queued user APC.
  • Defensively, it explains why the classic injection works against some processes and not others — and why attackers eventually moved to Special User APCs to remove the dependency entirely (§9).

5. Win32 → Native → Kernel Call Chain

Queuing a user APC traverses three layers.

API / Symbol | Layer | Description
QueueUserAPC | Win32 (kernel32.dll) | Queues a user-mode APC to a target thread.
NtQueueApcThread | NT native (ntdll.dll) | Syscall used internally by QueueUserAPC to deliver the APC.
NtQueueApcThreadEx | NT native | Extended form; RS5 introduced Special User APCs queued by passing 1 as the reserve handle.
NtQueueApcThreadEx2 | NT native | Newer variant exposing both UserApcFlags and MemoryReserveHandle.
QueueUserAPC2 | kernelbase.dll | Wrapper that exposes Special User APCs to user code.
KeInsertQueueApc | Kernel | Attaches the initialized KAPC to the target thread’s queue.
KiDeliverApc | Kernel | Dispatches pending APCs at the kernel→user transition.
ntdll!RtlDispatchAPC | ntdll | Trampoline in user mode that calls the caller-supplied APCProc.

An important internal detail: when you call QueueUserAPC(pfn, hThread, dwData), the routine actually handed to NtQueueApcThread is not your pfn. It is ntdll!RtlDispatchAPC, and your pfn is passed along as a parameter. This is why call-stack-aware EDRs frequently see RtlDispatchAPC as the immediate caller of the suspicious user-mode routine.

The dispatch sequence for a user-mode APC:

  1. Caller obtains a thread handle with THREAD_SET_CONTEXT access.
  2. QueueUserAPC → NtQueueApcThread → kernel enters KiInsertQueueApc.
  3. KiInsertQueueApc checks whether the target is in an alertable wait with WaitMode == UserMode. If yes, it sets UserApcPending = TRUE and completes the wait with STATUS_USER_APC.
  4. On the kernel→user transition, KiDeliverApc redirects execution to ntdll!RtlDispatchAPC, which invokes the original APCProc.
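
The decision in step 3 can be modelled in a few lines of portable C. This is a sketch under stated assumptions, not the kernel's logic: ModelThread, ModelInsertResult, and model_queue_user_apc are hypothetical names, and the real KiInsertQueueApc also handles kernel APCs, special user APCs, and attached-process state that the model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-thread state in the text. */
typedef struct ModelThread {
    bool waiting;           /* thread is blocked in a wait            */
    bool alertable;         /* the wait was entered as alertable      */
    bool wait_user_mode;    /* models WaitMode == UserMode            */
    bool user_apc_pending;  /* models KAPC_STATE.UserApcPending       */
    int  queued_user_apcs;  /* length of the user-mode APC list       */
} ModelThread;

typedef enum { APC_QUEUED_ONLY, APC_WAIT_BROKEN } ModelInsertResult;

/* Queue one user APC and report whether the wait is completed early
 * (the STATUS_USER_APC case) or the APC merely sits in the queue. */
ModelInsertResult model_queue_user_apc(ModelThread *t)
{
    assert(t != NULL);
    t->queued_user_apcs++;
    if (t->waiting && t->alertable && t->wait_user_mode) {
        t->user_apc_pending = true;
        t->waiting = false;      /* wait satisfied with STATUS_USER_APC */
        return APC_WAIT_BROKEN;
    }
    return APC_QUEUED_ONLY;      /* dormant until an alertable wait occurs */
}
```

Calling the function on a thread that is running, or waiting non-alertably, leaves the APC queued and user_apc_pending clear. That is the same reason a classic user APC can sit undelivered indefinitely.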

Flow diagram of the APC dispatch chain from QueueUserAPC through NtQueueApcThread, KiInsertQueueApc, KiDeliverApc, RtlDispatchAPC, to the final APCProc callback
Every layer of the APC dispatch chain is observable; EDRs see RtlDispatchAPC as the immediate caller of the injected routine.

6. Inspecting APC State in WinDbg

Read-only kernel introspection lets defenders and learners watch the structures the dispatcher mutates.

0: kd> !process 0 0 lsass.exe
0: kd> .process /r /p <EPROCESS>
0: kd> !thread <ETHREAD>

0: kd> dt nt!_KTHREAD <addr> ApcState
0: kd> dt nt!_KAPC_STATE <addr+offset>
   +0x000 ApcListHead       : [2] _LIST_ENTRY
   +0x020 Process           : Ptr64 _KPROCESS
   +0x028 KernelApcInProgress : UChar
   +0x029 KernelApcPending  : UChar
   +0x02a UserApcPending    : UChar

0: kd> !list "-t nt!_KAPC.ApcListEntry.Flink -e -x \"dt nt!_KAPC @$extret\" <ApcListHead[1]>"

Walking ApcListHead[1] for any thread reveals every pending user APC — its NormalRoutine, NormalContext, and ApcMode. On a healthy thread you typically see nothing; finding NormalRoutine pointing into a private RX region inside a system process is a classic incident-response artifact.
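
The list walk works because each KAPC embeds its own link (ApcListEntry), so a pointer to the link can be converted back to a pointer to the enclosing record. A minimal sketch of that idea in portable C follows; ModelListEntry, ModelApc, and the model_* helpers are hypothetical names, and the record is cut down to the fields needed for the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Doubly-linked list node shaped like LIST_ENTRY. */
typedef struct ModelListEntry {
    struct ModelListEntry *Flink;
    struct ModelListEntry *Blink;
} ModelListEntry;

/* Hypothetical, cut-down APC record. */
typedef struct ModelApc {
    int            Type;
    ModelListEntry ApcListEntry;   /* embedded link, as in KAPC */
    void          *NormalRoutine;
} ModelApc;

/* Same idea as the Windows CONTAINING_RECORD macro: step back from the
 * embedded link to the start of the enclosing structure. */
#define MODEL_CONTAINING_RECORD(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

void model_init_list(ModelListEntry *head)
{
    head->Flink = head->Blink = head;   /* empty list points at itself */
}

void model_insert_tail(ModelListEntry *head, ModelListEntry *e)
{
    e->Flink = head;
    e->Blink = head->Blink;
    head->Blink->Flink = e;
    head->Blink = e;
}

/* Walk the list and copy out each NormalRoutine; returns how many were copied. */
size_t model_collect_routines(ModelListEntry *head, void **out, size_t cap)
{
    size_t n = 0;
    assert(head != NULL && out != NULL);
    for (ModelListEntry *e = head->Flink; e != head && n < cap; e = e->Flink) {
        ModelApc *apc = MODEL_CONTAINING_RECORD(e, ModelApc, ApcListEntry);
        out[n++] = apc->NormalRoutine;
    }
    return n;
}
```

An empty list is one whose head points back at itself, which is the "nothing pending" state a healthy thread usually shows.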


7. Classic APC Injection

The textbook variant. Every API call below is observable; the technique relies entirely on existing, documented APIs.

// Educational illustration of the API call chain only.
// No payload is included; `payload` is a placeholder used by defenders to
// recognize the pattern. Authorized testing only.

#include <windows.h>
#include <tlhelp32.h>

BOOL InjectViaAPC(DWORD pid, DWORD tid, const BYTE *payload, SIZE_T cb) {
    HANDLE hProc = OpenProcess(
        PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_QUERY_INFORMATION,
        FALSE, pid);
    if (!hProc) return FALSE;

    HANDLE hThread = OpenThread(THREAD_SET_CONTEXT, FALSE, tid);
    if (!hThread) { CloseHandle(hProc); return FALSE; }

    LPVOID remote = VirtualAllocEx(hProc, NULL, cb,
                                   MEM_COMMIT | MEM_RESERVE,
                                   PAGE_EXECUTE_READWRITE);
    WriteProcessMemory(hProc, remote, payload, cb, NULL);

    // QueueUserAPC schedules execution; it fires only when the target
    // thread enters an alertable wait.
    QueueUserAPC((PAPCFUNC)remote, hThread, 0);

    CloseHandle(hThread);
    CloseHandle(hProc);
    return TRUE;
}

Trigger conditions:

  • The target thread (tid) must enter an alertable wait. In long-lived service hosts this happens routinely.
  • The handle to the thread must carry THREAD_SET_CONTEXT. This is the most reliable single indicator: Sysmon EID 10 with a GrantedAccess mask covering THREAD_SET_CONTEXT against a high-value target image is the canonical detection (§12).

Notably, no new thread is created in the victim process: CreateRemoteThread is not called. This is exactly why APC injection evades Sysmon EID 8.


8. Early-Bird APC Injection

Classic injection has one weakness: you cannot predict when the victim thread will next become alertable. Early-bird removes the guesswork by injecting into a process you create yourself in a suspended state, then queuing the APC against the main thread before it has executed a single instruction.

// Educational pseudocode — illustrates API sequence, not payload.

STARTUPINFOA si = { sizeof(si) };
PROCESS_INFORMATION pi = { 0 };

CreateProcessA(NULL, "C:\\Windows\\System32\\notepad.exe", NULL, NULL,
               FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi);

LPVOID remote = VirtualAllocEx(pi.hProcess, NULL, cb,
                               MEM_COMMIT | MEM_RESERVE,
                               PAGE_EXECUTE_READWRITE);
WriteProcessMemory(pi.hProcess, remote, payload, cb, NULL);

QueueUserAPC((PAPCFUNC)remote, pi.hThread, 0);

// Thread services its APC queue as part of initialization, *before*
// running the original entry point.
ResumeThread(pi.hThread);

Why it works: when a newly created thread starts, it enters user mode through ntdll!LdrInitializeThunk, which checks for pending user APCs during loader initialization (the effect is the same as an alertable wait). Any user APC queued before ResumeThread is therefore delivered in that early window, before the legitimate entry point runs.

This variant straddles two ATT&CK sub-techniques: it is APC injection (T1055.004) but it also resembles Thread Execution Hijacking (T1055.003) because the suspended-thread-then-redirect pattern is structurally the same primitive.


Flow diagram of the Early-Bird APC injection sequence showing CreateProcess in suspended state, memory staging, APC queuing, ResumeThread, and payload execution before the legitimate entry point
Early-Bird queues the APC before the main thread has executed a single instruction, exploiting the alertable waits inside LdrInitializeThunk.

9. Special User APCs (RS5+): Bypassing the Alertable Requirement

Starting with Windows 10 RS5, the kernel introduced Special User APCs. The key behavioural change: these APCs are delivered with Mode == KernelMode to force a thread signal. The thread is interrupted mid-execution to run the special APC — the alertable-state requirement is gone.

They are queued via NtQueueApcThreadEx (passing 1 as the reserve handle) or through NtQueueApcThreadEx2, which exposes a flags field. kernelbase!QueueUserAPC2 is the documented Win32 wrapper.

// Conceptual signatures — confirm flag values and syscall semantics
// against the target SDK / Windows build before relying on them.

typedef NTSTATUS (NTAPI *pNtQueueApcThreadEx2)(
    HANDLE         ThreadHandle,
    HANDLE         UserApcReserveHandle,   // optional reserve object
    ULONG          ApcFlags,               // e.g. QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC
    PVOID          ApcRoutine,
    PVOID          SystemArgument1,
    PVOID          SystemArgument2,
    PVOID          SystemArgument3);

// Pseudocode dispatch — `Special User APC` interrupts a running thread
// without requiring it to be in SleepEx / WaitForSingleObjectEx.
pNtQueueApcThreadEx2 fn = (pNtQueueApcThreadEx2)
    GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "NtQueueApcThreadEx2");

fn(hThread,
   NULL,
   QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC,   // forces in-execution delivery
   remote_routine,
   NULL, NULL, NULL);

Internally the kernel sets SpecialUserApcPending (added to KAPC_STATE for this purpose) and arranges delivery at the next return-to-user-mode opportunity regardless of wait state. This is a meaningful escalation of the primitive — it converts APC injection from “wait until the thread cooperates” to “interrupt the thread now.”


10. Real-World Threat Actor Usage

APC injection is documented at the technique level rather than the family level here; defenders should treat it as a primitive that recurs across many tradecraft variants:

  • DOUBLEPULSAR used kernel-mode APC injection to redirect user-mode threads from a kernel implant.
  • Multiple commodity and APT families catalogued under MITRE T1055.004 employ classic user-APC injection against svchost.exe, explorer.exe, and other long-running hosts.
  • The AtomBombing family of injection variants combines GlobalAddAtom/NtQueueApcThread to stage code through atom tables, then dispatch via APC.
  • Recent research (Check Point’s Thread Name-Calling) chains thread-name primitives with APC dispatch to evade EDR userland hooks.

11. Common Attacker Techniques

Technique | Description
Classic APC Injection | OpenProcess → OpenThread(THREAD_SET_CONTEXT) → VirtualAllocEx → WriteProcessMemory → QueueUserAPC. Fires when the target thread next enters an alertable wait.
Early-Bird APC | CreateProcess(CREATE_SUSPENDED) → write payload → QueueUserAPC → ResumeThread. APC fires during loader init, before the entry point.
Special User APC | NtQueueApcThreadEx / NtQueueApcThreadEx2 with QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC: interrupts the thread mid-execution; no alertable wait required.
Kernel APC injection from a driver | Malicious driver calls KeInsertQueueApc directly against a user thread (DOUBLEPULSAR class). Mitigated by HVCI / driver signing.
Atom-table staged APC (AtomBombing) | Payload bytes shuttled into target via atom tables, then dispatched with NtQueueApcThread. Evades naive memory-write detections.
Self-APC for unhooking / staging | Queue an APC to the current thread + SleepEx(0, TRUE) to execute code outside hooked call paths.

12. Defensive Strategies & Detection

APC injection is deliberately quiet — it does not create a remote thread and so does not emit Sysmon EID 8. Detection therefore pivots on the handle-acquisition and memory-staging stages, plus dedicated ETW.

12.1 Sysmon

Event ID | Name | Why It Matters Here
EID 10 | ProcessAccess | Captures the OpenThread/OpenProcess step. GrantedAccess masks covering THREAD_SET_CONTEXT (0x0010, or 0x0018 together with THREAD_GET_CONTEXT) and PROCESS_VM_WRITE (0x0020) against high-value images are the strongest signal.
EID 8 | CreateRemoteThread | Will not fire for pure APC injection, but does fire for hybrid variants and is useful as a negative signal.
EID 1 | ProcessCreate | Surfaces the parent/child pairs typical of Early-Bird; the CREATE_SUSPENDED flag itself is not logged, so combine with short process lifetimes.
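
Access masks like the ones in the table above are plain bit fields, so triage code can test them directly. The sketch below is illustrative only: the helper names are hypothetical, and the constants repeat the values from the Windows SDK header winnt.h so the example builds without windows.h. The same bit value means different things on different object types (0x0010 is PROCESS_VM_READ on a process handle and THREAD_SET_CONTEXT on a thread handle), so a mask is only meaningful once the object type is known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Access-right bits as defined in winnt.h, repeated here for portability. */
#define PROCESS_VM_OPERATION 0x0008u
#define PROCESS_VM_READ      0x0010u
#define PROCESS_VM_WRITE     0x0020u
#define THREAD_GET_CONTEXT   0x0008u
#define THREAD_SET_CONTEXT   0x0010u

/* Parse a hex string such as "0x1fffff" into a mask. */
uint32_t parse_granted_access(const char *text)
{
    assert(text != NULL);
    return (uint32_t)strtoul(text, NULL, 16);
}

/* Process handle: WriteProcessMemory needs both VM_WRITE and VM_OPERATION. */
bool process_mask_allows_remote_write(uint32_t mask)
{
    const uint32_t need = PROCESS_VM_WRITE | PROCESS_VM_OPERATION;
    return (mask & need) == need;
}

/* Thread handle: queuing a user APC needs THREAD_SET_CONTEXT. */
bool thread_mask_allows_apc(uint32_t mask)
{
    return (mask & THREAD_SET_CONTEXT) != 0;
}
```

Testing bits rather than comparing whole strings means a broader mask that merely includes the interesting rights is still caught.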

12.2 ETW — Microsoft-Windows-Threat-Intelligence

The Threat Intelligence ETW provider exposes a dedicated APC-injection sensor:

  • THREATINT_QUEUEUSERAPC_REMOTE_KERNEL_CALLER: logged by EtwTiLogInsertQueueUserApc / EtwTiLogQueueApcThread, invoked from inside KeInsertQueueApc. Introduced in Windows 10 version 1809.

Consumption requires a signed ELAM driver; the provider is reserved for AntiMalware-protected processes. In practice you receive this telemetry through your EDR vendor’s sensor.

12.3 Audit Policy

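  • Enable Object Access → Audit Kernel Object (together with Audit Handle Manipulation) → Security log EIDs 4656 / 4663 on handle requests. Filter for Object Type = Thread with access masks including THREAD_SET_CONTEXT. These events are generated only for objects that carry a matching SACL.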
  • Enable Audit Process Creation → EID 4688 with full command-line logging. Pair with CREATE_SUSPENDED heuristics where parent process behaviour permits inference.

12.4 Sigma Detection (Conceptual)

title: Suspicious Cross-Process Handle Acquisition Consistent With APC Injection
id: 00000000-0000-0000-0000-000000000000
status: experimental
logsource:
  product: windows
  service: sysmon
detection:
  selection_thread_ctx:
    EventID: 10
    GrantedAccess|contains:
      - '0x0018'    # THREAD_SET_CONTEXT | THREAD_GET_CONTEXT
      - '0x1fffff'  # PROCESS_ALL_ACCESS
    TargetImage|endswith:
      - '\lsass.exe'
      - '\svchost.exe'
      - '\explorer.exe'
      - '\winlogon.exe'
  selection_vm_write:
    EventID: 10
    GrantedAccess|contains: '0x0020'   # PROCESS_VM_WRITE
  timeframe: 5s
  condition: selection_thread_ctx and selection_vm_write
falsepositives:
  - Endpoint security products and legitimate debuggers
level: high

12.5 Behavioural Heuristics

The fingerprint that hunts well: VirtualAllocEx (RWX) → WriteProcessMemory → NtQueueApcThread issued by the same source process within a short window. Even when individual calls are noisy, the ordering is rare in benign software.
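
An ordered-sequence check of this kind can be sketched as a small correlation routine. Everything here is hypothetical: the ApiEvent record, the ApiKind values, and has_staged_apc_sequence are illustrative names, and real telemetry would arrive from an EDR or ETW consumer rather than an in-memory array. The sketch assumes events are already sorted by ascending timestamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { API_ALLOC, API_WRITE, API_QUEUE_APC, API_OTHER } ApiKind;

/* Hypothetical normalized telemetry record. */
typedef struct ApiEvent {
    uint32_t source_pid;
    uint32_t target_pid;
    ApiKind  kind;
    uint64_t time_ms;
} ApiEvent;

/* True if some source/target pair shows alloc, then write, then APC queue,
 * in that order, all within window_ms of the allocation. */
bool has_staged_apc_sequence(const ApiEvent *ev, size_t n, uint64_t window_ms)
{
    assert(ev != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ev[i].kind != API_ALLOC) continue;
        ApiKind want = API_WRITE;                 /* next stage expected */
        for (size_t j = i + 1; j < n; j++) {
            if (ev[j].time_ms - ev[i].time_ms > window_ms) break;
            if (ev[j].source_pid != ev[i].source_pid ||
                ev[j].target_pid != ev[i].target_pid) continue;
            if (ev[j].kind != want) continue;
            if (want == API_QUEUE_APC) return true;
            want = API_QUEUE_APC;
        }
    }
    return false;
}
```

Keying on the (source, target) pair and on order, rather than on any single call, is what keeps the false-positive rate manageable. The window size is a tuning parameter, not a fixed constant.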

12.6 PowerShell — Hunt for Suspicious ProcessAccess Masks

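# Fields are resolved by name because positional Properties[] indices
# shift between Sysmon schema versions.
Get-WinEvent -LogName 'Microsoft-Windows-Sysmon/Operational' -FilterXPath @"
*[System[EventID=10]]
"@ |
  ForEach-Object {
      $f = @{}
      foreach ($d in ([xml]$_.ToXml()).Event.EventData.Data) { $f[$d.Name] = $d.'#text' }
      [pscustomobject]@{
          TimeCreated = $_.TimeCreated
          Source      = $f['SourceImage']
          Target      = $f['TargetImage']
          Access      = $f['GrantedAccess']
      }
  } |
  Where-Object {
      # Bit test: matches any mask with both 0x0008 and 0x0010 set,
      # which includes 0x0018, 0x001f and 0x1fffff.
      ([Convert]::ToInt32($_.Access, 16) -band 0x18) -eq 0x18 -and
      $_.Target -match 'lsass\.exe|svchost\.exe|winlogon\.exe'
  }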

12.7 Hardening

Mitigation | Description
Protected Process Light (PPL) | LSASS running as PPL blocks OpenThread(THREAD_SET_CONTEXT) from non-protected callers.
Credential Guard | Moves LSASS secrets into a VSM-isolated process, putting them out of reach of APC injection into lsass.exe.
HVCI / Code Integrity | Prevents unsigned kernel drivers from calling KeInsertQueueApc against arbitrary threads.
ASR rule 9e6c4e1f-7d60-472f-ba1a-a39ef669e4b2 | Blocks credential theft from LSASS; complements but does not directly block APC injection.
Minimize alertable waits in sensitive code | Avoid SleepEx(n, TRUE) and other alertable waits in privileged service threads unless required.
ETW-TI via EDR | Deploy AV/EDR with an ELAM driver to consume Microsoft-Windows-Threat-Intelligence events in real time.

Graph diagram mapping four detection controls — Sysmon EID 10, ETW-TI, Audit EID 4656, and behavioural sequencing — plus hardening measures against the APC injection threat
Because APC injection skips CreateRemoteThread, detection pivots to handle-acquisition telemetry and dedicated ETW-TI sensors rather than Sysmon EID 8.

13. Tools for APC Analysis

Tool | Description | Link
WinDbg | Walk KTHREAD.ApcState, dump KAPC entries via !list, inspect UserApcPending. | microsoft.com
Process Hacker | Per-thread inspection, including private RX allocations and thread call stacks indicative of injected code. | processhacker.sourceforge.io
Sysmon | EID 10 / 8 / 1 telemetry for the handle-open and process-creation halves of the chain. | sysinternals.com
Sysinternals handle.exe | Enumerate handles a suspect process holds (look for foreign Thread / Process handles). | sysinternals.com
Volatility 3 | Memory forensics: walk thread APC queues post-incident; identify injected RX regions. | volatilityfoundation.org
ETW Explorer / SilkETW | Inspect or subscribe to ETW providers (ETW-TI requires signed ELAM). | github.com
x64dbg | User-mode dynamic analysis of QueueUserAPC / RtlDispatchAPC call chains. | x64dbg.com

14. MITRE ATT&CK Mapping

Technique | MITRE ID | Detection
Process Injection | T1055 | Behavioural sequence: cross-process handle with VM-write rights followed by APC queuing.
Process Injection: Asynchronous Procedure Call | T1055.004 | Sysmon EID 10 with THREAD_SET_CONTEXT; ETW-TI THREATINT_QUEUEUSERAPC_REMOTE_KERNEL_CALLER.
Thread Execution Hijacking | T1055.003 | Early-Bird variant: CREATE_SUSPENDED process + THREAD_SET_CONTEXT handle + early-window APC.

T1055.004 is the primary mapping for this tutorial. The Early-Bird variant (§8) overlaps with T1055.003 because the suspended-thread + redirection structure is the same primitive — defenders should detect both.


Summary

  • APCs are a legitimate kernel facility for thread-targeted asynchronous work, and that property is exactly what makes them a first-class injection primitive.
  • The dispatch chain is QueueUserAPC → NtQueueApcThread → KiInsertQueueApc → KiDeliverApc → ntdll!RtlDispatchAPC → caller routine; every layer is observable.
  • User APCs require an alertable wait; Early-Bird sidesteps this via CREATE_SUSPENDED, and Special User APCs (NtQueueApcThreadEx2 + QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC) eliminate the requirement entirely.
  • APC injection deliberately evades Sysmon EID 8; detection pivots on EID 10 with THREAD_SET_CONTEXT (0x0010, or 0x0018 together with THREAD_GET_CONTEXT) and PROCESS_VM_WRITE (0x0020) against high-value targets, plus Microsoft-Windows-Threat-Intelligence ETW (EtwTiLogInsertQueueUserApc).
  • Map to T1055.004 for classic / special-user APC, and additionally to T1055.003 for the Early-Bird suspended-thread variant; harden with PPL, Credential Guard, HVCI, and ETW-TI-consuming EDR.

DPCs: Deferred Procedure Calls and Interrupt Deferral

Objective: Understand how the Windows kernel uses Deferred Procedure Calls (DPCs) to move work out of high-IRQL interrupt service routines down to DISPATCH_LEVEL, covering the KDPC structure, IRQL mechanics, the full queue-to-callback lifecycle, threaded and timer DPCs, the DPC watchdog, and how defenders detect kernel-mode abuse of the DPC mechanism.


1. The Interrupt Deferral Problem

When a hardware device raises an interrupt, the kernel dispatches to an Interrupt Service Routine (ISR) running at DIRQL — a device IRQL higher than the scheduler itself. At that level the processor cannot wait, cannot touch pageable memory, and blocks all lower-priority interrupts on that CPU. An ISR that lingers degrades the entire system; the guidance is that ISRs should not run longer than 25 microseconds.

Windows therefore uses a two-phase interrupt model. The ISR does the minimum work needed to quiesce the device (acknowledge the interrupt, snapshot status), then schedules a Deferred Procedure Call to perform the heavier processing later, at a gentler IRQL. The DPC executes at DISPATCH_LEVEL, which is still too high for anything that touches pageable memory — but it is low enough to run the bulk of device servicing without starving other interrupts.

The essence of the DPC is deferring execution to gentler circumstances. It is the kernel’s primary tool for keeping ISRs short.


2. IRQL Levels: A Precise Map

The Interrupt Request Level (IRQL) is a per-processor priority that determines what code may run and what it may do. Any routine running at DISPATCH_LEVEL or above is not preemptable, runs to completion, and must reside in non-paged memory.

IRQL Name | Value | Notes
PASSIVE_LEVEL | 0 | Normal user/kernel thread execution; paging and waiting allowed
APC_LEVEL | 1 | Asynchronous Procedure Calls
DISPATCH_LEVEL | 2 | DPC execution, scheduler, spin locks; no paging, no waiting
DIRQL | 3–11 (device-dependent) | Hardware ISRs run here

An ISR at DIRQL cannot call functions that require PASSIVE_LEVEL. It instead schedules a DPC, which the kernel later runs at DISPATCH_LEVEL. Because DISPATCH_LEVEL still forbids page faults and blocking waits, a DPC routine and all data it touches must be non-paged.
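
The rules in this section reduce to a few comparisons against DISPATCH_LEVEL. The sketch below encodes them in portable C as a reading aid; the MODEL_* constants and irql_* helpers are hypothetical names, not WDK definitions, and the numeric values are the ones from the table above.

```c
#include <assert.h>
#include <stdbool.h>

/* IRQL values from the table above (DIRQL is a device-dependent range). */
#define MODEL_PASSIVE_LEVEL  0u
#define MODEL_APC_LEVEL      1u
#define MODEL_DISPATCH_LEVEL 2u

/* Page faults can only be serviced below DISPATCH_LEVEL. */
bool irql_allows_pageable_access(unsigned irql)
{
    return irql < MODEL_DISPATCH_LEVEL;
}

/* Blocking waits need the scheduler, which cannot run at DISPATCH_LEVEL or above. */
bool irql_allows_blocking_wait(unsigned irql)
{
    return irql < MODEL_DISPATCH_LEVEL;
}

/* A routine documented as callable at `required_max` or lower is legal only
 * when the current IRQL does not exceed that ceiling. */
bool irql_call_is_legal(unsigned current, unsigned required_max)
{
    return current <= required_max;
}
```

For an ISR at a device IRQL of, say, 5, irql_call_is_legal(5, MODEL_PASSIVE_LEVEL) is false, which is the situation that forces the work into a DPC.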


Hierarchy diagram showing Windows IRQL levels from DIRQL at the top down through DISPATCH_LEVEL where DPCs run, APC_LEVEL, and PASSIVE_LEVEL at the bottom, with arrows showing how ISRs queue DPCs that drain at DISPATCH_LEVEL
The IRQL ladder: ISRs fire at DIRQL and defer heavy work via DPCs, which the kernel drains at DISPATCH_LEVEL before returning to lower IRQLs.

3. The KDPC Structure Dissected

The KDPC is the structure in which the kernel keeps the state of a Deferred Procedure Call. It has always been explicitly undocumented — Microsoft labels it an opaque structure and warns drivers not to set members directly. The published layout from WDK/OSR headers is:

typedef struct _KDPC {
    UCHAR                 Type;            // DpcObject or ThreadedDpcObject
    UCHAR                 Importance;      // Low / Medium / High
    USHORT                Number;          // target processor (directed DPCs)
    LIST_ENTRY            DpcListEntry;    // links into per-processor DPC queue
    PKDEFERRED_ROUTINE    DeferredRoutine; // pointer to the callback function
    PVOID                 DeferredContext; // driver-supplied context value
    PVOID                 SystemArgument1; // extra arg passed to callback
    PVOID                 SystemArgument2; // extra arg passed to callback
    __volatile PVOID      DpcData;         // internal; pointer to KDPC_DATA
} KDPC, *PKDPC, *PRKDPC;

Field | Purpose
Type | Distinguishes a normal DpcObject from a ThreadedDpcObject
Importance | Controls queue insertion: MediumImportance = tail, HighImportance = head
Number | Target logical processor, set via KeSetTargetProcessorDpc
DeferredRoutine | Pointer to the KDEFERRED_ROUTINE callback
DeferredContext | Opaque context the driver receives back in the callback
SystemArgument1/2 | Caller-supplied arguments passed through to the callback
DpcData | Volatile internal pointer to the per-processor KDPC_DATA; non-NULL while queued

The DpcData field is the kernel’s bookkeeping anchor: before Windows 8.1 it pointed directly at a KDPC_DATA structure, and its non-NULL state indicates the DPC is currently queued. Because DeferredRoutine is a raw function pointer inside a writable structure, it is also a corruption target — covered in §10.


4. The DPC Lifecycle: From ISR to Callback

A DPC moves through four stages: allocate → initialize → queue → drain.

API Function | Purpose
KeInitializeDpc | Initializes a KDPC, binding a DeferredRoutine and DeferredContext
KeInsertQueueDpc | Inserts the KDPC into the per-processor queue; returns FALSE if already queued
IoRequestDpc | Convenience wrapper called from ISR context for the DpcForIsr pattern
KeRemoveQueueDpc | Removes a pending (not-yet-fired) DPC from the queue

Kernel code first allocates a KDPC in non-paged pool (or the device extension) so the object is resident when referenced from the ISR.

// C1 — allocate and initialize a DPC object
PKDPC pDpc = ExAllocatePool2(POOL_FLAG_NON_PAGED, sizeof(KDPC), 'cpDD');
if (pDpc) {
    KeInitializeDpc(pDpc, MyCustomDpc, DeviceContext);  // routine + context
}

The callback must match the KDEFERRED_ROUTINE signature and runs at DISPATCH_LEVEL:

// C2 — DPC callback stub
VOID MyCustomDpc(
    _In_     PKDPC Dpc,
    _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1,
    _In_opt_ PVOID SystemArgument2)
{
    UNREFERENCED_PARAMETER(Dpc);
    ASSERT(KeGetCurrentIrql() == DISPATCH_LEVEL);   // invariant
    // Non-paged, bounded work only — no waits, no page faults.
}

The ISR queues the DPC. KeInsertQueueDpc enforces the single-instantiation guarantee and reports it through its return value: only one instance of a given KDPC can be queued at a time, so queuing it twice before it fires runs the routine once, and the second call returns FALSE.

// C3 — queue from a mock ISR
BOOLEAN queued = KeInsertQueueDpc(pDpc, Arg1, Arg2);
if (!queued) {
    // Already pending on a queue — the earlier request still stands.
}

Device drivers commonly use the wrapper from inside their InterruptService routine:

// C4 — DpcForIsr pattern
BOOLEAN MyIsr(_In_ PKINTERRUPT Interrupt, _In_ PVOID Context) {
    PDEVICE_OBJECT devObj = (PDEVICE_OBJECT)Context;
    // ...acknowledge hardware quickly...
    IoRequestDpc(devObj, devObj->CurrentIrp, NULL);  // schedules DpcForIsr
    return TRUE;
}

When the processor returns from the interrupt, it checks its DPC queue; if entries are pending, the kernel raises IRQL to DISPATCH_LEVEL, drains the queue by invoking each DeferredRoutine, then lowers IRQL back down.
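
The queue-then-drain behaviour, including the single-instantiation guarantee, can be modelled in portable C. This is a sketch only: ModelDpc, ModelDpcQueue, and the model_* functions are hypothetical, the queued flag stands in for the non-NULL DpcData state, and locking, IRQL changes, and importance are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef void (*ModelDeferredRoutine)(void *context);

typedef struct ModelDpc {
    ModelDeferredRoutine routine;
    void                *context;
    bool                 queued;   /* plays the role of a non-NULL DpcData */
    struct ModelDpc     *next;
} ModelDpc;

typedef struct ModelDpcQueue {
    ModelDpc *head;
    ModelDpc *tail;
    unsigned  depth;
} ModelDpcQueue;

/* Returns false if the object is already pending, like KeInsertQueueDpc. */
bool model_insert_queue_dpc(ModelDpcQueue *q, ModelDpc *d)
{
    assert(q != NULL && d != NULL);
    if (d->queued) return false;
    d->queued = true;
    d->next = NULL;
    if (q->tail) q->tail->next = d; else q->head = d;
    q->tail = d;
    q->depth++;
    return true;
}

/* Run every pending routine in FIFO order; returns how many ran. */
unsigned model_drain(ModelDpcQueue *q)
{
    unsigned ran = 0;
    assert(q != NULL);
    while (q->head) {
        ModelDpc *d = q->head;
        q->head = d->next;
        if (!q->head) q->tail = NULL;
        q->depth--;
        d->queued = false;         /* no longer on the queue */
        d->routine(d->context);
        ran++;
    }
    return ran;
}

/* Sample routine for experiments: counts its own invocations. */
void model_count_routine(void *context)
{
    ++*(int *)context;
}
```

Inserting the same ModelDpc twice before a drain returns false the second time and the routine still runs once. After the drain the object can be queued again.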


Flow diagram showing the four-stage DPC lifecycle: allocate KDPC in non-paged pool, initialize with KeInitializeDpc, ISR fires and calls KeInsertQueueDpc, then CPU drains the per-processor queue and executes the DeferredRoutine at DISPATCH_LEVEL
A DPC travels through four stages — allocate, initialize, queue, drain — with the single-instantiation guarantee ensuring each KDPC object fires at most once per queue cycle.

5. Per-Processor DPC Queues and KPRCB

Each logical processor owns a separate DPC queue, stored as a KDPC_DATA structure inside the processor’s KPRCB (Kernel Processor Control Block). This avoids cross-CPU locking on the common path.

KDPC_DATA carries the queue head, depth, count, and a spin lock:

typedef struct _KDPC_DATA {
    LIST_ENTRY DpcListHead;   // queued KDPC objects
    ULONG      DpcLock;       // spin lock protecting the list
    volatile ULONG DpcQueueDepth;  // pending DPCs
    ULONG      DpcCount;      // running total
} KDPC_DATA, *PKDPC_DATA;

Exact KDPC_DATA field names vary by kernel build — confirm against a live PDB with dt nt!_KDPC_DATA before relying on offsets.

Because each queue is per-processor, the target processor of a DPC determines which CPU drains it. By default a DPC runs on the CPU that queued it, but it can be pinned elsewhere (§6) — a property attackers exploit to manipulate specific cores.


Hierarchy diagram showing two CPU KPRCB blocks each owning an independent KDPC_DATA queue structure, with individual KDPC objects enqueued within each per-processor queue to avoid cross-CPU locking
Each logical processor maintains its own KDPC_DATA queue inside its KPRCB, eliminating cross-CPU lock contention on the common interrupt-deferral path.

6. Controlling DPC Behaviour

API Function | Purpose
KeSetImportanceDpc | Sets Importance; HighImportance inserts at the queue head
KeSetTargetProcessorDpc | Pins the DPC to a specific logical processor (directed DPC)
KeRemoveQueueDpc | Dequeues a pending DPC; fails once the routine is already running

DPCs have three priority levels — low, medium, high. Importance influences KeInsertQueueDpc: high-importance DPCs go to the head of the queue and are serviced first.
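
The effect of importance on ordering can be shown with an array-backed model. The names below (ModelImportance, model_enqueue) are hypothetical, the queue holds plain integer IDs instead of KDPC objects, and treating low importance like medium for placement is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

typedef enum { MODEL_LOW, MODEL_MEDIUM, MODEL_HIGH } ModelImportance;

/* Insert dpc_id into queue[0..*len): at the head for high importance,
 * at the tail otherwise. Returns false when the queue is full. */
bool model_enqueue(int *queue, size_t *len, size_t cap,
                   int dpc_id, ModelImportance imp)
{
    assert(queue != NULL && len != NULL);
    if (*len >= cap) return false;
    if (imp == MODEL_HIGH) {
        /* Shift existing entries right by one slot to free the head. */
        memmove(&queue[1], &queue[0], *len * sizeof queue[0]);
        queue[0] = dpc_id;
    } else {
        queue[*len] = dpc_id;
    }
    (*len)++;
    return true;
}
```

Enqueuing IDs 1 and 2 at medium importance and then 3 at high importance leaves the queue as 3, 1, 2, so the high-importance entry is serviced first.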

A directed DPC is created by binding it to a CPU before queuing. The pattern below — iterating over KeNumberProcessors and targeting each core — is the same primitive a rootkit weaponizes for CPU lockdown, so treat it as an educational illustration only:

// C5 — directed DPC setup (educational pattern)
for (CCHAR cpu = 0; cpu < KeNumberProcessors; cpu++) {
    KeInitializeDpc(&pDpcArray[cpu], MyCustomDpc, NULL);
    KeSetTargetProcessorDpc(&pDpcArray[cpu], cpu);  // pin to logical CPU
    KeSetImportanceDpc(&pDpcArray[cpu], HighImportance);
}

Once a DPC begins executing it cannot be removed; KeRemoveQueueDpc only rescinds a still-pending entry.


7. Threaded DPCs

Since Windows Server 2003, a KDPC can represent either a normal DPC or a threaded DPC. In the threaded variant, the kernel — if it can arrange it — calls the routine back at PASSIVE_LEVEL from a highest-priority thread, allowing more flexible work. Support can be disabled, in which case the threaded DPC falls back to running at DISPATCH_LEVEL exactly like a normal DPC.

You initialize one with KeInitializeThreadedDpc and a CustomThreadedDpc routine. Because that routine can run at either PASSIVE_LEVEL or DISPATCH_LEVEL, it must synchronize correctly at both IRQLs:

// C7 — threaded DPC with dual-IRQL guard
KeInitializeThreadedDpc(&g_ThreadedDpc, MyThreadedDpc, NULL);

VOID MyThreadedDpc(_In_ PKDPC Dpc, _In_opt_ PVOID Ctx,
                   _In_opt_ PVOID A1, _In_opt_ PVOID A2) {
    ASSERT(KeGetCurrentIrql() <= DISPATCH_LEVEL);   // may be PASSIVE or DISPATCH
    // Use locks valid at both levels.
}

Threaded DPCs should be preferred over ordinary DPCs unless a particular DPC must never be preempted — not even by another DPC.


8. Timer DPCs and KTIMER

A DPC is also the callback mechanism for kernel timers. You associate a KDPC with a KTIMER and arm it; on expiry the kernel queues the DPC. KeSetTimerEx supports both one-shot and periodic timers.

// C6 — periodic timer DPC
KeInitializeTimerEx(&g_Timer, NotificationTimer);
KeInitializeDpc(&g_TimerDpc, MyCustomDpc, NULL);

LARGE_INTEGER due;
due.QuadPart = -10LL * 1000 * 1000;     // 1 second, relative
KeSetTimerEx(&g_Timer, due, 1000 /* ms period */, &g_TimerDpc);

Windows uses special timer DPCs internally for timer expiration and context switching. The same primitive — a recurring timer pointed at a non-paged callback — is the cleanest way a driver schedules background work, and the cleanest way a malicious driver re-enters its payload (§10).
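
The due value in the timer snippet above is a relative time: negative, and expressed in 100-nanosecond units. A small helper makes the arithmetic explicit. The helper name relative_due_time_from_ms is hypothetical, and the code is plain C so it can be checked outside a driver build.

```c
#include <assert.h>
#include <stdint.h>

/* One millisecond is 10,000 units of 100 ns. */
#define HUNDRED_NS_PER_MS 10000LL

/* Convert a delay in milliseconds to a relative due time:
 * negative, in 100 ns units. */
int64_t relative_due_time_from_ms(int64_t ms)
{
    assert(ms >= 0);
    return -(ms * HUNDRED_NS_PER_MS);
}
```

For a one-second delay the helper returns -10,000,000, which matches the -10LL * 1000 * 1000 literal used in the snippet.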


9. The DPC Watchdog and Debugging

The kernel runs a DPC watchdog. Bug Check 0x00000133 (DPC_WATCHDOG_VIOLATION) fires when the watchdog detects either a single long-running DPC or a prolonged time spent at DISPATCH_LEVEL or above. The recommended timing budgets for driver authors are 100 microseconds for a DPC and 25 microseconds for an ISR; the watchdog itself trips only at much longer durations. A malicious DPC spin-loop can therefore inadvertently trip the watchdog and crash the host.

Inspect live DPC state in the kernel debugger:

kd> !dpcs                 ; list pending DPCs per processor
kd> dt nt!_KDPC           ; KDPC layout for this build
kd> dt nt!_KDPC_DATA      ; per-processor queue structure
kd> !prcb                 ; processor control block (contains DpcData)
kd> !pcr                  ; processor control region

!dpcs reveals each queued DPC’s DeferredRoutine address — the single most useful artifact, since an unknown or non-image-backed routine address is a strong anomaly.
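
The "non-image-backed" test can be sketched as a range lookup. This is an illustrative user-mode model, not a debugger extension: it assumes the analyst has already copied module base/size pairs (for example from the debugger's module list) into an array. The module_range type and find_owning_module function are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record for one loaded kernel image. */
typedef struct {
    const char *name;
    uint64_t    base;   /* first byte of the image */
    uint64_t    size;   /* image size in bytes */
} module_range;

/* Return the module containing addr, or NULL when no listed image
   backs the address (the anomaly worth investigating). */
static const module_range *find_owning_module(const module_range *mods,
                                              size_t count, uint64_t addr)
{
    for (size_t i = 0; i < count; i++) {
        if (addr >= mods[i].base && addr - mods[i].base < mods[i].size)
            return &mods[i];
    }
    return NULL;
}
```

A NULL result for a DeferredRoutine address is a lead, not a verdict: the module list may be incomplete, so confirm against the live target before drawing conclusions.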


10. Common Attacker Techniques

DPCs give kernel-mode malware a high-IRQL execution surface. Because code at DISPATCH_LEVEL cannot be preempted by the thread scheduler (only by higher-IRQL interrupts), it is ideal cover for Direct Kernel Object Manipulation (DKOM).

| Technique | Description |
| --- | --- |
| CPU lockdown / freeze-other-CPUs | Queue a directed KDPC to every non-current CPU via KeSetTargetProcessorDpc and spin, raising all secondary cores to DISPATCH_LEVEL to block interruption during a DKOM patch |
| Timer DPC payload | Arm a KTIMER whose DeferredRoutine points at attacker-controlled non-paged code, for recurring stealth execution |
| KDPC hijacking | Overwrite DeferredRoutine in a legitimate queued KDPC to redirect execution to a payload |
| Driver-based persistence | Load a malicious signed/BYOVD driver that registers a recurring timer DPC at load time |

The CPU-lockdown pattern is especially relevant to defenders: by parking every other core at DISPATCH_LEVEL, the rootkit can unlink processes, patch EDR callbacks, or hide drivers while no scheduler or AV thread can run.


Graph diagram mapping three rootkit DPC abuse techniques — directed DPC CPU lockdown, timer DPC stealth re-entry, and DeferredRoutine pointer corruption — to their downstream impacts of DKOM manipulation and EDR callback patching
Kernel rootkits weaponize DPCs three ways: CPU lockdown via directed DPCs, persistent re-entry via timer DPCs, and code hijacking via DeferredRoutine pointer corruption.

11. Defensive Strategies & Detection

DPC objects live entirely in kernel memory and are not directly observable from user mode, so detection focuses on the driver that installs them and on kernel ETW timing telemetry.

Sysmon and Windows event telemetry:

| Event ID | Source | Relevance |
| --- | --- | --- |
| 6 | Sysmon: Driver Loaded | Fires on every driver load; primary signal for kernel modules that register DPC routines |
| 7 | Sysmon: Image Loaded | Logs modules loaded into user-mode processes; useful for spotting the loader or installer that stages a driver, not the kernel module itself |
| 7045 | Service Control Manager | New kernel-mode driver service, especially from a non-standard path |
| 7040 | Service Control Manager | Service start-type change, a possible sign of driver persistence |

ETW providers: The NT Kernel Logger session with EVENT_TRACE_FLAG_DPC and EVENT_TRACE_FLAG_INTERRUPT records per-DPC timing and the routine address, exposing abnormally long-running or unknown-address DPC routines. Microsoft-Windows-Kernel-Processor-Power surfaces IRQL/watchdog events. Verify the exact flag constants against the current WDK evntrace.h.

Sigma anchor — unsigned/expired driver load:

title: Suspicious Kernel Driver Load (Unsigned or Expired)
logsource:
  product: windows
  service: sysmon
detection:
  selection_unsigned:
    EventID: 6
    Signed: 'false'
  selection_expired:
    EventID: 6
    SignatureStatus: 'Expired'
  selection_path:
    EventID: 6
    ImageLoaded|contains: '\Temp\'
  condition: selection_unsigned or selection_expired or selection_path
level: high

Hunt additionally for EventID 6 where ImageLoaded resolves outside \SystemRoot\System32\drivers\.

Hardening:

| Mitigation | Description |
| --- | --- |
| Driver Signature Enforcement (DSE) | Default on 64-bit Windows; blocks unsigned drivers that would install DPC routines |
| HVCI | Protects kernel code pages, raising the bar for DPC shellcode and DeferredRoutine overwrite |
| Kernel CET | Hardware shadow stack mitigates ROP-based DPC hijacking |
| DPC Watchdog | Built-in; Bug Check 0x133 catches long-running DPC loops, including malicious spin loops |
| Vulnerable Driver Blocklist | HKLM\SYSTEM\CurrentControlSet\Control\CI\Config\VulnerableDriverBlocklistEnable blocks known BYOVD primitives |
| WDAC / Memory Integrity | Restrict which drivers may load, shrinking the DPC-abuse attack surface |

12. Tools for DPC Analysis

| Tool | Description | Link |
| --- | --- | --- |
| WinDbg | !dpcs, dt nt!_KDPC, !prcb, !pcr live queue inspection | microsoft.com |
| Process Hacker | Driver/service enumeration and kernel module listing | processhacker.sourceforge.io |
| Windows Performance Recorder / xperf | Captures DPC/ISR ETW timing and routine addresses | microsoft.com |
| Sysmon | Driver-load (EID 6) and image-load (EID 7) telemetry | sysinternals.com |
| Volatility | Memory-forensic enumeration of drivers and kernel callbacks | volatilityfoundation.org |
| Ghidra | Static analysis of suspect drivers for KeInsertQueueDpc usage | ghidra-sre.org |

13. MITRE ATT&CK Mapping

| Technique | MITRE ID | Detection |
| --- | --- | --- |
| Rootkit | T1014 | ETW DPC routine-address anomalies; !dpcs unknown routines |
| Boot or Logon Autostart Execution: Kernel Modules and Extensions | T1547.006 | Sysmon EID 6 / Event 7045 driver loads |
| Exploitation for Privilege Escalation | T1068 | HVCI/CET violations; KDPC.DeferredRoutine corruption |
| Impair Defenses: Disable or Modify Tools | T1562.001 | CPU-freeze DPC pattern halting EDR threads; watchdog 0x133 |
| Native API | T1106 | Driver use of KeInitializeDpc / KeInsertQueueDpc |

No dedicated ATT&CK sub-technique exists for DPC abuse as of ATT&CK v15; the techniques above are the parents. Verify current IDs at attack.mitre.org before publishing.


Summary

  • DPCs are the kernel’s mechanism for deferring interrupt work from high-IRQL ISRs down to DISPATCH_LEVEL, keeping ISRs under their 25 µs budget.
  • The opaque KDPC structure carries the DeferredRoutine, context, arguments, and a DpcData pointer that marks whether it is queued on a per-processor KDPC_DATA list in the KPRCB.
  • The lifecycle runs allocate → KeInitializeDpc → KeInsertQueueDpc/IoRequestDpc → per-CPU drain at DISPATCH_LEVEL, with a single-instantiation guarantee per object.
  • Rootkits abuse directed DPCs for CPU lockdown, timer DPCs for stealth re-entry, and DeferredRoutine corruption for hijacking — mapping to T1014, T1547.006, and T1562.001.
  • Detect via Sysmon Event ID 6 driver loads, NT Kernel Logger DPC timing telemetry, and the DPC watchdog (0x133); harden with DSE, HVCI, Kernel CET, and the vulnerable driver blocklist.

IRQL Levels: Interrupt Request Priorities Explained

Objective: Understand the Windows kernel’s Interrupt Request Level (IRQL) priority system — what each level means numerically and symbolically, how the HAL arbitrates hardware and software interrupts, which APIs query and change the IRQL, what kernel operations are legal at each level, and how malicious kernel code abuses IRQL semantics to evade defenders.


1. What Is an IRQL?

An Interrupt Request Level (IRQL) is a per-processor priority value that determines which kernel-mode support routines the currently executing code may legally call. It is an integer in the range 0–31 on x86 and 0–15 on x64, stored as type KIRQL (a typedef for UCHAR). Three levels (PASSIVE_LEVEL, APC_LEVEL, and DISPATCH_LEVEL) are referred to symbolically; the rest are usually named by value.

IRQL is per-processor, not per-thread. On x86 it lives in the Irql field of the _KPCR (Kernel Processor Control Region); on x64 it is mapped to the CR8 register (Task Priority Register). When the processor raises its IRQL, all interrupts at or below that level are masked. Higher-numbered interrupts preempt all lower-IRQL processing; once handled, the processor returns to the previous level. Raising and lowering must follow strict stack discipline — you only lower back to a level you previously raised from.
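
The masking and stack-discipline rules can be modelled in a few lines of user-mode C. This is a toy model under stated assumptions, not kernel code: cpu_model, model_raise, and model_lower are hypothetical names, and where the model returns false the real kernel treats the misuse as a fatal error.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint8_t irql_t;                        /* stands in for KIRQL */
typedef struct { irql_t current; } cpu_model;  /* one instance per processor */

/* An interrupt is delivered only when its level is strictly above the
   processor's current IRQL; everything at or below is masked. */
static bool interrupt_is_masked(const cpu_model *cpu, irql_t level)
{
    return level <= cpu->current;
}

/* Raise: saves the prior level and refuses a "raise" to a lower one. */
static bool model_raise(cpu_model *cpu, irql_t new_irql, irql_t *old_irql)
{
    if (new_irql < cpu->current)
        return false;
    *old_irql = cpu->current;
    cpu->current = new_irql;
    return true;
}

/* Lower: only back down to a level saved by an earlier raise. */
static bool model_lower(cpu_model *cpu, irql_t saved)
{
    if (saved > cpu->current)
        return false;
    cpu->current = saved;
    return true;
}
```

Raising to 2 and then asking whether a level-8 device interrupt is masked returns false, which is the preemption behaviour described above; lowering with the saved value restores level 0.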


2. The IRQL Hierarchy

The Hardware Abstraction Layer (HAL) maps physical interrupt vectors to software IRQLs. The count of levels is architecture-dependent: x64 and Itanium expose 16 IRQLs; x86 exposes 32, owing to differences in interrupt-controller hardware. The canonical wdm.h symbolic definitions differ across architectures.

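| Symbolic Name | x64 Value | x86 Value | Description |
| --- | --- | --- | --- |
| PASSIVE_LEVEL / LOW_LEVEL | 0 | 0 | Normal thread execution; nothing masked |
| APC_LEVEL | 1 | 1 | APC delivery and page-fault handling |
| DISPATCH_LEVEL | 2 | 2 | Thread scheduler / DPC queue |
| Device IRQLs (DIRQL) | 3–11 | 3–26 | Hardware device interrupts |
| CMCI_LEVEL | 5 | 5 | Corrected Machine Check Interrupt (sits inside the device range) |
| CLOCK_LEVEL | 13 | 28 | System clock timer |
| IPI_LEVEL / DRS_LEVEL | 14 | 29 | Inter-Processor Interrupt |
| POWER_LEVEL | 14 | 30 | Power failure |
| PROFILE_LEVEL | 15 | 27 | Profiling timer |
| HIGH_LEVEL | 15 | 31 | Highest level; all maskable interrupts blocked |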

Higher value = higher priority. A device interrupt at DIRQL 8 preempts a DPC at DISPATCH_LEVEL (2), which itself preempts ordinary thread code at PASSIVE_LEVEL (0).


Hierarchical diagram showing Windows IRQL levels from HIGH_LEVEL at the top down to PASSIVE_LEVEL at the bottom, colour-coded by hardware versus software IRQLs
Windows x64 IRQL hierarchy: higher-numbered levels preempt all lower ones, with software IRQLs at the base and hardware interrupt levels at the top.

3. Software IRQLs: PASSIVE, APC, and DISPATCH

The lowest three levels are software IRQLs — the kernel raises and lowers them without involving the interrupt controller.

PASSIVE_LEVEL (0) masks nothing. This is where normal kernel-mode thread code runs: DriverEntry, AddDevice, Unload, most dispatch routines, and driver-created worker threads. All blocking, paging, and synchronization primitives are available.

APC_LEVEL (1) masks Asynchronous Procedure Call interrupts only. The sole functional difference from PASSIVE_LEVEL is that APCs cannot interrupt the running code. Both levels imply a valid thread context and both permit access to pageable memory. Page-fault handling itself runs at APC_LEVEL.

DISPATCH_LEVEL (2) masks DISPATCH_LEVEL and APC_LEVEL. Critically, the thread scheduler is disabled — code here owns the processor until it lowers IRQL. Routines such as StartIo, DpcForIsr, IoTimer, Cancel (holding the cancel spin lock), and all DPC callbacks run here. Two hard rules apply: no access to paged memory, and no blocking waits.

| Feature | PASSIVE_LEVEL | APC_LEVEL | DISPATCH_LEVEL |
| --- | --- | --- | --- |
| Thread context | Yes | Yes | Not guaranteed |
| Scheduler active | Yes | Yes | No |
| Paged pool access | Yes | Yes | No |
| Blocking waits allowed | Yes | Yes | No |

4. Hardware IRQLs: DIRQL and Above

Levels at or above the device range are hardware IRQLs driven by the interrupt controller. A driver’s Device IRQL (DIRQL) is the SynchronizeIrql stored in its _KINTERRUPT object. When a device fires, the processor raises to that DIRQL and invokes the Interrupt Service Routine (ISR), a KSERVICE_ROUTINE.

At DIRQL, all interrupts at or below the driver’s level are masked, but higher-DIRQL devices, the clock, and power-failure interrupts may still preempt. Because the scheduler and lower-priority interrupts are blocked, ISRs must be minimal — they acknowledge the hardware, capture volatile state, and queue a DPC for the heavy lifting at DISPATCH_LEVEL.

Above DIRQL sit CLOCK_LEVEL, IPI_LEVEL (used by one processor to interrupt another), POWER_LEVEL, and HIGH_LEVEL. The general principle: the higher the IRQL, the shorter the code must run. Sustained work at high IRQL starves the entire processor.

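// KSERVICE_ROUTINE - runs at DIRQL; must be minimal
BOOLEAN MyInterruptServiceRoutine(
    PKINTERRUPT Interrupt, PVOID ServiceContext) {
    // Assumes the device object was passed as ServiceContext when the
    // interrupt was connected.
    PDEVICE_OBJECT deviceObject = (PDEVICE_OBJECT)ServiceContext;
    UNREFERENCED_PARAMETER(Interrupt);
    // Acknowledge hardware here (return FALSE if the interrupt is not ours),
    // then defer heavy work to a DPC. Do NOT touch paged memory here.
    IoRequestDpc(deviceObject, deviceObject->CurrentIrp, NULL);
    return TRUE;
}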

5. Kernel APIs for IRQL Management

Drivers query and adjust IRQL through a small, exported API surface in wdm.h.

| API Function | Purpose |
| --- | --- |
| KeGetCurrentIrql() | Returns the current processor IRQL; callable at any IRQL |
| KeRaiseIrql(NewIrql, &OldIrql) | Raises to NewIrql; saves prior level. NewIrql must be ≥ current |
| KeLowerIrql(OldIrql) | Restores a previously saved IRQL, only after a matching raise |
| KeRaiseIrqlToDpcLevel() | Raises to DISPATCH_LEVEL, returns old IRQL |
| KeAcquireSpinLock(&Lock, &OldIrql) | Acquires spin lock, raising to DISPATCH_LEVEL |
| KeReleaseSpinLock(&Lock, OldIrql) | Releases lock, restoring saved IRQL |
| KeAcquireSpinLockAtDpcLevel(&Lock) | Acquires lock without raising (caller already at DISPATCH_LEVEL) |

The exact signatures:

KIRQL KeGetCurrentIrql(void);

void KeRaiseIrql(
  _In_  KIRQL  NewIrql,
  _Out_ PKIRQL OldIrql
);

void KeLowerIrql(_In_ KIRQL NewIrql);   // restore saved old IRQL

KIRQL KeRaiseIrqlToDpcLevel(void);

The raise/lower discipline is enforced: calling KeRaiseIrql with a value lower than the current IRQL is a fatal error, and KeLowerIrql may only restore the level a prior KeRaiseIrql saved.

// Demonstrates the raise/lower stack discipline
VOID MyFunctionNeedingDispatchLevel(VOID) {
    KIRQL oldIrql;
    KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);
    // --- Critical section: no paged pool access here ---
    KeLowerIrql(oldIrql);
}

Spin locks couple mutual exclusion with IRQL: acquiring one raises to DISPATCH_LEVEL so the holder cannot be preempted by the scheduler on its processor.

KSPIN_LOCK MySpinLock;
KIRQL oldIrql;

KeInitializeSpinLock(&MySpinLock);
// KeAcquireSpinLock raises to DISPATCH_LEVEL internally
KeAcquireSpinLock(&MySpinLock, &oldIrql);
// ... protected shared-data access (non-paged only) ...
KeReleaseSpinLock(&MySpinLock, oldIrql); // restores oldIrql
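
The mutual-exclusion half of a spin lock can be sketched in portable user-mode C11. This is an illustration only: it models the busy-wait on a flag and deliberately omits the IRQL raise, which has no user-mode equivalent. The demo_* names are hypothetical and unrelated to the kernel's KSPIN_LOCK implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag held; } demo_spin_lock;

static void demo_lock_init(demo_spin_lock *l)
{
    atomic_flag_clear(&l->held);
}

/* test_and_set returns the previous value: false means we took the lock. */
static bool demo_try_acquire(demo_spin_lock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the flag is ours; the caller never sleeps. */
static void demo_acquire(demo_spin_lock *l)
{
    while (!demo_try_acquire(l)) {
        /* spin */
    }
}

static void demo_release(demo_spin_lock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Because the waiter spins rather than blocks, the holder must not be descheduled while it owns the lock. That is the reason the kernel version raises to DISPATCH_LEVEL before spinning.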

A driver inspecting its own context queries the level directly:

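// Demonstrates KeGetCurrentIrql() usage and KIRQL type
NTSTATUS DriverDispatchCreate(PDEVICE_OBJECT DeviceObject, PIRP Irp) {
    UNREFERENCED_PARAMETER(DeviceObject);
    KIRQL currentIrql = KeGetCurrentIrql();
    // Expected: PASSIVE_LEVEL (0) in a dispatch routine
    DbgPrint("[MyDriver] Current IRQL: %u\n", (ULONG)currentIrql);
    // Complete the IRP so every path returns a status
    Irp->IoStatus.Status = STATUS_SUCCESS;
    Irp->IoStatus.Information = 0;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return STATUS_SUCCESS;
}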

6. Memory Access Rules at Each IRQL

The single most consequential IRQL rule concerns paged memory. Any routine running above APC_LEVEL that touches pageable memory risks a fatal page fault: if the page is not resident, the fault cannot be serviced. Resolving it requires the memory manager to read the page back from disk and wait for that I/O, which needs a context switch, and that is impossible once the scheduler is disabled at DISPATCH_LEVEL.

| Memory Pool | PASSIVE_LEVEL | APC_LEVEL | DISPATCH_LEVEL+ |
| --- | --- | --- | --- |
| Paged pool | Accessible | Accessible | Fatal page fault |
| Non-paged pool | Accessible | Accessible | Accessible |

Code at or above DISPATCH_LEVEL must therefore allocate from non-paged pool and operate only on locked or non-pageable memory (for example, buffers locked with MmProbeAndLockPages). Violating this rule produces the most common driver bug check: IRQL_NOT_LESS_OR_EQUAL (0x0000000A), or its driver-attributed variant DRIVER_IRQL_NOT_LESS_OR_EQUAL (0x000000D1).


7. DPCs: The DISPATCH_LEVEL Workhorses

A Deferred Procedure Call (DPC) moves work out of the time-critical ISR into DISPATCH_LEVEL. The ISR queues a _KDPC object (via IoRequestDpc or KeInsertQueueDpc); the kernel drains the DPC queue as IRQL drops below DISPATCH_LEVEL. DpcForIsr handles per-IRP completion; CustomDpc and CustomTimerDpc serve driver-specific needs.

// KDEFERRED_ROUTINE - runs at DISPATCH_LEVEL
VOID MyDpcRoutine(
    PKDPC Dpc, PVOID DeferredContext,
    PVOID SystemArgument1, PVOID SystemArgument2) {
    // Safe: non-paged pool only.
    // Do NOT call KeWaitForSingleObject with a nonzero timeout.
    DbgPrint("[MyDpc] Running at DISPATCH_LEVEL\n");
}

A DPC that runs too long throttles the whole system and triggers DPC_WATCHDOG_VIOLATION (0x00000133) once sustained execution exceeds the watchdog threshold.
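
The queue-then-drain behaviour can be modelled in user-mode C. This is a hedged sketch, not the kernel's data structure: the demo_* types and functions are hypothetical, and the only behaviour borrowed from the real API is that inserting an already-queued DPC object fails instead of queuing it twice.

```c
#include <stdbool.h>
#include <stddef.h>

typedef void (*deferred_routine)(void *context);

typedef struct demo_dpc {
    deferred_routine routine;
    void            *context;
    bool             queued;     /* true while the object sits in a queue */
    struct demo_dpc *next;
} demo_dpc;

typedef struct { demo_dpc *head, *tail; } demo_dpc_queue;

/* Example routine: counts how many times it was invoked. */
static void demo_count_routine(void *context)
{
    ++*(int *)context;
}

/* Insert at the tail; an object that is already queued is rejected. */
static bool demo_insert_dpc(demo_dpc_queue *q, demo_dpc *d)
{
    if (d->queued)
        return false;
    d->queued = true;
    d->next = NULL;
    if (q->tail) q->tail->next = d; else q->head = d;
    q->tail = d;
    return true;
}

/* Drain in FIFO order. The queued flag is cleared before the call so a
   routine may requeue its own object. Returns the number of routines run. */
static size_t demo_drain(demo_dpc_queue *q)
{
    size_t ran = 0;
    while (q->head) {
        demo_dpc *d = q->head;
        q->head = d->next;
        if (!q->head) q->tail = NULL;
        d->queued = false;
        d->routine(d->context);
        ran++;
    }
    return ran;
}
```

A device that interrupts twice before the drain therefore gets one callback, not two, so a DPC routine should process all outstanding work it finds rather than assume one invocation per interrupt.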


Flow diagram illustrating the handoff from a hardware interrupt through the ISR at DIRQL to a queued DPC callback executing at DISPATCH_LEVEL 2
ISRs acknowledge hardware and queue a DPC object; the kernel drains DPC queues at DISPATCH_LEVEL so heavy processing never blocks critical interrupt handling.

8. APCs: The APC_LEVEL Mechanism

An Asynchronous Procedure Call (APC) executes a function in the context of a specific thread. Kernel APCs run at APC_LEVEL; user APCs are delivered when a thread returns to PASSIVE_LEVEL in a user-mode alertable wait. Drivers initialize them with KeInitializeApc and queue them with KeInsertQueueApc. Because APC_LEVEL still implies a valid thread context and permits paged access, certain dispatch routines raise to APC_LEVEL to serialize against APC delivery while remaining able to page in data.


9. Debugging IRQL With WinDbg

WinDbg exposes IRQL state on both live kernels and crash dumps.

; Check current IRQL on each processor
!irql

; Examine the KPCR for processor 0
!pcr 0

; List pending DPCs
!dpcs

; Analyze a 0x0000000A bugcheck
!analyze -v

On x64 the IRQL is the CR8 register; you can read it and the _KPCR directly:

; dt = display type; shows _KPCR struct at GS base
dt nt!_KPCR @$pcr
; On x64, IRQL maps to CR8 (Task Priority Register)
r cr8

The IRQL contract is also expressed statically through SAL annotations in wdm.h, which static-analysis tooling verifies at build time:

// Illustrates IRQL annotation macros from wdm.h
_IRQL_requires_max_(DISPATCH_LEVEL)
VOID MyRoutineSafeAtOrBelowDispatch(VOID);

_IRQL_requires_(PASSIVE_LEVEL)
VOID MyRoutineRequiresPassive(VOID);

_IRQL_raises_(DISPATCH_LEVEL)
_IRQL_saves_
KIRQL MyRaiseRoutine(VOID);

10. IRQL in a Security Context

IRQL semantics become a security concern the moment attacker code reaches ring 0. Code running at DISPATCH_LEVEL owns its processor and is invisible to user-mode EDR hooks — an ideal vantage point for unhooking the SSDT, overwriting kernel callbacks, or hiding objects before defensive software can react. Because paged access above APC_LEVEL is fatal, IRQL violations also serve as a crude denial-of-service primitive: a single bad page touch produces an IRQL_NOT_LESS_OR_EQUAL blue screen.

The dominant delivery vector is Bring Your Own Vulnerable Driver (BYOVD) — loading a legitimately signed but exploitable driver to obtain kernel-IRQL execution without writing a new signed driver. Missing or incorrect IRQL SAL annotations frequently mask the very bugs these attacks exploit.


Flow diagram showing a BYOVD attack path from loading a vulnerable signed driver through raising IRQL to DISPATCH_LEVEL to bypass EDR hooks or trigger a denial-of-service blue screen
Attackers exploit IRQL semantics via BYOVD: owning the processor at DISPATCH_LEVEL lets them silently unhook defenses or weaponize paged-memory violations as a kernel-mode DoS.

11. Common Attacker Techniques

| Technique | Description |
| --- | --- |
| BYOVD kernel execution | Load a signed-but-vulnerable driver (e.g. RTCore64.sys, dbutil_2_3.sys) to run code at kernel IRQL |
| EDR unhooking at DISPATCH_LEVEL | Patch SSDT entries or kernel callbacks while the scheduler is disabled, beating re-hook races |
| Rootkit concealment | Hide processes, files, and connections from DIRQL/DISPATCH_LEVEL, below user-mode visibility |
| Spin-lock starvation | Hold a spin lock at DISPATCH_LEVEL to monopolize a processor (driver-stack DoS) |
| Deliberate IRQL fault | Force paged access above APC_LEVEL to bug-check the host (0x0000000A DoS) |
| DSE downgrade | Flip test-signing or pre-release flags to load unsigned kernel code |

12. Defensive Strategies & Detection

Driver loads are the chokepoint. Sysmon Event ID 6 (Driver Loaded) records ImageLoaded, Hashes, Signed, Signature, and SignatureStatus — the fields that expose unsigned or anomalously signed drivers and known-vulnerable BYOVD payloads. Event ID 7045 (and System log 7036/7040) surface drivers registered as services. PatchGuard violations of _KPCR/IDT/SSDT raise bug check 0x00000109 (CRITICAL_STRUCTURE_CORRUPTION); HVCI/Code-Integrity blocks land in Microsoft-Windows-CodeIntegrity/Operational (Event IDs 3001–3089) and Security Event ID 5038.

A starting Sigma rule for vulnerable-driver loads:

title: Suspicious Vulnerable Driver Load (Possible BYOVD)
logsource:
  product: windows
  service: sysmon
detection:
  selection_unsigned:
    EventID: 6
    Signed: 'false'
  selection_known_vuln:
    EventID: 6
    ImageLoaded|endswith:
      - '\RTCore64.sys'
      - '\dbutil_2_3.sys'
  condition: selection_unsigned or selection_known_vuln
level: high

ISR/DPC behavior can be traced through the NT Kernel Logger ETW provider with interrupt and DPC flags enabled:

xperf -on Base+Interrupt+DPC
xperf -d trace.etl

Hardening layers: enforce Driver Signature Enforcement and HVCI (M1048) so unsigned or tampered drivers cannot load even on a compromised kernel; enable the Microsoft Vulnerable Driver Blocklist (HKLM\SYSTEM\CurrentControlSet\Control\CI\Config\VulnerableDriverBlocklistEnable); restrict SeLoadDriverPrivilege to administrators (M1026); and run suspect drivers under Driver Verifier in a VM to force IRQL checks. Monitor bcdedit test-signing changes and the CI\Config registry path for downgrade attempts.

MITRE ATT&CK Mapping

| Technique | MITRE ID | Detection |
| --- | --- | --- |
| Rootkit | T1014 | Sysmon EID 6 unsigned/anomalous drivers; HVCI logs |
| Create or Modify System Process: Windows Service | T1543.003 | EID 7045 / System 7036 driver-service install |
| Impair Defenses: Disable or Modify Tools | T1562.001 | EDR callback integrity, PatchGuard 0x109 |
| Impair Defenses: Downgrade Attack | T1562.010 | CI\Config registry + bcdedit test-signing audit |
| Exploitation for Privilege Escalation | T1068 | BYOVD load (EID 6) preceding kernel-write activity |
| Escape to Host | T1611 | Kernel-IRQL execution from container context |

13. Tools for IRQL Analysis

| Tool | Description | Link |
| --- | --- | --- |
| WinDbg | !irql, !pcr, !dpcs, !analyze -v on bug checks | microsoft.com |
| Driver Verifier | Forces IRQL/pool/deadlock checks on a target driver | microsoft.com |
| Sysmon | Driver-load (EID 6) telemetry; pair with Service Control Manager Event 7045 from the System log | microsoft.com |
| xperf / WPA | ETW interrupt and DPC tracing | microsoft.com |
| Process Hacker | Live driver and kernel-module enumeration | processhacker.sourceforge.io |
| Volatility | Memory-forensic driver and callback inspection | volatilityfoundation.org |
| Ghidra | Static analysis of suspect driver binaries | ghidra-sre.org |

Summary

  • IRQL is a per-processor priority register that gates which kernel routines code may legally call and which interrupts are masked.
  • The HAL maps hardware vectors onto 16 IRQLs on x64 and 32 on x86; higher value preempts lower, and raising/lowering must follow strict stack discipline.
  • Above APC_LEVEL the scheduler is disabled and paged memory is off-limits — touching it triggers IRQL_NOT_LESS_OR_EQUAL (0x0000000A).
  • Attackers reach kernel IRQL through BYOVD to unhook EDR, conceal rootkits, or bug-check the host as a DoS — mapped to T1014, T1543.003, T1562.001, and T1068.
  • Detect via Sysmon Event ID 6, the vulnerable-driver blocklist, HVCI/DSE enforcement, and SeLoadDriverPrivilege restriction.

System Calls and SSDT: How User Mode Reaches the Kernel

Objective: Understand how Windows user-mode code transitions to ring 0 via the SYSCALL instruction, how the System Service Descriptor Table (SSDT) dispatches those calls, and why SSDT hooking, direct syscalls, and modern kernel hardening (PatchGuard, HVCI, MWTI ETW) are central to both offensive tradecraft and defensive telemetry.


1. Why System Calls Exist

User-mode code runs at CPL 3 (ring 3). The kernel runs at CPL 0 (ring 0). Privileged operations — opening another process, mapping physical pages, accessing the file system, talking to drivers — require ring 0. The CPU enforces this with segment descriptors and page-table permissions; a direct CALL into kernel memory from user mode faults immediately.

The bridge is a controlled transition: the user-mode side specifies what it wants by number, the CPU switches to ring 0 at a fixed, kernel-controlled entry point, and the kernel validates and dispatches. That number is the System Service Number (SSN), and the dispatch table is the SSDT.

This design has two consequences that drive everything in this post:

  • The kernel entry point is fixed and well-known, so an attacker who can write to ring 0 memory (a kernel rootkit) can redirect every syscall by patching one table.
  • The user-mode side of the syscall (the stub in ntdll.dll) is not privileged, so an EDR can hook it — and a red teamer can bypass that hook by issuing the SYSCALL instruction themselves.

2. The Mechanics of SYSCALL on x64

SYSCALL is a dedicated x86-64 instruction designed for fast ring-3 → ring-0 transitions. It does not use the legacy interrupt gate (int 2Eh); it reads MSRs and jumps.

| MSR | Address | Role |
| --- | --- | --- |
| IA32_LSTAR | 0xC0000082 | Kernel RIP to jump to on SYSCALL from 64-bit user mode. Holds KiSystemCall64 (or KiSystemCall64Shadow with KPTI). |
| IA32_STAR | 0xC0000081 | Encodes the kernel and user CS/SS selectors for SYSCALL/SYSRET. |
| IA32_FMASK | 0xC0000084 | RFLAGS mask: bits cleared on entry (notably IF, masking interrupts during the prologue). |

The x64 Windows syscall ABI:

  • EAX holds the SSN (the index into KiServiceTable).
  • R10 holds the first argument. The user-mode stub copies RCX into R10 because SYSCALL itself clobbers RCX with the return RIP.
  • RDX, R8, R9, then stack — match the standard x64 calling convention for the remaining arguments.

A minimal user-mode stub, exactly as ntdll lays it out:

; NtFooBar — illustrative ntdll-style syscall stub (x64)
NtFooBar:
    mov   r10, rcx          ; SYSCALL clobbers RCX; preserve arg0 in R10
    mov   eax, 0x????       ; SSN — VERSION-SPECIFIC, resolve at runtime
    syscall                 ; ring-3 -> ring-0 via LSTAR
    ret                     ; SYSRET returns here

The 32-bit predecessor was SYSENTER (with entry stored in IA32_SYSENTER_EIP). On 64-bit Windows it plays no part in kernel entry: even Wow64 processes switch to 64-bit mode and issue SYSCALL (§7).


Flow diagram showing the sequence from user-mode code through the ntdll SYSCALL stub, CPU MSR-driven transition, KiSystemCall64 kernel entry point, SSDT dispatch, and final Nt* function execution
A single SYSCALL instruction bridges ring 3 and ring 0, with EAX carrying the SSN that indexes KiServiceTable for dispatch.

3. KiSystemCall64: The Kernel Entry Point

When the CPU executes SYSCALL from user mode:

  1. It loads RIP from IA32_LSTAR (→ KiSystemCall64).
  2. It loads CS/SS from IA32_STAR (kernel selectors).
  3. It saves the old user RIP in RCX and old RFLAGS in R11.
  4. It clears RFLAGS bits per IA32_FMASK.

KiSystemCall64 then:

  • Swaps GS via SWAPGS to access the per-CPU KPCR.
  • Switches from the user stack to the kernel stack stored in the KPCR.
  • Builds a KTRAP_FRAME capturing the user context.
  • Indexes KeServiceDescriptorTable (or the Shadow variant for Win32k GUI calls) using EAX.
  • Calls the resolved Nt* function.
  • On return, restores the frame and executes SYSRET to drop back to ring 3.

Selected KTRAP_FRAME fields (see WDK wdm.h for the full layout):

| Field | Description |
| --- | --- |
| Rip | Saved user-mode instruction pointer (from RCX at entry). |
| Rsp | Saved user-mode stack pointer. |
| EFlags | Saved RFLAGS (from R11). |
| ErrorCode | Processor error code; 0 for syscalls. |

With Kernel Page-Table Isolation (KPTI) active, IA32_LSTAR points instead at KiSystemCall64Shadow, a thin trampoline that swaps from the user CR3 (which maps only a minimal kernel trampoline) to the full kernel CR3 before falling through into the normal dispatcher. This is the Meltdown mitigation.


4. The SSDT and KSERVICE_TABLE_DESCRIPTOR

The “SSDT” in casual use refers to two related objects:

| Symbol | Description |
| --- | --- |
| KeServiceDescriptorTable | KSERVICE_TABLE_DESCRIPTOR covering the core Nt* services in ntoskrnl.exe. Exported on x86 only; on x64 it is not exported and is located via symbols or by scanning from KiSystemCall64. |
| KeServiceDescriptorTableShadow | Not exported. Adds a second entry for win32k!W32pServiceTable, the GUI/USER/GDI syscall surface. Rootkits historically located it by pattern scanning around KeAddSystemServiceTable or via debugger symbols. |
| KiServiceTable | The actual dispatch table referenced by the descriptor (function pointers on x86, encoded offsets on x64). |
| KiArgumentTable | Parallel array of argument byte counts per service. |

Approximate layout from public symbols:

typedef struct _KSERVICE_TABLE_DESCRIPTOR {
    PULONG_PTR ServiceTable;   // -> KiServiceTable (encoded offsets on x64)
    PULONG     CounterTable;   // call counters (typically NULL in retail)
    ULONG      TableSize;      // number of services
    PUCHAR     ArgumentTable;  // bytes of stack args per service
} KSERVICE_TABLE_DESCRIPTOR, *PKSERVICE_TABLE_DESCRIPTOR;

The SSN (EAX) is split: the low 12 bits index the table, and bit 12 selects which descriptor — 0 for KeServiceDescriptorTable, 1 for the Win32k shadow table. This is how GUI syscalls (NtUserCreateWindowEx, NtGdiBitBlt, …) coexist with kernel-proper syscalls in the same SSN space.
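
The bit split described above can be written out directly. This is a minimal sketch of the arithmetic only; split_ssn and ssn_parts are hypothetical names, and any SSN values passed in are arbitrary test inputs rather than real service numbers for a given build.

```c
#include <stdint.h>

typedef struct {
    uint32_t table;   /* 0 = core NT table, 1 = Win32k shadow table */
    uint32_t index;   /* position within the selected service table */
} ssn_parts;

/* Split the value carried in EAX: low 12 bits index the table,
   bit 12 selects which descriptor is used. */
static ssn_parts split_ssn(uint32_t ssn)
{
    ssn_parts p;
    p.index = ssn & 0xFFFu;
    p.table = (ssn >> 12) & 0x1u;
    return p;
}
```

An input of 0x1234 yields table 1 and index 0x234, which is how a GUI call and a core call with the same low bits end up in different tables.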


Hierarchy diagram showing KeServiceDescriptorTable splitting into the core NT KiServiceTable and the Win32k shadow table, with EAX bit 12 selecting the descriptor and low 12 bits indexing into it
EAX bit 12 routes GUI syscalls to the Win32k shadow table while bits 11–0 index the specific service within the selected descriptor.

5. The x64 Encoded-Offset Format

A critical detail anyone writing an SSDT scanner gets wrong the first time: on x64 Windows, KiServiceTable entries are not function pointers. Each entry is a 32-bit value encoding a signed offset from the base of KiServiceTable itself, with the low 4 bits used to communicate the argument-count category to the dispatcher.

The decode is:

// Recover the real Nt* function address from KiServiceTable[i]
ULONG_PTR DecodeSsdtEntry(PULONG ServiceTable, ULONG index)
{
    LONG  encoded = (LONG)ServiceTable[index];     // signed 32-bit
    LONG  offset  = encoded >> 4;                  // arithmetic shift
    return (ULONG_PTR)ServiceTable + offset;       // base + offset
}

The arithmetic right shift matters — it preserves the sign, allowing functions located before KiServiceTable in memory to be addressed. A naive unsigned >> 4 will silently miss those entries and produce a corrupt scanner.
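
The difference between the two shifts is easiest to see with a negative offset. The sketch below is a user-mode illustration with made-up addresses; encode_entry exists only to build test vectors, and all function names are hypothetical. The shift is written so that it does not rely on the compiler's treatment of right-shifting negative values.

```c
#include <stdint.h>

/* Arithmetic shift right by 4 (floor division by 16), written portably. */
static int32_t asr4(int32_t v)
{
    return v < 0 ? ~(~v >> 4) : v >> 4;
}

/* Correct decode: treat the raw entry as signed, then add to the base. */
static uint64_t decode_entry(uint64_t table_base, uint32_t raw)
{
    int32_t offset = asr4((int32_t)raw);
    return table_base + (uint64_t)(int64_t)offset;
}

/* Naive decode: a logical shift discards the sign. */
static uint64_t decode_entry_naive(uint64_t table_base, uint32_t raw)
{
    return table_base + (raw >> 4);
}

/* Inverse of the decode, for building test vectors only. */
static uint32_t encode_entry(int32_t offset, uint32_t arg_nibble)
{
    return ((uint32_t)offset << 4) | (arg_nibble & 0xFu);
}
```

For a function 0x200 bytes before the table, the encoded entry is 0xFFFFE003; the signed decode lands at base minus 0x200, while the naive version lands roughly 256 MB past the base.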


6. Tracing a Syscall End-to-End: NtOpenProcess

Following an OpenProcess call from a user-mode debugger target:

kernel32!OpenProcess
   └─> kernelbase!OpenProcess
        └─> ntdll!NtOpenProcess         ; the syscall stub
              mov  r10, rcx
              mov  eax, <SSN>           ; version-specific
              syscall
              ret
            ─────────── ring 3 / ring 0 boundary ───────────
            CPU: RIP <- LSTAR (KiSystemCall64[Shadow])
        nt!KiSystemCall64
          ├─ SWAPGS, switch to kernel stack
          ├─ build KTRAP_FRAME
          ├─ idx = EAX & 0xFFF
          ├─ desc = (EAX & 0x1000) ? Shadow : KeServiceDescriptorTable
          ├─ fn  = desc->ServiceTable + (desc->ServiceTable[idx] >> 4)
          └─ call nt!NtOpenProcess
                nt!NtOpenProcess
                  ├─ PsLookupProcessByProcessId (ClientId -> EPROCESS)
                  ├─ SeAccessCheck (DesiredAccess vs token)
                  └─ ObOpenObjectByPointer -> HANDLE
            SYSRET back to user-mode RIP saved in RCX

The SSN for NtOpenProcess changes between Windows builds; never hardcode it. Tooling either resolves it from the on-disk ntdll.dll, parses the in-memory stub, or consults a versioned table such as j00ru’s syscall reference.

A practical SSN extractor parses the Nt* export’s first instructions and reads the MOV EAX, imm32 (B8 xx xx xx xx) byte pattern:

# Parse SSNs from a clean on-disk ntdll.dll (illustrative)
import pefile, struct

pe = pefile.PE(r"C:\Windows\System32\ntdll.dll", fast_load=False)
pe.parse_data_directories()
image = pe.get_memory_mapped_image()

for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
    name = exp.name.decode() if exp.name else ""
    if not name.startswith("Nt"):
        continue
    stub = image[exp.address: exp.address + 24]
    # Classic stub: 4C 8B D1  B8 ss ss 00 00  F6 04 25 ...  0F 05  C3
    if stub[0:3] == b"\x4c\x8b\xd1" and stub[3] == 0xB8:
        ssn = struct.unpack("<I", stub[4:8])[0]
        print(f"{name:40s} SSN=0x{ssn:04x}")

Red-team loaders use the same idea at runtime — sometimes against a fresh copy of ntdll read from disk to defeat in-memory EDR hooks (the “Perun’s Fart” / fresh-copy pattern).
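
Defenders can apply the same byte pattern in the other direction: a stub that no longer starts with the expected prologue has probably been patched, whether by an EDR or by something else. The following is a minimal sketch assuming the classic stub layout quoted in the Python comment above; classify_stub and stub_state are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { STUB_CLEAN, STUB_PATCHED, STUB_TOO_SHORT } stub_state;

/* Expect mov r10, rcx (4C 8B D1) followed by mov eax, imm32 (B8 ..).
   On a match, *ssn receives the little-endian immediate. */
static stub_state classify_stub(const uint8_t *stub, size_t len, uint32_t *ssn)
{
    static const uint8_t prologue[] = { 0x4C, 0x8B, 0xD1, 0xB8 };

    if (len < 8)
        return STUB_TOO_SHORT;
    for (size_t i = 0; i < sizeof prologue; i++) {
        if (stub[i] != prologue[i])
            return STUB_PATCHED;
    }
    *ssn = (uint32_t)stub[4]
         | ((uint32_t)stub[5] << 8)
         | ((uint32_t)stub[6] << 16)
         | ((uint32_t)stub[7] << 24);
    return STUB_CLEAN;
}
```

A STUB_PATCHED result is a reason to look closer, not proof of tampering: exports that are not syscall stubs, or stub layouts that differ on other builds, fail the same check.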


7. Wow64 and Heaven’s Gate

A 32-bit process on 64-bit Windows still ultimately issues a 64-bit SYSCALL, because the only kernel entry the CPU honors from a 64-bit process is KiSystemCall64. The Wow64 layer bridges this:

32-bit app -> wow64cpu!CpupReturnFromSimulatedCode
           -> far jmp 0x33:<addr>          ; CS=0x23 (32-bit) -> CS=0x33 (64-bit)
           -> wow64.dll / 64-bit ntdll
           -> SYSCALL

The 0x33 / 0x23 CS selector switch is the so-called Heaven’s Gate (community label, not an official Microsoft term). Malware abuses it to:

  • Execute 64-bit shellcode from a process that defenders are monitoring as a 32-bit target.
  • Issue syscalls through the 64-bit ntdll, bypassing an EDR that only hooks the 32-bit ntdll inside the Wow64 process.

Analysts should treat any unexpected far jmp to CS=0x33 in 32-bit code as a strong IOC.
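
As a rough illustration of how an analyst might hunt for that transition, the sketch below scans a buffer of 32-bit code for the far jmp (opcode EA) and far call (opcode 9A) encodings whose segment selector is 0x33. The function name and the sample bytes are hypothetical, and a raw byte scan like this will produce false positives in data regions and will miss the push 0x33 / retf variant, so treat hits as leads rather than verdicts.

```python
import struct

FAR_JMP = 0xEA    # jmp  ptr16:32
FAR_CALL = 0x9A   # call ptr16:32
CS_64BIT = 0x33   # 64-bit code segment selector used by Wow64

def find_far_transfers(code: bytes, selector: int = CS_64BIT):
    """Return (offset, kind, target) for each far jmp/call using `selector`."""
    hits = []
    # Each encoding is 7 bytes: opcode, 32-bit offset, 16-bit selector.
    for off in range(len(code) - 6):
        op = code[off]
        if op not in (FAR_JMP, FAR_CALL):
            continue
        target, sel = struct.unpack_from("<IH", code, off + 1)
        if sel == selector:
            hits.append((off, "jmp" if op == FAR_JMP else "call", target))
    return hits

# Synthetic sample: two NOPs, a far jmp to 0x33:0x00401009, then RET.
sample = b"\x90\x90" + b"\xEA" + struct.pack("<IH", 0x00401009, 0x33) + b"\xC3"
for off, kind, target in find_far_transfers(sample):
    print(f"far {kind} at +0x{off:x} -> 0x33:0x{target:08x}")
```

In practice the scan would be restricted to executable regions outside wow64cpu.dll, since the legitimate transition lives there.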


8. SSDT Hooking: The Classic Rootkit Technique

Before PatchGuard (and on 32-bit Windows, which never had it), kernel rootkits manipulated KiServiceTable directly:

  1. Locate the descriptor (KeServiceDescriptorTable is exported on 32-bit Windows; on x64 it is not, and the Shadow descriptor was always pattern-scanned).
  2. Disable write protection (clear CR0.WP) or remap the page as writable.
  3. Save the original entry for the target SSN (e.g., NtQueryDirectoryFile, NtEnumerateValueKey).
  4. Overwrite the entry with a pointer to attacker code.
  5. The hook calls the original after filtering results — hiding files, registry keys, processes, or network connections.

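An illustrative read-only inspection (do not modify anything) from inside a signed test driver:

// KSERVICE_TABLE_DESCRIPTOR is not declared in the public WDK headers;
// this layout uses the field names referenced in this tutorial.
typedef struct _KSERVICE_TABLE_DESCRIPTOR {
    PULONG ServiceTable;    // KiServiceTable
    PULONG CounterTable;    // per-service call counters
    ULONG  TableSize;       // number of services
    PUCHAR ArgumentTable;   // KiArgumentTable
} KSERVICE_TABLE_DESCRIPTOR, *PKSERVICE_TABLE_DESCRIPTOR;

// The descriptor is not exported on x64, so the caller must locate it
// (for example by pattern scan) and pass it in.
VOID DumpSsdtSizeAndSample(PKSERVICE_TABLE_DESCRIPTOR d)
{
    PULONG table = (PULONG)d->ServiceTable;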

    DbgPrint("[SSDT] TableSize = %lu\n", d->TableSize);

    for (ULONG i = 0; i < 4 && i < d->TableSize; i++) {
        LONG      enc  = (LONG)table[i];
        ULONG_PTR addr = (ULONG_PTR)table + (enc >> 4);
        DbgPrint("[SSDT] [%lu] encoded=0x%08x -> 0x%p\n", i, enc, (PVOID)addr);
    }
}

// Reading LSTAR to confirm KiSystemCall64[Shadow]
VOID DumpLstar(VOID)
{
    ULONG64 lstar = __readmsr(0xC0000082);
    DbgPrint("[MSR] IA32_LSTAR = 0x%llx (KiSystemCall64[Shadow])\n", lstar);
}

Live inspection from WinDbg on a kernel-debugged target:

0: kd> dt nt!_KSERVICE_TABLE_DESCRIPTOR nt!KeServiceDescriptorTable
0: kd> dq  nt!KeServiceDescriptorTable L4
0: kd> dd  nt!KiServiceTable L20
0: kd> u   nt!KiServiceTable + (dwo(nt!KiServiceTable) >>> 4) L5   ; x64: entry 0 is an encoded offset, not a pointer
0: kd> rdmsr c0000082
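
To make the encoded-offset format concrete, here is a small sketch that decodes raw KiServiceTable dwords the same way the driver loop above does, and flags any decoded address that falls outside the kernel image. The table base, image range, and raw entries are made-up values for illustration; the low four bits are treated as the stack-argument count, which is the commonly described layout rather than a documented contract.

```python
import struct

def decode_ssdt_entry(table_base: int, raw: int):
    """Decode one x64 KiServiceTable entry into (routine_address, stack_args)."""
    # Entries are signed 32-bit values: reinterpret the raw dword as signed.
    (signed,) = struct.unpack("<i", struct.pack("<I", raw & 0xFFFFFFFF))
    offset = signed >> 4      # arithmetic shift preserves the sign
    stack_args = raw & 0xF    # low 4 bits: arguments passed on the stack
    return (table_base + offset) & 0xFFFFFFFFFFFFFFFF, stack_args

def outside_image(addr: int, image_base: int, image_size: int) -> bool:
    """True when addr does not fall inside [image_base, image_base + size)."""
    return not (image_base <= addr < image_base + image_size)

# Hypothetical values, not taken from any real build.
KI_SERVICE_TABLE = 0xFFFFF80312345000
NT_BASE, NT_SIZE = 0xFFFFF80312000000, 0x1000000

for raw in (0x02A1C604, 0xFFD12340):
    addr, nargs = decode_ssdt_entry(KI_SERVICE_TABLE, raw)
    flag = "SUSPICIOUS" if outside_image(addr, NT_BASE, NT_SIZE) else "ok"
    print(f"raw=0x{raw:08x} -> 0x{addr:016x} stack_args={nargs} {flag}")
```

An entry that decodes to an address outside ntoskrnl is the memory-forensics equivalent of what PatchGuard checks for at runtime.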

9. PatchGuard (KPP) and Why SSDT Hooking Died

Since the x64 editions of Windows XP and Windows Server 2003 SP1, Kernel Patch Protection periodically validates a set of protected structures, including KiServiceTable, IDT, GDT, MSR_LSTAR, kernel image code sections, and several driver objects. On mismatch, KPP issues bugcheck 0x109 (CRITICAL_STRUCTURE_CORRUPTION). The checks run from randomized timers and contexts to resist disablement.

The practical result:

  • SSDT hooking is no longer a viable persistence or hiding primitive on supported 64-bit Windows. Any survival window is short and ends in a BSOD.
  • Modern kernel-mode attackers use driver callbacks (PsSetCreateProcessNotifyRoutine, ObRegisterCallbacks, minifilters) rather than SSDT patching, because those are the supported extension points and are not policed by KPP.
  • With HVCI/Memory Integrity enabled, even loading the malicious driver is gated: kernel pages cannot be both writable and executable, and unsigned kernel code cannot enter ring 0 at all. The hypervisor enforces this at the EPT level — PatchGuard becomes a second line, not the first.

10. Direct and Indirect Syscalls (Modern Red Team TTPs)

Because KPP closed the kernel-side door, evasion moved into user mode. Many EDRs hook the Nt* stubs in ntdll.dll by overwriting the first bytes with a JMP into their inspection DLL. Two techniques bypass that:

  • Direct syscalls. The loader embeds its own mov eax, ssn; syscall; ret stub in attacker memory and calls it instead of ntdll!NtXxx. The hooked ntdll is never touched. SSNs are resolved at runtime: by parsing the ntdll stubs (“Hell’s Gate”), by reading neighbouring stubs when the target one is hooked (“Halo’s Gate”), or by sorting the Nt*/Zw* exports by address.
  • Indirect syscalls. The mov eax, ssn happens in attacker memory, but the syscall instruction itself is reached by jumping to the syscall byte sequence inside ntdll.dll. The kernel-side return address therefore points back into ntdll, matching what legitimate code looks like in stack-walk telemetry.
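
The sort-by-address approach mentioned above can be sketched in a few lines. It assumes, as public tooling does, that the Zw* stubs in x64 ntdll are laid out in ascending SSN order; that is an observed property rather than a documented guarantee. The export list here is synthetic, and resolve_ssns is a hypothetical helper name.

```python
def resolve_ssns(exports):
    """Map Nt* names to inferred SSNs by ranking Zw* exports by address.

    `exports` is an iterable of (name, rva) pairs, e.g. from a PE export table.
    """
    zw = sorted((rva, name) for name, rva in exports if name.startswith("Zw"))
    # The n-th Zw* stub by address is assumed to carry SSN n.
    return {"Nt" + name[2:]: ssn for ssn, (rva, name) in enumerate(zw)}

# Synthetic export list for illustration; real RVAs come from ntdll.dll.
exports = [
    ("ZwOpenProcess", 0x9D5F0),
    ("ZwClose", 0x9D430),
    ("RtlInitUnicodeString", 0x12340),
    ("ZwReadFile", 0x9D310),
]
for name, ssn in resolve_ssns(exports).items():
    print(f"{name:20s} inferred SSN = {ssn}")
```

Because this never reads the stub bytes, it is unaffected by inline hooks, which is exactly why defenders cannot rely on stub patching alone.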

The detection signal flips between the two:

Technique | What it bypasses | What still sees it
Direct syscall | ntdll user-mode hooks | Stack walk shows syscall from unbacked / private memory.
Indirect syscall | ntdll hooks and naive stack-walk checks | Kernel ETW (Microsoft-Windows-Threat-Intelligence) logs the resulting kernel operation regardless of where the syscall was issued from.

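ETW-TI is the answer to indirect syscalls: its events are logged from inside the kernel service routines themselves, after the SYSCALL has already landed in KiSystemCall64 and been dispatched, so how the transition was staged in user mode is irrelevant. The trade-off is coverage: it reports specific sensitive operations (memory allocation, protection changes, remote reads and writes), not every syscall.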


Graph diagram contrasting direct and indirect syscall evasion paths against EDR user-mode hooks, Sysmon CallTrace detection, and kernel-level ETW-TI telemetry firing after the syscall transition
Direct syscalls skip ntdll entirely while indirect syscalls camouflage the return address; ETW-TI catches both because it fires inside the kernel after the ring transition.

11. Common Attacker Techniques

Technique | Description
SSDT hook (legacy) | Overwrite KiServiceTable[SSN] to filter results for hiding rootkit artifacts; killed by PatchGuard on x64.
Shadow SSDT hook | Same against W32pServiceTable to intercept GUI/keyboard/clipboard syscalls.
Direct syscall stub | Embedded mov eax, ssn; syscall in attacker memory to bypass ntdll hooks.
Indirect syscall | Jump to the syscall gadget inside ntdll so call stacks look legitimate.
Hell’s Gate / Halo’s Gate | Runtime SSN resolution by parsing/sorting Nt* exports in mapped ntdll.
Fresh-copy ntdll | Read clean ntdll.dll from disk to re-derive unhooked stubs and SSNs.
Heaven’s Gate | Far jump from 32-bit (CS=0x23) to 64-bit (CS=0x33) to execute 64-bit syscalls from a Wow64 process.
Driver-based hooking | Where HVCI is off, signed-but-vulnerable drivers (“BYOVD”) are used to write to MSRs or protected pages.

12. Defensive Strategies & Detection

The detection model has shifted from “watch the SSDT” (PatchGuard already does that) to “watch how syscalls are issued from user mode, and consume kernel ETW.”

Sysmon

Event ID | Field | Why it matters
1 | ParentImage, CommandLine | Baseline; correlates injection target lineage.
10 | GrantedAccess, CallTrace | The CallTrace field is the primary direct-syscall tell: legitimate stacks contain ntdll.dll; direct syscalls show UNKNOWN(...) frames from unbacked (often RWX private) memory.
25 | Type | Process image tampering / hollowing.
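
The CallTrace check can be prototyped offline before committing it to a rule. The sketch below splits a Sysmon EID 10 CallTrace string on its | separators, reports frames that Sysmon could not attribute to a module, and tests whether the innermost frame sits in ntdll.dll. The two trace strings are synthetic examples written in the usual module+offset format, and both helper names are hypothetical.

```python
def unbacked_frames(call_trace: str):
    """Return CallTrace frames that do not resolve to a module on disk."""
    frames = [f for f in call_trace.split("|") if f]
    return [f for f in frames if f.upper().startswith("UNKNOWN(")]

def first_frame_is_ntdll(call_trace: str) -> bool:
    """True when the innermost frame (where the syscall was issued) is ntdll."""
    first = call_trace.split("|", 1)[0].lower()
    return first.rsplit("\\", 1)[-1].startswith("ntdll.dll+")

# Synthetic traces for illustration only.
legit = (r"C:\Windows\SYSTEM32\ntdll.dll+9d4c4"
         r"|C:\Windows\System32\KERNELBASE.dll+2c13e"
         r"|C:\Tools\app.exe+1a2b")
direct = (r"UNKNOWN(000001F2A3B40012)"
          r"|C:\Windows\System32\KERNELBASE.dll+2c13e")

for label, trace in (("legit", legit), ("direct", direct)):
    print(label, first_frame_is_ntdll(trace), unbacked_frames(trace))
```

An indirect syscall would pass the first-frame test, so the unbacked-frame list further up the stack is the more durable signal of the two.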

Sigma — direct-syscall NtOpenProcess against LSASS

title: Process Access to LSASS via Direct Syscall (Unbacked Call Stack)
id: 8d0c2a4e-syscall-lsass-unbacked
status: experimental
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10
    TargetImage|endswith: '\lsass.exe'
    GrantedAccess:
      - '0x1010'
      - '0x1410'
      - '0x1fffff'
  unbacked:
    CallTrace|contains:
      - 'UNKNOWN'
      - 'UNKNOWN('
  filter_legit:
    SourceImage|endswith:
      - '\MsMpEng.exe'
      - '\MsSense.exe'
  condition: selection and unbacked and not filter_legit
level: high
tags:
  - attack.credential_access
  - attack.t1003.001
  - attack.t1106

ETW Providers Worth Subscribing To

Provider | Use
Microsoft-Windows-Threat-Intelligence | Kernel ETW provider exposing AllocVm, ProtectVm, MapViewOfSection, ReadVm/WriteVm events. Logged from inside the kernel service routines, so operations issued through direct and indirect syscalls are still visible. Consumer must run as a protected process light (PPL) at the antimalware signer level.
Microsoft-Windows-Kernel-Process | Process and thread creation, image loads.
Microsoft-Windows-Kernel-Audit-API-Calls | Audits selected Nt API calls (verify against current SDK).

Audit Policy

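  • Audit Sensitive Privilege Use: logs use of SeDebugPrivilege (Event ID 4673), a near-universal precursor to syscall-based cross-process injection. Enabling the privilege is recorded separately as Event ID 4703 under Audit Token Right Adjusted.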
  • Audit Process Creation with command-line capture.
  • Audit Handle Manipulation with object SACLs on lsass.exe.

Hardening

  • HVCI / Memory Integrity — single highest-value control. Blocks unsigned and W^X-violating kernel code; defeats BYOVD primitives that try to disable PatchGuard, patch the SSDT, or clear CR0.WP.
  • VBS + Credential Guard — keeps LSASS secrets off the path even if a syscall reaches NtOpenProcess.
  • KVA Shadow (the Windows counterpart to Linux KPTI): Meltdown mitigation; when it is active, KiSystemCall64Shadow is the LSTAR target.
  • Driver Signature Enforcement + Microsoft vulnerable-driver blocklist — limits BYOVD options.
  • EDR ntdll instrumentation — still valuable as a low-cost filter against commodity malware; layer with kernel ETW for the sophisticated cases.

13. Tools for Syscall and SSDT Analysis

Tool | Description | Link
WinDbg | Kernel debugger; resolves nt!KeServiceDescriptorTable, nt!KiServiceTable, reads MSRs via rdmsr. | learn.microsoft.com
Process Hacker | Live handle, thread, and module inspection; surfaces RWX private memory regions. | processhacker.sourceforge.io
Process Monitor | Boot-time and runtime Nt* activity captured via minifilter. | learn.microsoft.com
SysmonView / Sysmon | EID 10 CallTrace, EID 25 telemetry. | learn.microsoft.com
HollowsHunter / pe-sieve | Detects unbacked / hollowed / patched modules; strong correlator for direct-syscall loaders. | github.com/hasherezade
SwishDbgExt | WinDbg extension with SSDT dumping and decode of the encoded-offset format. | github.com
Volatility 3 | Memory forensics; windows.ssdt plugin walks the descriptor and decodes entries. | volatilityfoundation.org
j00ru syscall tables | Authoritative per-version SSN reference. | j00ru.vexillium.org
SilkETW / SealighterTI | User-friendly consumers for ETW providers including Microsoft-Windows-Threat-Intelligence. | github.com

14. MITRE ATT&CK Mapping

Technique | MITRE ID | Detection
Native API | T1106 | EID 10 CallTrace containing UNKNOWN; ETW-TI AllocVm/ProtectVm from unbacked memory.
Process Injection | T1055 | Cross-process NtAllocateVirtualMemory + NtWriteVirtualMemory + NtCreateThreadEx chain via ETW-TI.
DLL Injection | T1055.001 | EID 7/8 plus ETW-TI write/protect events into a remote PID.
PE Injection | T1055.002 | RWX private allocations followed by remote thread creation.
Process Hollowing | T1055.012 | NtUnmapViewOfSection followed by NtWriteVirtualMemory into the primary image base.
Rootkit | T1014 | PatchGuard 0x109 bugchecks; SSDT integrity scans in memory forensics.
Impair Defenses: Disable/Modify Tools | T1562.001 | Driver loads with revoked or vulnerable signatures; HVCI/DSE violations.

Summary

  • Every Windows syscall is a SYSCALL instruction that lands at KiSystemCall64 via MSR_LSTAR and is dispatched through KiServiceTable using the EAX SSN.
  • The SSDT on x64 stores encoded offsets, not raw pointers (base + (entry >> 4)), and bit 12 of EAX selects between the core and Win32k Shadow tables.
  • PatchGuard killed SSDT hooking on x64; modern offense has moved to direct and indirect syscalls in user mode and to BYOVD when ring 0 is required.
  • HVCI/VBS is the strongest defense against the kernel half; kernel ETW (Microsoft-Windows-Threat-Intelligence) is the strongest defense against direct/indirect syscalls because it fires after the transition.
  • Detect with Sysmon EID 10 CallTrace (unbacked memory in the stack), enrich with ETW-TI, and map to MITRE T1106 / T1055 for response.

HAL and Ntoskrnl: The Kernel Core Components

Objective: Understand the architecture and division of labor between hal.dll (the Hardware Abstraction Layer) and ntoskrnl.exe (the NT kernel and Executive), how they are loaded during boot, the structures and routines each exposes, and how defenders inspect, detect tampering against, and harden these Ring 0 core components.


1. HAL and Ntoskrnl Overview

Two binaries sit at the bottom of Windows kernel mode and everything else builds on them. ntoskrnl.exe is the NT kernel plus the Executive — the policy and service layer of the OS. hal.dll is the Hardware Abstraction Layer — a thin platform shim that hides interrupt controllers, bus topology, timers, and DMA behind a uniform interface so the rest of the kernel stays hardware-independent.

Binary | Full name | Loaded by | Ring
ntoskrnl.exe | NT OS Kernel + Executive | winload.efi | Ring 0
hal.dll | Hardware Abstraction Layer | winload.efi | Ring 0

Both reside in %SystemRoot%\System32\. On multiprocessor systems the SMP-aware image ntkrnlmp.exe is selected by the loader and presented as ntoskrnl.exe; modern Windows 10/11 ships only the SMP variant. Verify image identity and signature on a live host with sigcheck, dumpbin /headers, or the WinDbg lm command. The separation exists for portability (HAL absorbs platform differences) and layering (the kernel implements scheduling and policy, not chipset quirks).


2. Boot Handoff: From Bootloader to KiSystemStartup

winload.efi loads ntoskrnl.exe and hal.dll into memory, then transfers control to the kernel entry point KiSystemStartup, passing a pointer to a LOADER_PARAMETER_BLOCK. That structure carries the memory descriptor list, the ARC hardware tree, NLS data, and other boot-time state the kernel needs before it can manage its own memory.

winload.efi
  └─ loads ntoskrnl.exe + hal.dll
       └─ ntoskrnl!KiSystemStartup(PLOADER_PARAMETER_BLOCK)
            ├─ HalInitializeProcessor()    ; HAL brings up per-CPU hardware
            ├─ KiInitializeKernel()        ; KPCR/KPRCB, IDT, GDT
            ├─ Executive phase init:
            │    Mm/Ob/Se/Io/Cm/Ps InitSystem()
            └─ PsInitialSystemProcess      ; EPROCESS pointer set when Ps init creates the System process (PID 4)
                 └─ Phase 1: smss.exe launched

HAL initializes the processor before the Executive runs a single line of policy code. Secure Boot anchors the chain in firmware: UEFI verifies the Windows boot manager, the boot manager verifies winload.efi, and winload.efi verifies the signatures of ntoskrnl.exe and hal.dll before loading them, so tampering with either binary on disk breaks the boot chain on a properly configured machine.


Boot sequence flow diagram showing UEFI firmware validating winload.efi which loads hal.dll and ntoskrnl.exe passing a LOADER_PARAMETER_BLOCK before the Executive initializes
Secure Boot validates each link in the chain; winload.efi loads both HAL and the kernel before handing off control to KiSystemStartup.

3. The HAL: Abstracting the Hardware

The HAL translates abstract requests into platform-specific operations: programming the APIC, translating bus-relative addresses, allocating DMA-coherent buffers, and calibrating the stall timer. Drivers and the kernel call HAL routines instead of touching hardware registers directly.

Routine | Purpose
HalGetInterruptVector | Translate a bus IRQ to a system interrupt vector and required IRQL
HalTranslateBusAddress | Convert a bus-relative address to a logical address
HalAllocateCommonBuffer | Allocate DMA-coherent memory visible to CPU and device
KeStallExecutionProcessor | Calibrated busy-wait (HAL-implemented on most platforms)
HalRequestSoftwareInterrupt | Request a software interrupt at a given IRQL to trigger DPC delivery

On modern ACPI systems the HAL is far thinner than in the NT 4 era. Many classic Hal* exports such as HalGetInterruptVector are deprecated; the PnP/ACPI stack and IoConnectInterruptEx now handle interrupt wiring. Since Windows 8, HAL Extensions (halextpcat.dll, halextintc.dll, and similar PE images loaded by HAL itself) carry SoC- and OEM-specific code without replacing the whole HAL.


4. IRQL: The Kernel’s Preemption Ladder

Interrupt Request Level (IRQL) is the central arbitration mechanism shared by HAL and the kernel. The HAL programs the interrupt controller to enforce IRQL in hardware; running at an IRQL masks all interrupts at or below that level on the current CPU.

IRQL (x64) | Symbolic name | Used for
0 | PASSIVE_LEVEL | Normal thread execution
1 | APC_LEVEL | APC delivery; paging allowed
2 | DISPATCH_LEVEL | Scheduler, spin locks; no paging, no blocking
3–12 | Device IRQLs | Hardware ISRs
13 | CLOCK_LEVEL | Clock interrupt
14 | IPI_LEVEL / POWER_LEVEL | Inter-processor interrupts, power failure
15 | HIGH_LEVEL (also PROFILE_LEVEL) | Profiling interrupt, NMI, machine check

The cardinal rule: at DISPATCH_LEVEL or above you may not touch pageable memory or block, because the scheduler and page fault handler cannot run. A driver that dereferences paged-out memory at elevated IRQL produces the classic IRQL_NOT_LESS_OR_EQUAL bug check. Query the current level with KeGetCurrentIrql(). IRQL numeric values are architecture-specific; the table above is the canonical x64 mapping.
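
The two rules in this section, masking and the paging restriction, reduce to a pair of comparisons. The toy model below encodes them so the behaviour at each level can be checked quickly; it is a teaching sketch with hypothetical function names, not a representation of how the kernel stores or changes IRQL.

```python
PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL, HIGH_LEVEL = 0, 1, 2, 15

def is_masked(current_irql: int, interrupt_irql: int) -> bool:
    """An interrupt stays pending while its IRQL is <= the CPU's current IRQL."""
    return interrupt_irql <= current_irql

def may_touch_pageable(current_irql: int) -> bool:
    """Page faults can only be serviced below DISPATCH_LEVEL."""
    return current_irql < DISPATCH_LEVEL

# Walk a few levels and show what a device interrupt at IRQL 5 would do.
for irql in (PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL, 7):
    paging = "pageable ok" if may_touch_pageable(irql) else "nonpaged only"
    print(f"IRQL {irql:2d}: {paging}; device IRQL 5 masked: {is_masked(irql, 5)}")
```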


Hierarchy diagram of Windows x64 IRQL levels from PASSIVE at 0 up through APC, DISPATCH, CLOCK, IPI, POWER to HIGH at 15 showing preemption priority
Running at DISPATCH_LEVEL or above masks the scheduler and page-fault handler; touching pageable memory that is not resident at this level triggers an IRQL_NOT_LESS_OR_EQUAL bug check.

5. The Kernel Layer (Ke): Scheduling and Synchronization

The Ke layer sits directly above HAL and implements thread scheduling, interrupt and exception dispatch, and the low-level synchronization primitives the rest of the system depends on.

Routine | What it does
KeInitializeSpinLock | Initialize a spin-lock object
KeAcquireSpinLock | Raise IRQL to DISPATCH_LEVEL and acquire the lock
KeReleaseSpinLock | Release the lock and restore the saved IRQL
KeInsertQueueDpc | Queue a Deferred Procedure Call
KeWaitForSingleObject | Wait on a dispatcher object (event, mutex, timer, thread)
KeSetEvent | Set a kernel event to the signaled state

Dispatcher objects — events, mutexes, semaphores, timers, threads — share a common DISPATCHER_HEADER carrying Type, SignalState, and WaitListHead. The wait machinery keys off that header. The synchronization pattern below runs at PASSIVE_LEVEL, where blocking is legal:

KEVENT readyEvent;
KeInitializeEvent(&readyEvent, NotificationEvent, FALSE);

// ... another thread eventually calls KeSetEvent(&readyEvent, IO_NO_INCREMENT, FALSE);

NTSTATUS status = KeWaitForSingleObject(
    &readyEvent,        // dispatcher object
    Executive,          // wait reason
    KernelMode,         // processor mode
    FALSE,              // non-alertable
    NULL);              // no timeout

Per-CPU scheduler state lives in the KPCR (Kernel Processor Control Region), reachable via gs:[0] on x64, with an embedded KPRCB holding CurrentThread, NextThread, IdleThread, and the DPC queue.


6. The Executive Layer (Ex and Friends)

The Executive comprises the higher-level managers, each identified by a two-letter prefix. They build on Ke primitives and HAL services.

Manager | Prefix | Responsibilities
Object Manager | Ob | Object lifecycle, handles, reference counting
Process/Thread Manager | Ps | EPROCESS/ETHREAD creation and teardown
Memory Manager | Mm | VAD trees, PTEs, page faults, pool
I/O Manager | Io | IRP lifecycle, driver loading
Security Reference Monitor | Se | Access checks, tokens, privileges
Configuration Manager | Cm | Registry hive management
Executive Support | Ex | Pool allocation, lookaside lists, callbacks

Correct pool usage on modern Windows uses ExAllocatePool2, the successor to ExAllocatePoolWithTag (which is deprecated as of Windows 10 version 2004, build 19041), paired with ExFreePoolWithTag:

// Allocate non-paged pool with a 4-byte tag (read in WinDbg as 'XgAT').
PVOID buffer = ExAllocatePool2(POOL_FLAG_NON_PAGED, 0x1000, 'TAgX');
if (buffer != NULL) {
    // ... use buffer at IRQL <= DISPATCH_LEVEL ...
    ExFreePoolWithTag(buffer, 'TAgX');
}
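
The reversed tag in the comment above follows from byte order, and a short worked example makes it easy to predict what to search for in a debugger. This sketch assumes the MSVC convention of packing a four-character constant with its first character in the most significant byte; tag_as_stored is a hypothetical helper name.

```python
import struct

def tag_as_stored(literal: str) -> str:
    """Show how a 4-character C multi-char constant appears in memory."""
    # 'TAgX' becomes an integer with 'T' in the most significant byte.
    value = int.from_bytes(literal.encode("ascii"), "big")
    # x64 is little-endian, so the bytes land in memory in reverse order.
    return struct.pack("<I", value).decode("ascii")

print(tag_as_stored("TAgX"))   # prints XgAT
```

This is why drivers conventionally write their tags backwards in source: the stored form is the one that pool-tag tooling displays.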

The Object Manager exposes ObReferenceObjectByHandle to convert a handle into a referenced kernel object pointer — the gateway every component crosses when validating access.


7. Key Kernel Structures

A handful of structures are the backbone of process, thread, and CPU state. Defenders and rootkit authors alike walk these every day.

Structure | Key fields
EPROCESS | UniqueProcessId, ActiveProcessLinks, Token, VadRoot, Peb, ImageFileName[15], ThreadListHead
ETHREAD | Cid (CLIENT_ID), ThreadListEntry, Win32StartAddress, embedded KTHREAD
KTHREAD | Header (DISPATCHER_HEADER), KernelStack, State, WaitIrql, Teb
KPCR | Per-CPU; IRQL, IDT/GDT pointers, pointer to KPRCB
KPRCB | CurrentThread, NextThread, IdleThread, DPC queue
KDPC | DeferredRoutine, DeferredContext, DpcListEntry

ActiveProcessLinks is a doubly linked LIST_ENTRY chaining every EPROCESS. The Task Manager view of “all processes” is, at bottom, a walk of this list. That makes it a prime DKOM target: unlinking an EPROCESS hides the process from list-based enumeration while it continues to run and be scheduled — covered in Section 10.
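
The effect of unlinking is easy to see in a toy model. The sketch below builds a circular doubly linked list in the style of LIST_ENTRY, splices one node out the way a DKOM rootkit would, and compares a list walk against an independent inventory of the objects. The class and function names are hypothetical and nothing here touches real kernel memory.

```python
class EProcess:
    """Toy stand-in for EPROCESS with ActiveProcessLinks-style pointers."""
    def __init__(self, pid, name):
        self.pid, self.name = pid, name
        self.flink = self.blink = self   # self-referencing, like an empty LIST_ENTRY

def insert_tail(head, node):
    """Append node just before head, as InsertTailList does."""
    node.blink, node.flink = head.blink, head
    head.blink.flink = node
    head.blink = node

def unlink(node):
    """Splice the neighbours together; the object itself stays alive."""
    node.blink.flink, node.flink.blink = node.flink, node.blink
    node.flink = node.blink = node

def walk(head):
    """Enumerate PIDs by following flink until the head is reached again."""
    cur, pids = head.flink, []
    while cur is not head:
        pids.append(cur.pid)
        cur = cur.flink
    return pids

head = EProcess(None, "<list head>")
procs = [EProcess(4, "System"), EProcess(612, "lsass.exe"), EProcess(4242, "implant.exe")]
for p in procs:
    insert_tail(head, p)

unlink(procs[2])
print("list walk  :", walk(head))
print("object scan:", [p.pid for p in procs])
```

The list walk no longer reports PID 4242 while the object inventory still does, which is the same discrepancy a pool scan exposes on a real memory image.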


8. The SSDT and System Call Dispatch

A user-mode SYSCALL instruction transfers Ring 3 → Ring 0 and lands in ntoskrnl!KiSystemCall64. The dispatcher indexes the System Service Dispatch Table via KeServiceDescriptorTable, which points at KiServiceTable (an array of service routine offsets) and KiArgumentTable (argument byte counts). GUI calls into win32k.sys route through the shadow table KeServiceDescriptorTableShadow.

Patching KiServiceTable so a service index points at attacker code is the classic SSDT hook, historically used by rootkits to intercept NtQuerySystemInformation, NtOpenProcess, and similar. On x64 this is exactly the kind of structure modification PatchGuard validates, so SSDT hooking is loud and largely obsolete on modern systems — but understanding the dispatch path is essential for reading both live disassembly and integrity-check telemetry.
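
When reading a disassembled stub it helps to split the value loaded into EAX into its two parts. The sketch below assumes the x64 layout in which bits 0 to 11 index the table and bit 12 selects the shadow (win32k) table; the sample values are illustrative and not tied to any Windows build.

```python
def split_service_number(eax: int):
    """Split the value in EAX into (table, index)."""
    index = eax & 0xFFF   # bits 0-11: index into the selected table
    table = "shadow (win32k)" if eax & 0x1000 else "core (KiServiceTable)"
    return table, index

for eax in (0x0026, 0x1005):   # illustrative values only
    table, index = split_service_number(eax)
    print(f"EAX=0x{eax:04x} -> {table}, index {index}")
```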


Flow diagram of the Windows system call dispatch path from user-mode SYSCALL instruction through KiSystemCall64 and KeServiceDescriptorTable to the target Nt service routine
The SYSCALL instruction transfers execution to KiSystemCall64, which uses the service index to look up the target routine in KiServiceTable — the structure SSDT hooks manipulate and PatchGuard protects.

9. Live Analysis with WinDbg and Volatility

Load Microsoft symbols and the entire layout becomes navigable. List the core modules and dump structures directly:

0: kd> lm m nt              ; ntoskrnl base, range, symbols
0: kd> lm m hal             ; hal.dll base and range
0: kd> dt nt!_EPROCESS      ; full EPROCESS field layout
0: kd> !process 0 0         ; enumerate processes via ActiveProcessLinks
0: kd> !pcr 0               ; KPCR for CPU 0
0: kd> !prcb 0              ; KPRCB: CurrentThread / IdleThread
0: kd> dps nt!KeServiceDescriptorTable   ; SSDT pointer + service count
0: kd> !idt                 ; IDT vectors (HAL-programmed interrupt routing)

For dead-box memory forensics, Volatility 3 reconstructs the same view from a dump and is the natural cross-check against a possibly compromised live host:

# Enumerate processes and loaded kernel modules from a memory image.
vol -f memory.dmp windows.pslist
vol -f memory.dmp windows.modules

# psscan walks pool tags instead of ActiveProcessLinks; a process that
# appears in psscan but NOT in pslist is a candidate DKOM-unlinked process.
vol -f memory.dmp windows.psscan

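A delta between windows.pslist (list-based) and windows.psscan (pool-scan-based) is a strong indicator of ActiveProcessLinks tampering, once terminated processes (which psscan still finds in freed pool and which carry an ExitTime) are excluded.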
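
The comparison itself is a set difference with one filter. The sketch below assumes each plugin's output has already been loaded into a list of dictionaries, and that the rows carry PID, ImageFileName, and ExitTime keys; those key names and the sample rows are assumptions to adapt to whatever export format you use.

```python
def hidden_candidates(pslist_rows, psscan_rows):
    """Rows seen by the pool scan but absent from the linked-list walk."""
    listed = {row["PID"] for row in pslist_rows}
    return [
        row for row in psscan_rows
        # Skip exited processes: psscan finds them, pslist correctly does not.
        if row["PID"] not in listed and not row.get("ExitTime")
    ]

# Synthetic rows for illustration.
pslist = [
    {"PID": 4, "ImageFileName": "System"},
    {"PID": 612, "ImageFileName": "lsass.exe"},
]
psscan = pslist + [
    {"PID": 4242, "ImageFileName": "implant.exe", "ExitTime": None},
    {"PID": 3100, "ImageFileName": "notepad.exe", "ExitTime": "2024-01-01 10:00:00"},
]
for row in hidden_candidates(pslist, psscan):
    print(row["PID"], row["ImageFileName"])
```

Anything this returns still needs manual review, since a process caught mid-creation or mid-teardown can also appear in only one of the two views.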


10. Common Attacker Techniques

Kernel-core abuse turns on either modifying ntoskrnl structures from a loaded driver or exploiting a vulnerability to reach Ring 0 in the first place.

Technique | Description
SSDT hooking | Patch KiServiceTable entries to intercept syscalls
DKOM unlinking | Splice an EPROCESS out of ActiveProcessLinks to hide a process
Kernel callback removal | Strip PsSetCreateProcessNotifyRoutine entries to blind EDR
BYOVD | Load a vulnerable signed driver to gain a Ring 0 primitive
Kernel exploitation | Abuse an ntoskrnl/HAL bug to escalate Ring 3 → Ring 0
In-memory image patch | Patch ntoskrnl.exe code pages at runtime

A malicious driver is still loaded through the documented path — a Services registry key of Type = 1 followed by a load — which is exactly where detection begins. Bring-Your-Own-Vulnerable-Driver remains popular precisely because it sidesteps the need to find a fresh kernel bug.


Graph diagram showing attacker path from BYOVD through Ring 0 code execution branching into DKOM process unlinking, SSDT hooking, and callback removal all leading to hidden process or driver impact
BYOVD is the most common Ring 0 entry point; once there, attackers choose between DKOM, SSDT hooks, or callback removal to achieve persistence and evasion.

11. Defensive Strategies & Detection

Detection centers on driver loads, integrity events, and kernel structure cross-checks.

Sysmon Event ID | Name | Relevance
6 | Driver Loaded | Kernel driver load with Signed, Hashes, Signature fields
7 | Image Loaded | Module loads in unusual contexts
13 | Registry Value Set | New Services driver entries

Pair Sysmon with Windows event sources: System Event ID 7045 (a new service was installed, kernel-mode drivers included), Security Event ID 5038 (image hash invalid, i.e. a code-integrity failure), and Event ID 6281 (page hash mismatch). The Microsoft-Windows-Kernel-Memory ETW provider surfaces pool allocations useful for hunting pool-based implants.

title: Suspicious Unsigned Kernel Driver Load
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 6
    Signed: 'false'
  filter_legit:
    ImageLoaded|startswith:
      - 'C:\Windows\System32\drivers\'
      - 'C:\Windows\System32\DriverStore\'
  condition: selection and not filter_legit
level: high

Mechanism | Description
PatchGuard (KPP) | Validates SSDT, IDT, GDT, KPCR, and kernel code; bug check 0x109 on tampering
Driver Signature Enforcement | ci.dll requires Authenticode-signed drivers
HVCI | VTL1 enforces signed Ring 0 code; blunts BYOVD and runtime patching
Secure Boot | Anchors the boot manager → winload → ntoskrnl/hal verification chain in firmware

Operational hardening: enable HVCI (Core Isolation → Memory Integrity), confirm Secure Boot in msinfo32, audit SeLoadDriverPrivilege use, deploy the Microsoft Vulnerable Driver Blocklist (DriverSiPolicy.p7b), monitor HKLM\SYSTEM\CurrentControlSet\Services\ for new Type = 1 entries, and baseline loaded-module hashes against periodic WinPmem/Volatility snapshots.


12. MITRE ATT&CK Mapping

Technique | MITRE ID | Detection
Rootkit | T1014 | Volatility pslist/psscan delta; PatchGuard bug check 0x109
Kernel Modules and Extensions | T1547.006 | Sysmon EID 6; Event ID 7045; Services key writes
Exploitation for Privilege Escalation | T1068 | Crash telemetry, anomalous Ring 0 transitions
Impair Defenses | T1562.001 | Missing kernel callbacks; EDR self-protection alerts
Process Injection | T1055 | Kernel KeStackAttachProcess/MmCopyVirtualMemory use
Modify System Image | T1601.001 | Code integrity Event ID 5038/6281; PatchGuard

13. Tools for Kernel Analysis

Tool | Description | Link
WinDbg | Live and dump kernel debugging, structure walks | microsoft.com
Volatility 3 | Memory forensics, pslist/psscan/modules | volatilityfoundation.org
WinPmem | Live memory acquisition | github.com
Process Hacker | Driver and handle inspection | processhacker.sourceforge.io
Sysmon | Driver-load and registry telemetry | sysinternals.com
sigcheck | Image signature and hash verification | sysinternals.com
Ghidra | Static analysis of drivers and ntoskrnl | ghidra-sre.org

14. Summary

  • HAL and ntoskrnl are the two Ring 0 binaries every other Windows component is built on — HAL abstracts hardware, ntoskrnl implements the kernel and Executive policy layers.
  • The kernel layer (Ke) supplies scheduling and synchronization; the Executive (Ob, Ps, Mm, Io, Se, Cm, Ex) builds managers on top, all arbitrated by IRQL that the HAL enforces in hardware.
  • Core structures — EPROCESS, ETHREAD, KPCR, the SSDT — are the backbone of process and CPU state and the prime targets for SSDT hooks, DKOM unlinking, and callback removal.
  • Detect kernel tampering via Sysmon Event ID 6, Event IDs 7045/5038/6281, and Volatility pslist-vs-psscan deltas; prevent it with HVCI, DSE, Secure Boot, and the vulnerable-driver blocklist.
