Access Tokens and Privileges: The Kernel’s Security Context
Run whoami /priv on an admin shell. You’ll see a column labeled State, and most of the entries — including SeDebugPrivilege and SeImpersonatePrivilege — read Disabled. They aren’t missing. They’re sitting in the token, dormant, waiting for a BOOL flip. That single column is the entire story of most Windows post-exploitation tradecraft in one place: not forging anything, just enabling what was already issued.
Objective: Understand how Windows builds and enforces a per-process security context through the access token, how the Security Reference Monitor uses that token on every object access, and which token operations defenders need to see to catch impersonation, theft, and privilege enablement.
1. Why Tokens Exist
When you authenticate, LSASS (lsass.exe) creates a logon session, derives a primary access token from that session, and hands it to whatever process is being started for you — userinit.exe, then explorer.exe. From that point forward, every kernel object you touch — files, registry keys, named pipes, processes, threads — is evaluated against that token by the Security Reference Monitor (SRM).
The SRM lives in the kernel and does one job: when a thread asks for access to an object, compare the thread’s effective token to the object’s security descriptor and return a yes/no. That comparison happens in SeAccessCheck (kernel) and is surfaced to user mode as AccessCheck. The order matters — Integrity Level check → DACL check → Privilege check.
Without a token, the kernel has no answer to “who is this thread, and what is it allowed to do?” Tokens aren’t a wrapper around credentials. They are the runtime identity.

2. Inside nt!_TOKEN
The kernel object is nt!_TOKEN. It’s undocumented — Microsoft exposes Win32 wrappers, not field layouts — but you can inspect it on your own build:
0: kd> dt nt!_TOKEN
The layout shifts between Windows versions, so never hardcode offsets. The fields that matter conceptually are stable:
| Field | Purpose |
|---|---|
| TokenId | LUID uniquely identifying this token instance |
| AuthenticationId | LUID of the originating logon session |
| TokenType | TokenPrimary (1) or TokenImpersonation (2) |
| ImpersonationLevel | Only meaningful for impersonation tokens |
| UserAndGroups | Array of SID_AND_ATTRIBUTES: user SID plus group SIDs |
| Privileges | SEP_TOKEN_PRIVILEGES: three 64-bit privilege bitmasks |
| IntegrityLevelIndex | Index into UserAndGroups pointing at the mandatory label |
| LogonSession | Pointer to SEP_LOGON_SESSION_REFERENCES |
| DefaultDacl | DACL applied to objects this token creates |
| SessionId | RDP / Terminal Services session ID |
The Privileges member is worth dwelling on. SEP_TOKEN_PRIVILEGES carries three 64-bit bitmasks — Present, Enabled, and EnabledByDefault — and that three-state design is the entire reason “privilege escalation” can be a one-API-call affair (covered in §6). This layout is community-observed via WinDbg and ReactOS source; treat it as undocumented and verify on your target build.
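The three-bitmask design can be modeled in a few lines of portable C. This is a sketch only: `PrivModel`, `priv_enable`, the `ADJ_*` codes, and the bit numbers are hypothetical names chosen for illustration, not the kernel's actual layout or API. It shows why an enable request can only flip bits that are already in Present.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the three privilege bitmasks; not the real layout. */
typedef struct {
    uint64_t present;            /* privileges issued to the token */
    uint64_t enabled;            /* privileges currently active    */
    uint64_t enabled_by_default; /* initial enabled set            */
} PrivModel;

enum { ADJ_OK = 0, ADJ_NOT_ALL_ASSIGNED = 1 };

/* Bit positions are illustrative only. */
#define PRIV_BIT(n) (1ULL << (n))

/* Enable the requested privileges; only bits already Present can flip. */
static int priv_enable(PrivModel *t, uint64_t wanted)
{
    assert(t != 0);
    uint64_t grantable = wanted & t->present;
    t->enabled |= grantable;
    return grantable == wanted ? ADJ_OK : ADJ_NOT_ALL_ASSIGNED;
}
```

A request for a bit that is not in `present` leaves `enabled` untouched and reports `ADJ_NOT_ALL_ASSIGNED`, which mirrors the partial-success behavior discussed in §6.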

3. Primary vs. Impersonation Tokens
Every process has exactly one primary token, set at CreateProcess time and fixed for the lifetime of the process. You don’t swap it. To run code under a different identity, you start a new process with a different token (CreateProcessAsUser, CreateProcessWithTokenW).
Threads are different. A thread can carry an impersonation token that temporarily overrides the process’s primary token for that thread only. This is how RPC servers, named-pipe servers, and IIS worker threads handle requests on behalf of multiple callers without spawning a process each time. The kernel tracks the impersonation state on the thread object (_ETHREAD; the field name varies by build, so check with dt nt!_ETHREAD), and SeAccessCheck prefers the thread token over the process token if one is present.
The distinction matters at detection time too. OpenProcessToken returns the primary token; OpenThreadToken returns the impersonation token, if any. A thread calling OpenThreadToken and getting ERROR_NO_TOKEN is normal — most threads aren’t impersonating. A thread calling it and getting SYSTEM is not.

4. Integrity Levels and Mandatory Integrity Control
Mandatory Integrity Control (MIC) added a sideband label to the token and a corresponding mandatory label ACE in object SACLs. Five well-known integrity SIDs cover the practical range:
| SID | Level | Typical Use |
|---|---|---|
| S-1-16-0 | Untrusted | Heavily sandboxed code |
| S-1-16-4096 | Low | Browser renderers, AppContainer |
| S-1-16-8192 | Medium | Default for interactive user processes |
| S-1-16-12288 | High | Elevated (post-UAC) admin processes |
| S-1-16-16384 | System | SYSTEM-account services and kernel components |
The label sits in UserAndGroups at index IntegrityLevelIndex, retrievable from user mode via GetTokenInformation(..., TokenIntegrityLevel, ...) into a TOKEN_MANDATORY_LABEL. MIC’s enforcement rule is simple: a process at a lower integrity level cannot write to or modify a higher-integrity object belonging to the same user — no DLL injection, no token impersonation up the chain. That single rule is what stops a Medium-IL Word process from injecting into a High-IL elevated PowerShell.
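The no-write-up rule reduces to an integer comparison on the integrity RID. The sketch below is a simplified model, not the SRM's implementation: `il_name` and `il_write_allowed` are hypothetical helpers, and real MIC also honors per-object policy flags that this ignores.

```c
#include <stdint.h>

/* Integrity RIDs from the table above (last sub-authority of S-1-16-x). */
enum {
    IL_UNTRUSTED = 0x0000,
    IL_LOW       = 0x1000,
    IL_MEDIUM    = 0x2000,
    IL_HIGH      = 0x3000,
    IL_SYSTEM    = 0x4000
};

/* Map an integrity RID to the level name used in the table. */
static const char *il_name(uint32_t rid)
{
    if (rid >= IL_SYSTEM) return "System";
    if (rid >= IL_HIGH)   return "High";
    if (rid >= IL_MEDIUM) return "Medium";
    if (rid >= IL_LOW)    return "Low";
    return "Untrusted";
}

/* Simplified no-write-up: a write passes MIC only if caller IL >= object IL. */
static int il_write_allowed(uint32_t caller_rid, uint32_t object_rid)
{
    return caller_rid >= object_rid;
}
```

With these values, `il_write_allowed(IL_MEDIUM, IL_HIGH)` is 0, which is the Word-into-elevated-PowerShell case described above. A pass here only means MIC does not block the write; the DACL still has to allow it.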
5. Reading a Token from User Mode
The minimum useful query: open the token, ask for the user SID, print it.
HANDLE hToken = NULL;
if (!OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &hToken)) {
    return GetLastError();
}
DWORD cbUser = 0;
// The first call is expected to fail; it only reports the buffer size needed.
GetTokenInformation(hToken, TokenUser, NULL, 0, &cbUser);
PTOKEN_USER pUser = (PTOKEN_USER)LocalAlloc(LPTR, cbUser);
if (pUser && GetTokenInformation(hToken, TokenUser, pUser, cbUser, &cbUser)) {
    LPWSTR sidStr = NULL;
    if (ConvertSidToStringSidW(pUser->User.Sid, &sidStr)) {  // declared in sddl.h
        wprintf(L"User SID: %s\n", sidStr);
        LocalFree(sidStr);
    }
}
LocalFree(pUser);
CloseHandle(hToken);
The same GetTokenInformation call with TokenGroups returns a TOKEN_GROUPS you can walk to see which groups are SE_GROUP_ENABLED, SE_GROUP_MANDATORY, or SE_GROUP_INTEGRITY (that last flag is how you find the IL label without parsing the index). TokenPrivileges returns a TOKEN_PRIVILEGES and feeds the next section.
For integrity level specifically:
DWORD cb = 0;
GetTokenInformation(hToken, TokenIntegrityLevel, NULL, 0, &cb);
PTOKEN_MANDATORY_LABEL pLabel = (PTOKEN_MANDATORY_LABEL)LocalAlloc(LPTR, cb);
if (pLabel && GetTokenInformation(hToken, TokenIntegrityLevel, pLabel, cb, &cb)) {
    // The integrity RID is the last sub-authority of the label SID.
    DWORD rid = *GetSidSubAuthority(
        pLabel->Label.Sid,
        (DWORD)(UCHAR)(*GetSidSubAuthorityCount(pLabel->Label.Sid) - 1));
    // rid == 0x2000 (8192) -> Medium
    // rid == 0x3000 (12288) -> High
    // rid == 0x4000 (16384) -> System
}
LocalFree(pLabel);
6. Privileges: Present, Enabled, Removed
A privilege in a token is in one of three conditions that matter in practice:
- Present: the privilege exists in the token. Cannot be added at runtime by user mode.
- Enabled: the privilege is present and currently active for access checks.
- Removed: once a privilege is removed via SE_PRIVILEGE_REMOVED, it’s gone for the life of the token.
AdjustTokenPrivileges only moves a privilege between “present and disabled” and “present and enabled.” It cannot grant a privilege the token never had. So when a tool “enables SeDebugPrivilege,” it isn’t gaining authority — that authority was issued at logon and waiting in the Present bitmask. The enable is purely a flag flip.
HANDLE hToken = NULL;
LUID luid;
TOKEN_PRIVILEGES tp = {0};
if (!OpenProcessToken(GetCurrentProcess(),
                      TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY,
                      &hToken)) {
    return GetLastError();
}
LookupPrivilegeValueW(NULL, SE_DEBUG_NAME, &luid);
tp.PrivilegeCount = 1;
tp.Privileges[0].Luid = luid;
tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
// Returns TRUE even on partial success, so GetLastError() must be checked too.
BOOL ok = AdjustTokenPrivileges(hToken, FALSE, &tp, sizeof(tp), NULL, NULL);
if (!ok || GetLastError() == ERROR_NOT_ALL_ASSIGNED) {
    // Call failed, or the privilege wasn't Present in the token -> not actually enabled.
}
CloseHandle(hToken);
That ERROR_NOT_ALL_ASSIGNED check is the gotcha most first-timers miss: AdjustTokenPrivileges returns TRUE even when the privilege isn’t in Present. The real outcome is only visible through GetLastError. I’ve burned a solid afternoon staring at a “successful” call that did nothing because the calling process was unelevated and SeDebugPrivilege was never issued in the first place.
The privileges worth keeping at the top of a defender’s list:
| Privilege | Why It Matters |
|---|---|
| SeDebugPrivilege | Open any process, including LSASS, for read/write |
| SeImpersonatePrivilege | Precondition for the Potato family of escalations |
| SeAssignPrimaryTokenPrivilege | Replace a process’s primary token |
| SeTcbPrivilege | “Act as part of the OS”; essentially unrestricted |
| SeLoadDriverPrivilege | Load arbitrary kernel drivers → BYOVD |
| SeBackupPrivilege / SeRestorePrivilege | Read/write any file regardless of DACL |
| SeTakeOwnershipPrivilege | Seize ownership of any object |
| SeCreateTokenPrivilege | Forge tokens directly; held only by SYSTEM |
7. Impersonation in Depth
SECURITY_IMPERSONATION_LEVEL defines how far the impersonating thread can act on behalf of the original principal:
| Level | Meaning |
|---|---|
| SecurityAnonymous | Server cannot identify or impersonate the client |
| SecurityIdentification | Server can identify but not act as the client |
| SecurityImpersonation | Server can act as the client on the local machine |
| SecurityDelegation | Server can act as the client on local and remote systems |
The canonical sequence for a service impersonating a caller:
HANDLE hClient = NULL;
// hSourceToken: a token handle previously opened with TOKEN_DUPLICATE access.
if (DuplicateTokenEx(hSourceToken,
                     TOKEN_ALL_ACCESS,
                     NULL,
                     SecurityImpersonation,
                     TokenImpersonation,
                     &hClient)) {
    if (SetThreadToken(NULL, hClient)) {  // current thread now runs as the client
        // ... perform the work that requires the client's identity ...
        RevertToSelf();                   // back to the process's primary token
    }
    CloseHandle(hClient);
}
SECURITY_QUALITY_OF_SERVICE controls whether impersonation tracks the source statically or dynamically, and whether only the enabled privileges follow (EffectiveOnly). That last flag is one of the more interesting defensive levers: a service calling impersonation with EffectiveOnly = TRUE strips dormant privileges out of the impersonation context entirely.
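The effect of EffectiveOnly can be pictured as a mask applied when the privilege set is copied into the impersonation context. The following is a minimal sketch under that assumption; `PrivSet` and `priv_for_impersonation` are hypothetical names, and the real behavior also covers disabled groups, which this model leaves out.

```c
#include <stdint.h>

/* Hypothetical two-mask view of a token's privileges. */
typedef struct {
    uint64_t present;
    uint64_t enabled;
} PrivSet;

/* Model of copying privileges into an impersonation context.
   effective_only != 0 keeps only what is enabled at that moment. */
static PrivSet priv_for_impersonation(PrivSet src, int effective_only)
{
    PrivSet out = src;
    if (effective_only) {
        out.present = src.present & src.enabled;  /* dormant bits are dropped */
        out.enabled = out.present;
    }
    return out;
}
```

In this model a dormant privilege cannot be re-enabled by the impersonating thread, because the bit never makes it into the copied Present mask.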
8. Duplication, LogonUser, and Process Creation Under a Token
Three primitives cover most of the “run something as someone else” surface:
- DuplicateTokenEx: clone an existing token, optionally upgrading from impersonation to primary type. Requires TOKEN_DUPLICATE on the source.
- LogonUser: authenticate a username/password and receive a fresh primary token tied to a new logon session.
- CreateProcessWithTokenW: start a new process whose primary token is the one you pass in. Requires SeImpersonatePrivilege on the caller.
The MITRE taxonomy splits the abuse cleanly along these primitives:
- T1134.001: Token Impersonation/Theft. OpenProcessToken against a higher-privileged process, DuplicateTokenEx, then ImpersonateLoggedOnUser or SetThreadToken. No credentials needed; you steal what’s already running.
- T1134.002: Create Process with Token. Same theft, but you go straight to CreateProcessWithTokenW to start a new process under the stolen identity rather than impersonating on a thread.
- T1134.003: Make and Impersonate Token. LogonUser with credentials in hand, then SetThreadToken. Quieter than theft because the resulting logon looks legitimate, but it generates a 4624 you can see.

9. _EPROCESS.Token and Kernel-Mode Abuse
The kernel’s view of a process’s primary token is the Token field in _EPROCESS, an EX_FAST_REF — a pointer with reference-count bits packed into the low bits. A kernel exploit with arbitrary write can overwrite that field with a pointer to the SYSTEM process’s token, instantly upgrading the attacker’s process to SYSTEM without touching any user-mode API.
Walking it in WinDbg looks like this:
0: kd> !process 0 0 explorer.exe
PROCESS ffffba0c1a5f6080 ...
0: kd> dt nt!_EPROCESS ffffba0c1a5f6080 Token
+0x4b8 Token : _EX_FAST_REF
0: kd> dt nt!_TOKEN (poi(ffffba0c1a5f6080+0x4b8) & ~0xf)
The offset will probably not be 0x4b8 on your build. Use dt to find it on the system you’re analyzing.
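The `& ~0xf` in the WinDbg expression above is the whole trick for decoding an EX_FAST_REF on x64. As a sketch, assuming the low four bits carry the cached reference count as that expression implies (the helper names are hypothetical):

```c
#include <stdint.h>

/* Low bits of an x64 EX_FAST_REF value, per the ~0xf mask used above. */
#define FAST_REF_MASK 0xFULL

/* Strip the packed reference-count bits to recover the object pointer. */
static uint64_t fast_ref_object(uint64_t raw)
{
    return raw & ~FAST_REF_MASK;
}

/* Extract the packed reference-count bits. */
static unsigned fast_ref_count(uint64_t raw)
{
    return (unsigned)(raw & FAST_REF_MASK);
}
```

For a made-up raw value of `0xffffa48f1c2d3067`, `fast_ref_object` yields `0xffffa48f1c2d3060` and `fast_ref_count` yields 7. Forgetting the mask is a common reason `dt nt!_TOKEN` prints garbage.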
For defenders, the operational takeaway is that kernel-mode token swapping leaves no user-mode footprint — no AdjustTokenPrivileges, no OpenProcessToken, no 4703. The detection has to shift earlier: catch the driver load (SeLoadDriverPrivilege use, signed-driver loader events) or the exploit’s user-mode loader, because by the time the swap happens your audit pipeline is blind to it.
10. Detection and Defense
Token abuse leaves observable traces across the Security log, Sysmon, and ETW. Pick the events that match the primitive you’re hunting.
Windows Security Audit Events
| Event ID | Name | What It Tells You |
|---|---|---|
| 4624 | Successful logon | New logon session and primary token; check LogonType |
| 4648 | Logon with explicit credentials | runas, CreateProcessWithLogonW, lateral movement |
| 4672 | Special privileges assigned to new logon | Sensitive privileges granted at session start |
| 4673 | Privileged service called | Use of sensitive privilege |
| 4688 | New process created | Includes TokenElevationType (1/2/3) |
| 4703 | User right adjusted | AdjustTokenPrivileges calls, the core privilege-enable signal |
4672 is high-value: it fires once per privileged logon and lists the sensitive privileges assigned. Filter out the well-known principals (LOCAL SYSTEM, NETWORK SERVICE, LOCAL SERVICE) and expected admins. What’s left is worth a look — that’s where Mimikatz-style pass-the-hash and elevation activity surfaces.
Sysmon
- EID 1 (Process Create): IntegrityLevel and User fields directly show the process’s effective token. A child of a Medium-IL process suddenly running at System integrity is a hard signal.
- EID 10 (ProcessAccess): OpenProcess against LSASS or other high-value targets. Watch GrantedAccess masks like 0x1400 (PROCESS_QUERY_INFORMATION | PROCESS_QUERY_LIMITED_INFORMATION) and 0x40 (PROCESS_DUP_HANDLE).
- EID 8 (CreateRemoteThread): cross-process injection that frequently follows token theft.
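GrantedAccess values in EID 10 are plain bitmasks, so triage usually starts by expanding them into right names. The helper below is a small sketch of that step; `decode_granted_access` and the table name are hypothetical, the table is deliberately partial, and the bit values are the process access rights from the Windows SDK headers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint32_t bit; const char *name; } AccessName;

/* Partial list of process access rights (values from the Windows SDK). */
static const AccessName kProcessRights[] = {
    { 0x0008, "PROCESS_VM_OPERATION" },
    { 0x0010, "PROCESS_VM_READ" },
    { 0x0020, "PROCESS_VM_WRITE" },
    { 0x0040, "PROCESS_DUP_HANDLE" },
    { 0x0400, "PROCESS_QUERY_INFORMATION" },
    { 0x1000, "PROCESS_QUERY_LIMITED_INFORMATION" },
};

/* Write the names of all known bits set in mask into out, joined by " | ". */
static void decode_granted_access(uint32_t mask, char *out, size_t cap)
{
    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (size_t i = 0; i < sizeof kProcessRights / sizeof kProcessRights[0]; i++) {
        if (mask & kProcessRights[i].bit) {
            if (out[0] != '\0')
                strncat(out, " | ", cap - strlen(out) - 1);
            strncat(out, kProcessRights[i].name, cap - strlen(out) - 1);
        }
    }
}
```

Bits that are not in the table are silently skipped, so an empty result does not mean an empty mask.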
Sigma Sketch: Privilege Enable on a Sensitive Right
title: Sensitive Privilege Adjusted via AdjustTokenPrivileges
logsource:
    product: windows
    service: security
detection:
    selection:
        EventID: 4703
        EnabledPrivilegeList|contains:
            - 'SeDebugPrivilege'
            - 'SeImpersonatePrivilege'
            - 'SeTcbPrivilege'
            - 'SeLoadDriverPrivilege'
    filter_known:
        SubjectUserSid:
            - 'S-1-5-18'  # LOCAL SYSTEM
            - 'S-1-5-19'  # LOCAL SERVICE
            - 'S-1-5-20'  # NETWORK SERVICE
    condition: selection and not filter_known
level: high
To produce 4703, the Audit Token Right Adjusted subcategory has to be enabled; it isn’t by default on most builds. Same goes for Audit Sensitive Privilege Use for 4673/4674, and command-line logging in 4688 (Group Policy: System → Audit Process Creation → Include command line).
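For readers who think in code rather than rule syntax, the rule's condition can be written as a single predicate. This is only an illustration of the logic: `rule_4703_matches` is a hypothetical function, the inputs are assumed to be already-extracted event fields, and `strstr` is case-sensitive where a Sigma backend's `contains` typically is not.

```c
#include <string.h>

static const char *const kSensitive[] = {
    "SeDebugPrivilege", "SeImpersonatePrivilege",
    "SeTcbPrivilege", "SeLoadDriverPrivilege",
};
static const char *const kKnownSids[] = { "S-1-5-18", "S-1-5-19", "S-1-5-20" };

/* Returns 1 when an event satisfies "selection and not filter_known". */
static int rule_4703_matches(int event_id, const char *enabled_list,
                             const char *subject_sid)
{
    if (event_id != 4703 || !enabled_list || !subject_sid)
        return 0;
    for (size_t i = 0; i < sizeof kKnownSids / sizeof kKnownSids[0]; i++)
        if (strcmp(subject_sid, kKnownSids[i]) == 0)
            return 0;                                   /* filter_known */
    for (size_t i = 0; i < sizeof kSensitive / sizeof kSensitive[0]; i++)
        if (strstr(enabled_list, kSensitive[i]) != NULL)
            return 1;                                   /* selection */
    return 0;
}
```

Writing it this way makes the filter's blind spot explicit: anything running as one of the three service SIDs is never evaluated against the privilege list.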
ETW Providers
| Provider | What It Carries |
|---|---|
| Microsoft-Windows-Security-Auditing | All audit events above |
| Microsoft-Windows-Kernel-Process | Process/thread lifecycle including token assignment |
| Microsoft-Windows-Threat-Intelligence | High-fidelity process-access telemetry; PPL consumer only (Defender/EDR) |
Hardening
- SeCreateTokenPrivilege → SYSTEM only. Nothing else needs it.
- SeAssignPrimaryTokenPrivilege → local/network service accounts only. Audit anything else holding it.
- Strip SeImpersonatePrivilege from service accounts that don’t host RPC or named-pipe endpoints. Its presence is the precondition for the Potato family.
- PPL for critical services: blocks OpenProcess with token-access rights from unprotected callers.
- Credential Guard: isolates logon-session secrets in VSM,
References
- Access Tokens – Win32 apps | Microsoft Learn
- Privilege Constants (Winnt.h) – Win32 apps | Microsoft Learn
- Windows Kernel-Mode Security Reference Monitor | Microsoft Learn
- Access Token Manipulation, Technique T1134 – Enterprise | MITRE ATT&CK®
- Introduction to Windows Tokens for Security Practitioners | Elastic
SIDs and Security Descriptors: Identity in Windows Security
A thread opens a handle to a file. Before a single byte is read, the kernel has already answered a question nobody typed: is the caller’s identity allowed to do this? That answer lives at the intersection of two structures — the SID that names who you are, and the security descriptor that says who gets in. Get the relationship between them wrong and you ship a world-writable service. Understand it, and most “weird permission” incidents stop being mysterious.
Objective: Understand how Windows represents identity with Security Identifiers, how Security Descriptors bind owners, DACLs, and SACLs to every securable object, and how attackers abuse — and defenders detect — manipulation of both.
1. Identity Before Access
Windows authenticates security principals — anything the OS can prove an identity for: users, groups, computers, and service accounts. Authentication is the LSA’s job; the SAM (local) or the domain’s NTDS.dit (Active Directory) stores the account records. But authentication only proves who you are. Authorization — what you may touch — is a separate decision made against a different value: the SID.
A SID is the canonical, machine-readable name for a principal. Display names change. SAM account names get reused. SIDs do not. Once the system mints a SID at account-creation time, that value is never reused to identify another principal, even after the account is deleted. Every authorization check in the OS compares SIDs, never names.
2. Anatomy of a SID
A SID is a variable-length binary structure, defined as SID in winnt.h. Three logical parts: a revision, the issuing authority, and a chain of sub-authorities ending in a Relative Identifier (RID).
| Field | Type | Meaning |
|---|---|---|
| Revision | BYTE | SID structure version, always 1 |
| SubAuthorityCount | BYTE | Number of sub-authority values (max 15) |
| IdentifierAuthority | SID_IDENTIFIER_AUTHORITY | 6-byte top-level authority that issued the SID |
| SubAuthority[] | DWORD[] | Sub-authority values; the last element is the RID |
The string notation everyone recognizes is just those fields, hyphenated. Take S-1-5-21-<d1>-<d2>-<d3>-513:
- S-1: a revision-1 SID.
- 5: SECURITY_NT_AUTHORITY, marking it a Windows NT SID.
- 21: SECURITY_NT_NON_UNIQUE, signaling that a domain identifier follows.
- <d1>-<d2>-<d3>: three 32-bit values randomly generated to uniquely identify the domain.
- 513: the RID; here, the well-known RID for Domain Users.
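Because the string form is just those fields joined by hyphens, it can be split mechanically. The parser below is a portable sketch that does not use the Win32 API: `ParsedSid` and `parse_sid` are hypothetical names, the sample domain values in the usage note are made up, and only the decimal authority form is handled.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    unsigned revision;
    uint64_t authority;   /* 48-bit identifier authority */
    unsigned sub_count;
    uint32_t sub[15];     /* last element is the RID */
} ParsedSid;

/* Read one unsigned decimal number at *pp and advance past it. */
static int read_num(const char **pp, unsigned long long *v)
{
    char *end;
    if (!isdigit((unsigned char)**pp)) return 0;
    *v = strtoull(*pp, &end, 10);
    *pp = end;
    return 1;
}

/* Parse "S-R-A-S1-...-Sn". Returns 1 on success, 0 on malformed input. */
static int parse_sid(const char *s, ParsedSid *out)
{
    unsigned long long v;
    if (!s || !out || strncmp(s, "S-", 2) != 0) return 0;
    const char *p = s + 2;
    if (!read_num(&p, &v) || *p != '-') return 0;
    out->revision = (unsigned)v;
    p++;
    if (!read_num(&p, &v)) return 0;
    out->authority = v;
    out->sub_count = 0;
    while (*p == '-') {
        p++;
        if (out->sub_count == 15 || !read_num(&p, &v) || v > 0xFFFFFFFFULL)
            return 0;
        out->sub[out->sub_count++] = (uint32_t)v;
    }
    return *p == '\0';
}
```

Parsing `S-1-5-21-1004336348-1177238915-682003330-513` gives revision 1, authority 5, five sub-authorities, and `sub[4] == 513` as the RID.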
You rarely build SIDs by hand. You parse them. Here’s the field-level walk in C — note that the documented accessors (GetSidSubAuthority, GetSidIdentifierAuthority) return pointers into the structure, which trips up everyone the first time:
#include <windows.h>
#include <sddl.h>
#include <stdio.h>
void PrintSid(PSID pSid) {
    if (!IsValidSid(pSid)) return;
    PSID_IDENTIFIER_AUTHORITY pAuth = GetSidIdentifierAuthority(pSid);
    DWORD subCount = *GetSidSubAuthorityCount(pSid);
    printf("Authority: %u\n", (DWORD)pAuth->Value[5]); // NT authority lives in the low byte
    for (DWORD i = 0; i < subCount; i++)
        printf(" SubAuthority[%lu] = %lu\n", i, *GetSidSubAuthority(pSid, i));
    LPSTR str = NULL;
    if (ConvertSidToStringSidA(pSid, &str)) { // -> "S-1-5-..."
        printf("String SID: %s\n", str);
        LocalFree(str);
    }
}
To go the other direction, constructing a known SID, use AllocateAndInitializeSid, which takes an authority plus up to eight sub-authorities. Building the SYSTEM SID (S-1-5-18) and comparing it with EqualSid is the idiomatic way to check “am I running as LocalSystem?”:
SID_IDENTIFIER_AUTHORITY ntAuth = SECURITY_NT_AUTHORITY; // {0,0,0,0,0,5}
PSID pSystem = NULL;
if (AllocateAndInitializeSid(&ntAuth, 1,
        SECURITY_LOCAL_SYSTEM_RID, // 18
        0, 0, 0, 0, 0, 0, 0, &pSystem)) {
    // EqualSid(tokenSid, pSystem) -> TRUE means LocalSystem
    FreeSid(pSystem); // never free this with LocalFree
}
3. Well-Known SIDs and Built-in Principals
Some SIDs are identical on every Windows install. Hard-coding the matching display names is a bug waiting to happen across locales and versions, because the names are localized while the SIDs are not; match on the SID, and use the documented constants where you can. Memorize the ones below anyway; you’ll read them in logs daily.
| SID | Principal |
|---|---|
| S-1-0-0 | Null SID (a group with no members) |
| S-1-1-0 | Everyone |
| S-1-5-18 | Local System |
| S-1-5-19 | Local Service |
| S-1-5-20 | Network Service |
| S-1-5-32-544 | Builtin\Administrators |
| S-1-16-12288 | High mandatory integrity level |
Built-in accounts also carry well-known RIDs appended to the domain or machine SID: 500 is Administrator, 501 is Guest, 512 is Domain Admins. An attacker enumerating a domain looks for RID 500 and 512 specifically — the display name can be renamed, the RID cannot. Capability SIDs the OS recognizes are cached under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SecurityManager\CapabilityClasses\AllCachedCapabilities.
4. SIDs at Runtime: The Access Token
When a user signs in, LSA builds an access token for the session. That token is the runtime bag of identity: the user’s SID, the SIDs of every group the user belongs to, the privileges granted, and a mandatory integrity level SID (the S-1-16-* family). Every process started in that logon context inherits a copy. When code makes an access check, the kernel compares the SIDs in the token against the SIDs in the object’s DACL.
One detail that becomes an attack surface later: an account can carry extra SIDs in its Active Directory sIDHistory attribute. That attribute exists for legitimate domain migration — copy the old SID into sIDHistory so a migrated user keeps access to resources permissioned to the old account without re-ACLing everything. The catch is that all values in sIDHistory are injected into the access token at logon, exactly as if they were primary group memberships.

5. The Security Descriptor: Structure and Fields
Every object the Object Manager creates has a security descriptor. The structure is SECURITY_DESCRIPTOR, reproduced here verbatim from winnt.h:
typedef struct _SECURITY_DESCRIPTOR {
    BYTE Revision;
    BYTE Sbz1;
    SECURITY_DESCRIPTOR_CONTROL Control;
    PSID Owner;
    PSID Group;
    PACL Sacl;
    PACL Dacl;
} SECURITY_DESCRIPTOR, *PISECURITY_DESCRIPTOR;
Field by field: Revision is always 1; Sbz1 is reserved and must be zero; Control is a flag bitmask; Owner and Group point to SIDs; Dacl and Sacl point to access-control lists. The internal layout differs between absolute form (the struct holds pointers to separately allocated SIDs and ACLs) and self-relative form (everything packed into one contiguous blob with offsets, marked by SE_SELF_RELATIVE). Because that format varies, never poke fields directly; drive it through the API.
The Control field qualifies how the rest of the descriptor is interpreted:
| Flag | Meaning |
|---|---|
| SE_DACL_PRESENT | The descriptor has a DACL (the pointer may still be NULL) |
| SE_SACL_PRESENT | The descriptor has a SACL |
| SE_DACL_PROTECTED | DACL is shielded from inherited ACEs |
| SE_SACL_PROTECTED | SACL is shielded from inherited ACEs |
| SE_OWNER_DEFAULTED | Owner was assigned by a default mechanism |
| SE_SELF_RELATIVE | Descriptor is in packed, self-relative form |
Here is the single most important gotcha in this entire topic, and it has burned production systems repeatedly. There is a difference between no DACL, an empty DACL, and a NULL DACL:
SECURITY_DESCRIPTOR sd;
InitializeSecurityDescriptor(&sd, SECURITY_DESCRIPTOR_REVISION);
// NULL DACL: present == TRUE, pointer == NULL -> GRANTS EVERYONE FULL ACCESS
SetSecurityDescriptorDacl(&sd, TRUE, NULL, FALSE);
// Empty DACL: present == TRUE, non-NULL ACL with zero ACEs -> DENIES EVERYONE
// (initialize an ACL with InitializeAcl and add no ACEs, then pass it here)
If SE_DACL_PRESENT is not set, or it is set with a NULL DACL pointer, the object allows full access to everyone. Developers reach for SetSecurityDescriptorDacl(&sd, TRUE, NULL, FALSE) thinking “no restrictions, default behavior” and ship a world-writable named pipe or service. An empty DACL (present, non-NULL, zero ACEs) does the opposite and denies everyone. One null pointer is the difference.
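The three cases are easy to mix up, so it can help to see them as one decision function. This is a sketch of the rule as stated above, not the kernel's code: `AclModel`, `DaclOutcome`, and `classify_dacl` are hypothetical, and only the flag value is taken from winnt.h.

```c
#include <stddef.h>

#define SE_DACL_PRESENT_FLAG 0x0004  /* same value as SE_DACL_PRESENT */

typedef enum { DACL_GRANTS_ALL, DACL_DENIES_ALL, DACL_EVALUATE_ACES } DaclOutcome;

/* Stand-in for the ACL header; only the ACE count matters here. */
typedef struct { unsigned short ace_count; } AclModel;

/* Classify a descriptor by its control flags and DACL pointer. */
static DaclOutcome classify_dacl(unsigned control, const AclModel *dacl)
{
    if (!(control & SE_DACL_PRESENT_FLAG) || dacl == NULL)
        return DACL_GRANTS_ALL;      /* no DACL, or present-but-NULL DACL */
    if (dacl->ace_count == 0)
        return DACL_DENIES_ALL;      /* empty DACL */
    return DACL_EVALUATE_ACES;       /* normal case: walk the ACEs */
}
```

A code review check built on this idea would flag any call site that can reach `DACL_GRANTS_ALL` on an object other principals can open.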

6. DACLs and ACEs: How Access Is Decided
A DACL is an ordered list of Access Control Entries. Each ACE has an ACE_HEADER (AceType, AceFlags, AceSize), an ACCESS_MASK of rights, and a trailing SID the entry applies to.
| ACE Type | Used In | Effect |
|---|---|---|
| ACCESS_ALLOWED_ACE | DACL | Grants rights in its mask to the SID |
| ACCESS_DENIED_ACE | DACL | Denies rights in its mask to the SID |
| SYSTEM_AUDIT_ACE | SACL | Logs access matching its mask |
Evaluation order matters: the kernel walks ACEs top to bottom and stops as soon as the requested access is fully granted or any of it is denied. Well-formed (canonical) DACLs place deny ACEs ahead of allow ACEs precisely so a deny is seen first. An ACL has no hard ACE-count limit, but the whole ACL must stay under 64 KB.
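The top-to-bottom walk is short enough to model directly. The sketch below is a simplification of the evaluation just described: `AceModel` and `dacl_check` are hypothetical, SIDs are reduced to integer ids, and owner rights, privileges, and integrity checks are ignored.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { ACE_ALLOW, ACE_DENY } AceKind;

/* sid_id is an integer stand-in for the ACE's trailing SID. */
typedef struct { AceKind kind; uint32_t mask; int sid_id; } AceModel;

/* Returns 1 if every bit in desired is granted before any of it is denied. */
static int dacl_check(const AceModel *aces, size_t n,
                      const int *token_sids, size_t nsids, uint32_t desired)
{
    uint32_t remaining = desired;
    for (size_t i = 0; i < n && remaining != 0; i++) {
        int applies = 0;
        for (size_t j = 0; j < nsids; j++)
            if (token_sids[j] == aces[i].sid_id) { applies = 1; break; }
        if (!applies)
            continue;                       /* ACE names a SID not in the token */
        if (aces[i].kind == ACE_DENY) {
            if (aces[i].mask & remaining)
                return 0;                   /* a still-needed right is denied */
        } else {
            remaining &= ~aces[i].mask;     /* these rights are now granted */
        }
    }
    return remaining == 0;
}
```

Feeding the same two ACEs in allow-then-deny order grants access that the deny-first order refuses, which is why non-canonical DACLs are worth flagging.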
Reading a real object’s DACL means pulling the descriptor and iterating ACEs by index with GetAce:
PSECURITY_DESCRIPTOR pSD = NULL;
PSID pOwner = NULL;
PACL pDacl = NULL;
// Declared in aclapi.h. pOwner and pDacl point into pSD, so only pSD is freed.
DWORD rc = GetNamedSecurityInfoW(
    L"C:\\Windows\\System32\\config\\SAM", SE_FILE_OBJECT,
    OWNER_SECURITY_INFORMATION | DACL_SECURITY_INFORMATION,
    &pOwner, NULL, &pDacl, NULL, &pSD);
if (rc == ERROR_SUCCESS && pDacl) {
    for (WORD i = 0; i < pDacl->AceCount; i++) {
        PACE_HEADER hdr = NULL;
        if (GetAce(pDacl, i, (LPVOID*)&hdr)) {
            // hdr->AceType == ACCESS_ALLOWED_ACE_TYPE / ACCESS_DENIED_ACE_TYPE
            // hdr->AceFlags == CONTAINER_INHERIT_ACE | OBJECT_INHERIT_ACE | ...
        }
    }
}
if (pSD) LocalFree(pSD); // release the descriptor even when the DACL came back NULL
7. SACLs: Auditing Through the System ACL
The SACL uses the same ACL container but holds SYSTEM_AUDIT_ACE entries instead. Its access mask doesn’t grant or deny anything — it defines which access attempts generate audit records in the Windows Security Event Log. Reading or writing any object’s SACL requires the SeSecurityPrivilege right, which only Administrators normally hold. That privilege boundary is exactly why SACL tampering is a high-value detection target: the act of stripping audit ACEs is itself privileged.
8. SDDL: Security Descriptors as Text
A binary descriptor is awful to log, diff, or paste into a config file, so Windows defines the Security Descriptor Definition Language — a string form. The grammar is O: owner, G: group, D: DACL, S: SACL, each followed by flags and parenthesized ACEs:
O:BAG:SYD:(A;;FA;;;SY)(A;;FA;;;BA)(A;;0x1200a9;;;BU)S:(AU;SAFA;FA;;;WD)
The first ACE, (A;;FA;;;SY), reads as: Allow, no inherit flags, FA (file all access), to SY (SYSTEM). Round-trip it with ConvertSecurityDescriptorToStringSecurityDescriptor and ConvertStringSecurityDescriptorToSecurityDescriptor. In practice you’ll read SDDL far more often through PowerShell:
$acl = Get-Acl C:\Windows\System32\config\SAM
$acl.Owner # owner principal
$acl.Sddl # full SDDL string
$acl.Access | Format-Table IdentityReference, FileSystemRights, AccessControlType
icacls <path> gives the same data in a terser shorthand; Get-Acl is friendlier when you want the SDDL string itself for a baseline diff.
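When diffing SDDL baselines it is often enough to split each parenthesized ACE, such as (A;;FA;;;SY) from the string above, into its six semicolon-separated fields. The splitter below is a sketch for that narrow job: `SddlAce` and `sddl_split_ace` are hypothetical, buffer sizes are arbitrary, and conditional or resource-attribute ACEs with extra fields are rejected rather than parsed.

```c
#include <string.h>

typedef struct {
    char type[8], flags[16], rights[32];
    char object_guid[40], inherit_guid[40], trustee[64];
} SddlAce;

/* Split one "(type;flags;rights;obj;inh;trustee)" string. Returns 1 on success. */
static int sddl_split_ace(const char *ace, SddlAce *out)
{
    size_t len = ace ? strlen(ace) : 0;
    if (!out || len < 2 || ace[0] != '(' || ace[len - 1] != ')') return 0;
    char *fields[6] = { out->type, out->flags, out->rights,
                        out->object_guid, out->inherit_guid, out->trustee };
    size_t caps[6] = { sizeof out->type, sizeof out->flags, sizeof out->rights,
                       sizeof out->object_guid, sizeof out->inherit_guid,
                       sizeof out->trustee };
    const char *p = ace + 1, *stop = ace + len - 1;
    for (int i = 0; i < 6; i++) {
        const char *q = p;
        while (q < stop && *q != ';') q++;
        size_t n = (size_t)(q - p);
        if (n >= caps[i]) return 0;          /* field too long for this sketch */
        memcpy(fields[i], p, n);
        fields[i][n] = '\0';
        if (i < 5) {
            if (q == stop) return 0;         /* ran out of separators */
            p = q + 1;
        } else if (q != stop) {
            return 0;                        /* more than six fields */
        }
    }
    return 1;
}
```

For `(AU;SAFA;FA;;;WD)` this yields type `AU`, flags `SAFA`, rights `FA`, and trustee `WD`, with both GUID fields empty.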
9. Inheritance and the Kernel Check
Child objects don’t usually carry hand-written ACLs. They inherit them. An ACE’s flags decide propagation: OBJECT_INHERIT_ACE (OI) pushes it onto leaf objects like files, CONTAINER_INHERIT_ACE (CI) onto sub-containers like folders or registry subkeys, and INHERIT_ONLY_ACE (IO) makes an ACE apply only to children and not the object carrying it. SE_DACL_PROTECTED blocks inheritance entirely — that’s what “disable inheritance” does in Explorer.
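The three flags combine into two questions: does the ACE apply to the object carrying it, and does it take effect on a given direct child? A minimal sketch follows, using the flag values from winnt.h but hypothetical helper names. It deliberately ignores NO_PROPAGATE_INHERIT_ACE and the inherit-only copies that object-inherit ACEs can leave on sub-containers.

```c
#include <stdint.h>

/* ACE flag values as defined in winnt.h. */
#define FLAG_OI 0x01  /* OBJECT_INHERIT_ACE    */
#define FLAG_CI 0x02  /* CONTAINER_INHERIT_ACE */
#define FLAG_IO 0x08  /* INHERIT_ONLY_ACE      */

/* Does the ACE take part in access checks on the object that carries it? */
static int ace_applies_to_self(uint8_t flags)
{
    return (flags & FLAG_IO) == 0;
}

/* Is the ACE effective on a direct child of the given kind? */
static int ace_effective_on_child(uint8_t flags, int child_is_container)
{
    return child_is_container ? (flags & FLAG_CI) != 0
                              : (flags & FLAG_OI) != 0;
}
```

An ACE flagged `FLAG_OI | FLAG_CI | FLAG_IO` therefore affects every direct child but not the folder itself, which is the pattern to look for when a permission seems to appear only below a directory.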
The decision itself happens in the kernel. Each OBJECT_HEADER carries a SecurityDescriptor field. At handle-creation time the Object Manager hands the token, the requested access, and the descriptor to the Security Reference Monitor (nt!SeAccessCheck), which walks the DACL and returns a granted-access mask. You can see the whole chain live in WinDbg:
kd> !process 0 0 lsass.exe
kd> !object <Object address>
kd> dt nt!_OBJECT_HEADER <header address> SecurityDescriptor
kd> !sd <SecurityDescriptor address & ~0xf> ; mask low bits, they're flags
kd> !token ; the token the check runs against
Files, registry keys, processes, threads, named pipes, services, jobs: anything named and securable runs through this same path.
10. Common Attacker Techniques
SIDs and SDs aren’t just plumbing — they’re a manipulation target for evasion and escalation. The primitives below all leave traces (covered next), which is the point of teaching them.
| Technique | Description |
|---|---|
| NULL DACL planting | Set a present-but-NULL DACL on a service, registry key, or pipe to make it world-writable |
| DACL tampering for persistence | Add an explicit ACCESS_ALLOWED_ACE granting the attacker’s SID FullControl on a sensitive object |
| Owner abuse | Taking ownership of an object implicitly grants WRITE_DAC, letting an attacker rewrite the DACL afterward |
| SID-History injection | Write a privileged SID (e.g. a Domain Admins RID) into a controlled account’s sIDHistory so it lands in the token |
| SACL stripping | Remove audit ACEs from lsass.exe, SAM, or ntds.dit to suppress access logging before credential theft |
| Permission group discovery | Enumerate group SIDs and ACL members to plan lateral movement |
A populated sIDHistory on a non-migrated account is the canonical hunting signal for the injection case:
Get-ADUser -Filter * -Properties sIDHistory |
    Where-Object { $_.sIDHistory } |
    Select-Object Name, @{ n='sIDHistory'; e={ $_.sIDHistory -join ', ' } }
In a domain with no active migration, any result here deserves investigation, especially a sIDHistory value ending in RID 512 or 519.
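If the query output is post-processed outside PowerShell, the "ends in RID 512 or 519" test is a simple string check. The helper below is a sketch with hypothetical names (`sid_rid`, `sid_history_is_high_risk`); it assumes domain SIDs in the usual `S-1-5-21-...` string form and treats anything else as not matching.

```c
#include <stdlib.h>
#include <string.h>

/* Return the final RID of a string SID, or -1 if it cannot be read. */
static long long sid_rid(const char *sid)
{
    const char *dash = sid ? strrchr(sid, '-') : NULL;
    char *end;
    if (!dash || dash[1] == '\0') return -1;
    unsigned long long v = strtoull(dash + 1, &end, 10);
    return *end == '\0' ? (long long)v : -1;
}

/* Flag domain SIDs whose RID is 512 (Domain Admins) or 519 (Enterprise Admins). */
static int sid_history_is_high_risk(const char *sid)
{
    if (!sid || strncmp(sid, "S-1-5-21-", 9) != 0) return 0;
    long long rid = sid_rid(sid);
    return rid == 512 || rid == 519;
}
```

This only prioritizes results. In a domain with no migration in progress, a populated sIDHistory with any RID still warrants a look.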

11. Detection, Hunting, and Hardening
DACL and SACL changes are logged by Windows itself, not Sysmon — you must enable the right Advanced Audit Policy subcategories first (Object Access → Audit File System / Audit Registry, and Policy Change → Audit Audit Policy Change).
| Event ID | Trigger | Hunt On |
|---|---|---|
| 4670 | Object permissions changed (DACL/Owner) | ObjectName, OldSd, NewSd, SubjectUserSid |
| 4907 | Object auditing (SACL) settings changed | Blank NewSd = SACL stripped |
| 4715 | Audit policy on an object changed | OriginalSecurityDescriptor, NewSecurityDescriptor |
| 4719 | System audit policy changed | SubjectUserSid, AuditPolicyChanges |
| 4663 | Object access attempt | Sudden gaps after a 4907 on LSASS = stripping |
| 4728/4732/4756 | Member added to privileged group | Correlate with SID manipulation |
The highest-fidelity signal is a 4907 that blanks the SACL on lsass.exe, ntds.dit, or the SAM hive — that’s pre-credential-dump preparation. Pair it with Sysmon Event ID 10 (process access to LSASS) and Event ID 1 watching for icacls.exe, cacls.exe, sc.exe sdset, and Set-Acl command lines. A Sigma sketch for DACL tampering on sensitive objects:
title: Suspicious DACL Modification on Sensitive Object
logsource:
    product: windows
    service: security
detection:
    selection:
        EventID: 4670
        ObjectName|contains:
            - '\lsass.exe'
            - '\ntds.dit'
            - '\SAM'
    condition: selection
fields:
    - SubjectUserSid
    - ObjectName
    - OldSd
    - NewSd
level: high
Hardening, in rough priority order:
- Hunt NULL DACLs. Use AccessChk to enumerate world-writable services, keys, and files; fix them.
- Protect the LSASS SACL and alert on any 4907 that empties it.
- Enable SID Filtering on every trust to neutralize cross-domain sIDHistory abuse, and audit sIDHistory on a schedule.
- Restrict SeSecurityPrivilege to Administrators and watch for its use.
- Prefer explicit DENY over absent ALLOW, and put privileged accounts in Protected Users.
MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Access Token Manipulation | T1134 | Token/SID anomalies in logon events |
| SID-History Injection | T1134.005 | Non-empty sIDHistory on non-migrated accounts |
| File/Directory Permissions Modification | T1222.001 | 4670; icacls/SetNamedSecurityInfo in 4688 |
| Impair Defenses: Disable/Modify Tools | T1562.001 | 4907 blanking a SACL; 4663 gaps |
| Permission Groups Discovery | T1069.001 / .002 | Bulk SID/group enumeration |
12. Tools
| Tool | Description | Link |
|---|---|---|
| AccessChk | Dumps effective permissions and finds NULL/weak DACLs | learn.microsoft.com |
| icacls | Built-in ACL viewer/editor with SDDL shorthand | (built-in) |
| Get-Acl / Set-Acl | PowerShell SD read/write, exposes .Sddl | (built-in) |
| WinDbg | Kernel-side !sd, !token, OBJECT_HEADER inspection | learn.microsoft.com |
| Process Hacker | GUI view of token SIDs and object security | processhacker.sourceforge.io |
| WinObj | Browse Object Manager namespace and per-object security | learn.microsoft.com |
Summary
- A SID is the immutable, never-reused name Windows checks for every authorization decision: display names are cosmetic, SIDs are ground truth.
- The access token carries the user SID plus all group SIDs (including any from sIDHistory), and the kernel compares those against an object’s DACL via nt!SeAccessCheck.
- The SECURITY_DESCRIPTOR binds owner, group, DACL, and SACL; a present-but-NULL DACL silently grants everyone full access, while an empty DACL denies everyone.
- SID-History injection (T1134.005) and SACL stripping (T1562.001) are the two abuse primitives worth hunting hardest; watch 4670, 4907, and non-empty sIDHistory.
- Enable Object Access and Policy Change auditing, restrict SeSecurityPrivilege, enable SID Filtering on trusts, and baseline SDDL on sensitive objects so a tampered DACL stands out.
References
- Security Identifiers | Microsoft Learn (Windows Server)
- Security Identifiers – Win32 Apps | Microsoft Learn (Win32 API Reference)
- Security Descriptors – Win32 Apps | Microsoft Learn (Win32 API Reference)
- [MS-DTYP]: SECURITY_DESCRIPTOR | Microsoft Learn (Windows Open Specification)
- [MS-DTYP]: SID | Microsoft Learn (Windows Open Specification)
- Access Token Manipulation: SID-History Injection, Sub-technique T1134.005 | MITRE ATT&CK
Fibers: User-Mode Cooperative Threads
Objective: Understand the internals of Windows fibers (how they relate to the TEB, the undocumented FIBER structure, Fiber Local Storage, and the cooperative context switch performed entirely in user mode) so defenders can recognize and detect adversarial use of fiber APIs for stealthy in-process execution.
1. Cooperative vs. Preemptive Scheduling
A thread is the Windows kernel’s unit of execution. The scheduler picks ready threads, slices CPU time, and preempts them at quantum boundaries — all driven from ntoskrnl.exe. A fiber is different: it is a unit of execution that the kernel does not know about. Fibers run inside threads, and the application — not the OS — chooses when one fiber yields and another runs.
Two consequences follow immediately:
- A fiber switch never crosses the user/kernel boundary. No syscall is issued. SwitchToFiber lives in KernelBase.dll and returns without touching ntoskrnl.
- From the kernel’s perspective, all activity performed by a fiber is attributed to the thread that runs it. Accessing TLS from a fiber accesses the thread’s TLS, not a per-fiber slot.
This is the root of both the elegance and the security relevance of fibers: they are coroutines built directly into the Win32 ABI, with stack pivots and register saves the kernel cannot see.
2. The Fiber Execution Model
A fiber consists of three things: a stack, a saved CPU context (registers, instruction pointer, SEH frame), and a start routine that receives an opaque parameter. A thread becomes “fiber-aware” by calling ConvertThreadToFiber, and it remains a fiber host until it calls ConvertFiberToThread.
| Rule | Behavior |
|---|---|
| Must convert first | You cannot call SwitchToFiber from a thread until ConvertThreadToFiber runs. |
| Fiber function returning | If a fiber’s start routine returns, the host thread calls ExitThread and terminates. |
| Self-delete | If the currently running fiber calls DeleteFiber on itself, the host thread exits. |
| Cross-thread delete | Deleting a fiber that is the selected fiber of another thread will likely crash that thread — its stack just disappeared. |
| Cross-thread switch | SwitchToFiber accepts a fiber created by a different thread; the caller becomes the new host. |
These rules are load-bearing — most fiber bugs (and several known abuse primitives) come from violating them.
3. TEB Layout and the FIBER Structure
The Thread Environment Block (TEB) tracks the per-thread fiber state. Three fields matter:
| Field | Type | Role |
|---|---|---|
| NtTib.FiberData | PVOID | Pointer to the current fiber’s FIBER structure |
| HasFiberData | USHORT : 1 | Bitfield set by ConvertThreadToFiberEx; indicates the thread hosts fibers |
| FlsData | PVOID | Pointer to the FLS slot array for the current fiber |
ConvertThreadToFiberEx calls NtCurrentTeb(), checks Teb->HasFiberData, and if the thread is already a fiber returns with ERROR_ALREADY_FIBER. Otherwise it allocates a FIBER structure on the process heap via RtlAllocateHeap and stores its address in NtTib.FiberData.
The FIBER struct itself is not officially documented. The shape below is reconstructed from ReactOS sources and public symbols and is subject to change across Windows versions:
// Reconstructed from public symbols / ReactOS — illustrative only.
typedef struct _FIBER {
    PVOID   FiberData;         // lpParameter passed at creation
    PVOID   ExceptionList;     // Top of SEH chain (NT_TIB.ExceptionList)
    PVOID   StackBase;         // High end of the fiber stack
    PVOID   StackLimit;        // Low end (guard page)
    PVOID   DeallocationStack; // Original VirtualAlloc base
    CONTEXT FiberContext;      // Saved CPU state: RIP, RSP, RBP, RBX, ...
    ULONG   FiberFlags;        // FIBER_FLAG_FLOAT_SWITCH, etc.
    PVOID   ActivationContext; // Per-fiber activation context stack
    PVOID   FlsSlots;          // Per-fiber FLS slot array
} FIBER, *PFIBER;
You must never read or write this structure directly. The Win32 fiber functions manage its contents; treating the returned LPVOID as opaque is part of the contract.
4. The Core Fiber API
The full surface is small. Most of winbase.h and fibersapi.h boils down to these functions:
| Function | Purpose |
|---|---|
| ConvertThreadToFiber | Promote the calling thread into a fiber; required first |
| ConvertThreadToFiberEx | As above; accepts FIBER_FLAG_FLOAT_SWITCH |
| CreateFiber | Allocate stack + FIBER struct; record entry point and parameter |
| CreateFiberEx | As above; accepts dwStackCommitSize and flags |
| SwitchToFiber | Cooperative context switch to the supplied fiber |
| DeleteFiber | Free the fiber’s stack, context, and FIBER data |
| ConvertFiberToThread | Demote back to a plain thread; required to avoid leaks |
| GetCurrentFiber | Returns the current FIBER address (intrinsic, no CALL) |
| GetFiberData | Returns the lpParameter value (intrinsic, no CALL) |
The exact CreateFiber signature, per MSDN:
LPVOID CreateFiber(
    SIZE_T dwStackSize,                   // initial commit size; 0 = default, stack grows up to the reserve (1 MB by default)
    LPFIBER_START_ROUTINE lpStartAddress, // void StartRoutine(LPVOID lpParameter)
    LPVOID lpParameter                    // passed to the fiber function
);
GetCurrentFiber and GetFiberData are compiler intrinsics on MSVC: they inline directly to a gs:[0x20] (x64) or fs:[0x10] (x86) read of NtTib.FiberData. They produce no import thunk and no CALL instruction, which has direct consequences for IAT-based detection.
5. Fiber Lifecycle: A Minimal Example
This walks the canonical create → switch → yield → delete sequence. Note how g_mainFiber is the fiber identity of the original thread, returned by ConvertThreadToFiber.
#include <windows.h>
#include <stdio.h>
LPVOID g_mainFiber = NULL;
LPVOID g_workFiber = NULL;
VOID CALLBACK WorkerFiberProc(LPVOID lpParam) {
    printf("[worker] running on fiber %p, param=%p\n",
           GetCurrentFiber(), lpParam);
    // Cooperative yield: control returns to the main fiber.
    SwitchToFiber(g_mainFiber);
    printf("[worker] resumed; returning will ExitThread()\n");
    SwitchToFiber(g_mainFiber); // never let the routine return
}
int main(void) {
    // Promote thread; TEB->HasFiberData becomes 1.
    g_mainFiber = ConvertThreadToFiber(NULL);
    // 64 KiB stack; entry = WorkerFiberProc; param = 0xDEADBEEF.
    g_workFiber = CreateFiber(0x10000, WorkerFiberProc, (LPVOID)0xDEADBEEF);
    SwitchToFiber(g_workFiber); // first run of worker
    printf("[main] back from worker\n");
    SwitchToFiber(g_workFiber); // resume worker
    DeleteFiber(g_workFiber);   // safe: not the running fiber
    ConvertFiberToThread();     // demote; release fiber bookkeeping
    return 0;
}
Forgetting ConvertFiberToThread leaks the main fiber’s FIBER allocation on the process heap. Forgetting to yield back before the worker returns terminates the host thread via ExitThread.
6. Context Switching Internals
SwitchToFiber is the heart of the API. Conceptually, it performs:
- Save the current CPU state (RBX, RBP, RDI, RSI, R12–R15, RSP, RIP on x64) into the current fiber’s FiberContext.
- Save the SEH chain head (NtTib.ExceptionList) and stack bounds (StackBase, StackLimit) into the current FIBER.
- If FIBER_FLAG_FLOAT_SWITCH is set, save the XMM/MMX/x87 state.
- Update NtTib.FiberData to point at the target FIBER.
- Restore the target fiber’s stack bounds, SEH chain, FLS pointer, and CPU registers.
- Return to the saved instruction pointer of the target; execution resumes there on the target’s stack.
Critically, this is a pure user-mode operation. No syscall, no int 2e, no ETW event from Microsoft-Windows-Kernel-Process. The host thread’s kernel-visible state (KTHREAD, ETHREAD) is unchanged; only RIP/RSP move from the kernel’s view.
; Conceptual sketch of SwitchToFiber on x64 (rcx = target FIBER); not real disassembly
mov rax, gs:[0x20] ; rax = current FIBER (NtTib.FiberData)
mov [rax + FiberContextOff + Rsp], rsp
mov [rax + FiberContextOff + Rip], <return addr>
mov gs:[0x20], rcx ; NtTib.FiberData = target
; ... restore target ...
mov rsp, [rcx + FiberContextOff + Rsp]
jmp qword [rcx + FiberContextOff + Rip]
7. Fiber Local Storage (FLS)
TLS is per-thread. During a fiber switch the TEB’s TLS array is not swapped, so two fibers sharing a thread share TLS — a classic source of corruption when porting thread-based libraries to fibers. FLS solves this: it is per-fiber, and SwitchToFiber updates TEB->FlsData to the incoming fiber’s slot array.
| Function | Purpose |
|---|---|
| FlsAlloc(PFLS_CALLBACK_FUNCTION) | Allocate an FLS index; optional destructor callback |
| FlsSetValue(DWORD, PVOID) | Store a per-fiber value at the given index |
| FlsGetValue(DWORD) | Read the current fiber’s value at the given index |
| FlsFree(DWORD) | Release the index; callbacks fire for live fibers |
The destructor callback pointers are kept process-wide in PEB->FlsCallback. They fire on fiber deletion and thread exit, and — as covered below — they are a known abuse target.
DWORD g_flsIndex;
VOID WINAPI OnFlsDestroy(PVOID p) {
    HeapFree(GetProcessHeap(), 0, p);
}
VOID CALLBACK FiberA(LPVOID _) {
    char *buf = (char*)HeapAlloc(GetProcessHeap(), 0, 32);
    lstrcpyA(buf, "fiber-A-private");
    FlsSetValue(g_flsIndex, buf);
    SwitchToFiber(g_mainFiber);
    printf("[A] still mine: %s\n", (char*)FlsGetValue(g_flsIndex));
    SwitchToFiber(g_mainFiber);
}
int wmain(void) {
    g_mainFiber = ConvertThreadToFiber(NULL);
    g_flsIndex = FlsAlloc(OnFlsDestroy);
    // ... create FiberA, FiberB, switch between them ...
    // Each fiber sees its own FlsGetValue(g_flsIndex) result.
}
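To make the TLS/FLS distinction concrete without a Windows toolchain, here is a toy model in portable C. It does not call the Win32 API: the `toy_thread` and `toy_fiber` types and every `toy_*` function are hypothetical names invented for illustration. The only behavior modeled is the one described in this section, namely that a fiber switch repoints the FLS slot array while the TLS array stays with the thread.

```c
#include <assert.h>
#include <stddef.h>

#define SLOT_COUNT 4

/* Toy stand-ins: each fiber owns a slot array, the thread owns one shared array. */
struct toy_fiber  { void *fls[SLOT_COUNT]; };
struct toy_thread { void *tls[SLOT_COUNT]; struct toy_fiber *current; };

/* Models the one effect of a switch that matters here:
 * the "current FLS array" pointer moves, the TLS array does not. */
void toy_switch(struct toy_thread *t, struct toy_fiber *target)
{
    assert(t != NULL && target != NULL);
    t->current = target;
}

void toy_tls_set(struct toy_thread *t, size_t i, void *v)
{
    assert(t != NULL && i < SLOT_COUNT);
    t->tls[i] = v;
}

void *toy_tls_get(const struct toy_thread *t, size_t i)
{
    assert(t != NULL && i < SLOT_COUNT);
    return t->tls[i];
}

void toy_fls_set(struct toy_thread *t, size_t i, void *v)
{
    assert(t != NULL && t->current != NULL && i < SLOT_COUNT);
    t->current->fls[i] = v;
}

void *toy_fls_get(const struct toy_thread *t, size_t i)
{
    assert(t != NULL && t->current != NULL && i < SLOT_COUNT);
    return t->current->fls[i];
}
```

In this model, a value stored with `toy_tls_set` while fiber A runs is still returned by `toy_tls_get` after switching to fiber B, whereas `toy_fls_get` returns B's own (initially empty) slot. That is the leak a thread-based library hits when it is run on fibers unchanged.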
8. Building a Round-Robin Cooperative Scheduler
Fibers shine when modeling cooperative pipelines: parsers, generators, state machines. A trivial scheduler is a dispatcher fiber that round-robins through worker fibers, each of which yields back via SwitchToFiber(g_mainFiber).
#include <windows.h>
#include <stdio.h>
#define N 3
LPVOID g_workers[N];
LPVOID g_mainFiber;
VOID CALLBACK Worker(LPVOID id) {
    for (int i = 0; i < 4; ++i) {
        // Explicit 64-bit cast: ULONG_PTR is only 32 bits wide on x86, but %llu expects 64.
        printf("[worker %llu] step %d\n", (unsigned long long)(ULONG_PTR)id, i);
        SwitchToFiber(g_mainFiber); // yield
    }
    // Final yield: never return from a fiber routine.
    SwitchToFiber(g_mainFiber);
}
int main(void) {
    g_mainFiber = ConvertThreadToFiber(NULL);
    for (ULONG_PTR i = 0; i < N; ++i)
        g_workers[i] = CreateFiber(0, Worker, (LPVOID)i);
    // Four rounds match the four steps each worker performs.
    for (int round = 0; round < 4; ++round)
        for (int i = 0; i < N; ++i)
            SwitchToFiber(g_workers[i]);
    for (int i = 0; i < N; ++i) DeleteFiber(g_workers[i]);
    ConvertFiberToThread();
    return 0;
}
This is the same pattern Microsoft SQL Server used for its historical “lightweight pooling” / fiber mode: one OS thread, many SQL user contexts.
9. Legitimate Use Cases and Pitfalls
| Use Case | Reason |
|---|---|
| Coroutines / generators | Native stack switching with no setjmp tricks |
| Porting cooperative legacy code | UNIX swapcontext-style schedulers map cleanly |
| Database engines | SQL Server fiber mode for high-concurrency workloads |
| Game engines / scripting hosts | Per-script execution context with explicit yield |
Pitfalls are sharp:
- COM is apartment-affinitive to threads, not fibers. Initializing COM on one fiber and using it from another corrupts COM bookkeeping.
- CRT and many MS libraries stash state in TLS. Switching fibers leaves that state behind, producing subtle corruption.
- Critical sections record the thread as the owner — a different fiber on the same thread re-enters without blocking.
- Stack-cookies and __try/__except rely on SEH chain integrity; SwitchToFiber handles this, but raw RtlInstallFunctionTableCallback on a fiber stack must use the fiber’s StackBase/StackLimit.
10. Common Attacker Techniques
Fibers are attractive to adversaries because the entire execution primitive lives in user mode — no NtCreateThread, no CreateRemoteThread, no kernel ETW event for the act of switching execution. The patterns below are documented in public threat-research literature; described conceptually here for detection engineers.
| Technique | Description |
|---|---|
| In-process shellcode via SwitchToFiber | Allocate PAGE_EXECUTE_READWRITE memory, copy a payload, call ConvertThreadToFiber then CreateFiber with the payload as lpStartAddress, then SwitchToFiber; execution begins with no new thread |
| Fiber-based ROP staging | A fiber’s saved CONTEXT includes RIP and RSP; manipulating a FIBER struct’s context fields lets an attacker pivot the stack on SwitchToFiber |
| PEB->FlsCallback overwrite | Overwrite an entry in the process-wide FLS callback array; on the next FlsFree or fiber/thread teardown the attacker-controlled pointer is invoked with attacker-controlled data |
| TLS evasion via FLS | Hide per-task state in FLS slots that defensive tooling enumerating TLS will miss |
| API hiding via intrinsics | GetCurrentFiber/GetFiberData produce no IAT entry; static analysis missing gs:[0x20] reads will not see fiber-aware code |
The base ATT&CK parent for fiber-based in-process execution is T1055 Process Injection; MITRE has not assigned a fiber-specific sub-technique, so the closest analogue is T1055.004 (APC) which shares the “queue execution to a thread’s user-mode context” model.
11. Defensive Strategies & Detection
There is no kernel event for SwitchToFiber. Detection must focus on the setup that precedes fiber-based execution (RWX allocation, suspicious entry points) and on memory forensics of fiber state at rest.
Sysmon coverage for the surrounding behavior:
| Event ID | Signal |
|---|---|
| 1 | Process Create: establish baseline lineage |
| 8 | CreateRemoteThread: co-occurs with cross-process fiber staging |
| 10 | ProcessAccess: reflective loaders reading remote memory before fiber dispatch |
| 17/18 | Named-pipe create/connect: common multi-stage loader IPC |
| 25 | ProcessTampering: image-region tampering in a fiber host |
ETW providers worth subscribing:
- Microsoft-Windows-Threat-Intelligence: flags VirtualAlloc/VirtualProtect with PAGE_EXECUTE_*, the precursor to fiber shellcode staging.
- Microsoft-Windows-Kernel-Process: does not see fiber switches but covers process/thread lifecycle.
- A user-mode consumer hooking NtAllocateVirtualMemory + NtProtectVirtualMemory gives the strongest pre-execution signal.
Memory forensics indicators:
- Walk TEB.NtTib.FiberData on every thread. Threads with HasFiberData == 1 in processes that have no business using fibers are immediately interesting.
- Use Volatility malfind to surface private, executable, non-image-backed pages, the target of a fiber-staged payload.
- Dump PEB->FlsCallback and verify every entry resolves to an expected module’s .text section.
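The last indicator, checking that every FLS callback pointer lands in a known module's .text section, reduces to a range test once the pointers and section bounds have been extracted. The sketch below assumes both lists were already recovered from a memory image by other tooling. The `text_range` type and the `find_rogue_callbacks` function are hypothetical names, and how the callback array is located on a given Windows build is deliberately out of scope.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open address range [start, end) of one module's .text section. */
struct text_range {
    uintptr_t start;
    uintptr_t end;
};

/* Return 1 when addr lies inside any known .text range, else 0. */
static int in_known_text(uintptr_t addr,
                         const struct text_range *ranges, size_t n_ranges)
{
    for (size_t i = 0; i < n_ranges; ++i) {
        if (addr >= ranges[i].start && addr < ranges[i].end) return 1;
    }
    return 0;
}

/* Count callback pointers that are set but fall outside every known range.
 * Slot indices of the offenders are written to bad[] (up to bad_cap entries). */
size_t find_rogue_callbacks(const uintptr_t *callbacks, size_t n_callbacks,
                            const struct text_range *ranges, size_t n_ranges,
                            size_t *bad, size_t bad_cap)
{
    assert(callbacks != NULL || n_callbacks == 0);
    size_t count = 0;
    for (size_t i = 0; i < n_callbacks; ++i) {
        if (callbacks[i] == 0) continue;   /* unused slot */
        if (in_known_text(callbacks[i], ranges, n_ranges)) continue;
        if (bad != NULL && count < bad_cap) bad[count] = i;
        ++count;
    }
    return count;
}
```

A non-zero return value is a lead, not a verdict: JIT engines and some instrumentation frameworks legitimately register callbacks in private memory, so the result should be triaged against what the process is expected to host.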
Sigma sketch for the cross-process precursor to fiber-based payload staging:
title: Suspicious ProcessAccess Preceding User-Mode Fiber Execution
id: 8f5c1d6e-3c7b-4b1f-9e1e-7e3e6e2b0a1f
logsource:
    product: windows
    service: sysmon
detection:
    selection:
        EventID: 10
        GrantedAccess:
            - '0x1fffff' # PROCESS_ALL_ACCESS
            - '0x1f0fff'
        TargetImage|endswith:
            - '\explorer.exe'
            - '\svchost.exe'
    filter_legit:
        SourceImage|endswith:
            - '\MsMpEng.exe'
            - '\SenseIR.exe'
    condition: selection and not filter_legit
level: high
tags:
    - attack.t1055
    - attack.t1106
Hardening:
- SetProcessMitigationPolicy with ProcessDynamicCodePolicy (Arbitrary Code Guard) blocks creation of new executable pages, defeating fiber shellcode staging.
- Control Flow Guard restricts indirect-call targets, narrowing SwitchToFiber and FLS-callback abuse to valid entry points.
- HVCI / memory integrity prevents kernel-side tampering of FIBER structures via vulnerable drivers.
- WDAC / AppLocker policies restrict which binaries and scripts may run at all, which raises the cost of delivering any in-process execution primitive; blocking PAGE_EXECUTE_* allocations themselves is the job of ACG.

12. Tools for Fiber Analysis
| Tool | Description | Link |
|---|---|---|
| WinDbg | Dump TEB, walk NtTib.FiberData, inspect FIBER.FiberContext | microsoft.com |
| Process Hacker | Enumerate threads, inspect TEB, examine private RWX regions | processhacker.sf.io |
| Process Monitor | Capture VirtualAlloc/VirtualProtect sequences preceding fiber dispatch | sysinternals.com |
| Volatility 3 | windows.malfind, TEB plugins, FLS callback inspection | volatilityfoundation.org |
| pykd / WinDbg JS | Scripted walks of FIBER chains across all threads | githomelab.ru/pykd |
| x64dbg | User-mode debugging of fiber-aware binaries; trace gs:[0x20] reads | x64dbg.com |
| Ghidra | Static analysis; recognize GetCurrentFiber intrinsic pattern | ghidra-sre.org |
| Sysmon | Surrounding telemetry (Events 1, 8, 10, 25) | sysinternals.com |
A minimal WinDbg recipe to surface fiber-hosting threads in a captured process:
0:000> !teb
TEB at 000000abcd123000
...
NtTib.FiberData: 0000020fabcde000
...
0:000> dt ntdll!_TEB @$teb HasFiberData
0:000> dq 0000020fabcde000 L40 ; raw FIBER bytes, layout version-dependent
13. MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Process Injection | T1055 | Memory scan for private RWX regions; ETW TI on NtAllocateVirtualMemory |
| Process Injection: Asynchronous Procedure Call | T1055.004 | Closest published sub-technique to fiber-based in-process execution |
| Native API | T1106 | API-call auditing of CreateFiber/SwitchToFiber/FlsAlloc |
| Reflective Code Loading | T1620 | Image-load anomalies; fiber entry point in non-image-backed memory |
| Impair Defenses: Disable or Modify Tools | T1562.001 | ETW/AMSI hook integrity checks; user-mode hook auditing |
MITRE ATT&CK does not currently list a “Fiber Injection” sub-technique (current as of v16.1). Vendor research treats fiber-based execution as a variant of T1055; map accordingly.
Summary
- A fiber is a user-mode cooperative thread invisible to the kernel scheduler: SwitchToFiber performs a stack and register swap entirely in KernelBase.dll with no syscall.
- The TEB exposes the fiber state via NtTib.FiberData, HasFiberData, and FlsData; the FIBER structure itself is undocumented and version-dependent.
- TLS is per-thread and is not swapped on a fiber switch; FLS is per-fiber and is swapped, with destructor callbacks tracked in PEB->FlsCallback.
- Adversaries abuse fibers for in-process shellcode execution, ROP staging via the saved CONTEXT, and code execution via PEB->FlsCallback overwrites, none of which trigger thread-creation telemetry.
- Detect via pre-execution signals (ETW TI on RWX allocations, Sysmon Event IDs 8/10/25), memory forensics on private executable regions and FlsCallback integrity, and hardening with ACG, CFG, and HVCI.
References
- Fibers – Win32 apps | Microsoft Learn
- Using Fibers – Win32 apps | Microsoft Learn
- CreateFiber function (winbase.h) – Win32 apps | Microsoft Learn
- ConvertThreadToFiber function (winbase.h) – Win32 apps | Microsoft Learn
- Process Injection, Technique T1055 – Enterprise | MITRE ATT&CK®
- About Processes and Threads – Win32 apps | Microsoft Learn
Jobs and Silos: Process Grouping and Resource Limits
Objective: Understand how the Windows kernel uses Job Objects and Silo Objects to group processes, enforce CPU/memory/network limits, and provide the namespace isolation that underpins Windows containers — and how defenders detect and harden against their abuse.
1. What Is a Job Object?
A job object lets a group of processes be managed as a single unit. It is a namable, securable, sharable kernel object that controls attributes of every process associated with it; operations on the job — limits, termination, accounting — apply to all member processes at once.
In the kernel the object is the undocumented executive type EJOB, allocated from kernel pool. Each process control block carries an EPROCESS.Job pointer linking it to its owning job. User mode never touches EJOB directly; it operates through a handle returned by CreateJobObject.
Before Windows 8 / Windows Server 2012, a process could belong to one job and jobs could not be nested. Windows 8 introduced nested jobs, allowing a process to participate in a hierarchy where the effective limit is the most restrictive ancestor.
| Object Type | Description |
|---|---|
| EJOB | Kernel job object; groups processes, holds limits and accounting |
| EPROCESS.Job | Per-process pointer to its owning job |
| Named job | Job published under \Sessions\<N>\BaseNamedObjects\, openable by name |
| Anonymous job | Handle-only job, no namespace entry, shared by duplication/inheritance |

2. Core Job Object APIs
The job lifecycle is driven by a small, stable Win32 surface.
| Function | Purpose |
|---|---|
| CreateJobObject | Create, or open if named, a job object |
| OpenJobObject | Open an existing named job |
| AssignProcessToJobObject | Add a process to a job |
| SetInformationJobObject | Apply limits and policy to the job |
| QueryInformationJobObject | Read limits, accounting, and peak usage |
| TerminateJobObject | Kill every process in the job |
| IsProcessInJob | Test whether a process already belongs to a job |
HANDLE CreateJobObject(LPSECURITY_ATTRIBUTES lpJobAttributes, LPCWSTR lpName);
BOOL AssignProcessToJobObject(HANDLE hJob, HANDLE hProcess);
BOOL SetInformationJobObject(HANDLE hJob, JOBOBJECTINFOCLASS JobObjectInformationClass,
                             LPVOID lpJobObjectInformation, DWORD cbJobObjectInformationLength);
BOOL QueryInformationJobObject(HANDLE hJob, JOBOBJECTINFOCLASS JobObjectInformationClass,
                               LPVOID lpJobObjectInformation, DWORD cbJobObjectInformationLength,
                               LPDWORD lpReturnLength);
BOOL TerminateJobObject(HANDLE hJob, UINT uExitCode);
3. Basic Limits: CPU, Memory, and Process Count
JOBOBJECT_BASIC_LIMIT_INFORMATION carries the foundational controls.
typedef struct _JOBOBJECT_BASIC_LIMIT_INFORMATION {
    LARGE_INTEGER PerProcessUserTimeLimit;
    LARGE_INTEGER PerJobUserTimeLimit;
    DWORD         LimitFlags;
    SIZE_T        MinimumWorkingSetSize;
    SIZE_T        MaximumWorkingSetSize;
    DWORD         ActiveProcessLimit;
    ULONG_PTR     Affinity;
    DWORD         PriorityClass;
    DWORD         SchedulingClass;
} JOBOBJECT_BASIC_LIMIT_INFORMATION;
The LimitFlags bitmask selects which fields the kernel enforces.
| Limit Flag | Description |
|---|---|
| JOB_OBJECT_LIMIT_PROCESS_TIME | Per-process user-mode CPU cap (100 ns ticks); process killed when exceeded |
| JOB_OBJECT_LIMIT_JOB_TIME | Job-wide CPU time cap |
| JOB_OBJECT_LIMIT_WORKINGSET | Min/max working set per process |
| JOB_OBJECT_LIMIT_ACTIVE_PROCESS | Caps active process count; over-limit assignment terminates the process |
| JOB_OBJECT_LIMIT_AFFINITY | Forces a processor affinity mask |
| JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE | Kills all processes when the last job handle closes |
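When triaging a job, for example from the LimitFlags value shown by a debugger, it helps to turn the raw bitmask back into names. The sketch below is portable C that does not include windows.h. The bit values are the JOB_OBJECT_LIMIT_* constants as defined in winnt.h, copied into a local table so the code builds anywhere. The `describe_limit_flags` function and `k_limit_flags` table are hypothetical names, and only the flags discussed in this article are named; any other bit is reported as hex.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

struct flag_name { uint32_t bit; const char *name; };

/* Subset of JOB_OBJECT_LIMIT_* values from winnt.h, prefix dropped. */
static const struct flag_name k_limit_flags[] = {
    { 0x00000001u, "WORKINGSET" },
    { 0x00000002u, "PROCESS_TIME" },
    { 0x00000004u, "JOB_TIME" },
    { 0x00000008u, "ACTIVE_PROCESS" },
    { 0x00000010u, "AFFINITY" },
    { 0x00000100u, "PROCESS_MEMORY" },
    { 0x00000200u, "JOB_MEMORY" },
    { 0x00002000u, "KILL_ON_JOB_CLOSE" },
};

/* Append s to out (truncating if needed); returns the new length. */
static size_t append(char *out, size_t cap, size_t used, const char *s)
{
    size_t len = strlen(s);
    if (len >= cap - used) len = cap - used - 1;   /* keep room for NUL */
    memcpy(out + used, s, len);
    out[used + len] = '\0';
    return used + len;
}

/* Write a '|'-separated list of the set flags into out and return out. */
char *describe_limit_flags(uint32_t flags, char *out, size_t cap)
{
    assert(out != NULL && cap > 0);
    size_t used = 0;
    uint32_t known = 0;
    out[0] = '\0';
    for (size_t i = 0; i < sizeof k_limit_flags / sizeof k_limit_flags[0]; ++i) {
        known |= k_limit_flags[i].bit;
        if ((flags & k_limit_flags[i].bit) == 0) continue;
        if (used > 0) used = append(out, cap, used, "|");
        used = append(out, cap, used, k_limit_flags[i].name);
    }
    if ((flags & ~known) != 0) {                   /* bits not in the table */
        char hex[16];
        snprintf(hex, sizeof hex, "0x%X", (unsigned)(flags & ~known));
        if (used > 0) used = append(out, cap, used, "|");
        append(out, cap, used, hex);
    }
    return out;
}
```

For instance, a LimitFlags value of 0x2008 decodes to the ACTIVE_PROCESS and KILL_ON_JOB_CLOSE pair, a combination typical of a sandbox that bounds its process tree and reaps it on exit.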
JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE is the cornerstone of any sandbox: if the controlling process dies, the entire tree is reaped, leaving no orphaned children.
#include <windows.h>
int main(void) {
    // Explicit W variants so the wide-string literals compile with or without UNICODE defined.
    HANDLE hJob = CreateJobObjectW(NULL, L"Sandbox_Demo"); // named for observability
    if (!hJob) return (int)GetLastError();
    JOBOBJECT_EXTENDED_LIMIT_INFORMATION eli = { 0 };
    eli.BasicLimitInformation.LimitFlags =
        JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE |  // tear down tree on handle loss
        JOB_OBJECT_LIMIT_ACTIVE_PROCESS;      // bound process count
    eli.BasicLimitInformation.ActiveProcessLimit = 4;
    if (!SetInformationJobObject(hJob, JobObjectExtendedLimitInformation, &eli, sizeof(eli))) {
        DWORD err = GetLastError();
        CloseHandle(hJob);
        return (int)err;
    }
    STARTUPINFOW si = { sizeof(si) };
    PROCESS_INFORMATION pi = { 0 };
    // Create suspended so we can assign before any code runs
    if (!CreateProcessW(L"C:\\Windows\\System32\\notepad.exe", NULL, NULL, NULL,
                        FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi)) {
        DWORD err = GetLastError();
        CloseHandle(hJob);
        return (int)err;
    }
    if (AssignProcessToJobObject(hJob, pi.hProcess))
        ResumeThread(pi.hThread);
    else
        TerminateProcess(pi.hProcess, 1);     // never let it run outside the job
    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    CloseHandle(hJob); // KILL_ON_JOB_CLOSE terminates notepad here
    return 0;
}
4. Extended and Rate Limits
JOBOBJECT_EXTENDED_LIMIT_INFORMATION embeds the basic structure as BasicLimitInformation and adds memory governance: ProcessMemoryLimit (per-process commit, needs JOB_OBJECT_LIMIT_PROCESS_MEMORY), JobMemoryLimit (job-wide commit, needs JOB_OBJECT_LIMIT_JOB_MEMORY), and the continuously tracked PeakProcessMemoryUsed / PeakJobMemoryUsed. The two memory limits are independent — a 100 MB job-wide cap can coexist with a 10 MB per-process cap.
JOBOBJECT_EXTENDED_LIMIT_INFORMATION eli = { 0 };
eli.BasicLimitInformation.LimitFlags =
    JOB_OBJECT_LIMIT_PROCESS_MEMORY | JOB_OBJECT_LIMIT_JOB_MEMORY;
eli.ProcessMemoryLimit = 10 * 1024 * 1024;  // 10 MB per process
eli.JobMemoryLimit = 100 * 1024 * 1024;     // 100 MB job-wide (independent)
SetInformationJobObject(hJob, JobObjectExtendedLimitInformation, &eli, sizeof(eli));
DWORD ret = 0;
QueryInformationJobObject(hJob, JobObjectExtendedLimitInformation, &eli, sizeof(eli), &ret);
printf("PeakJobMemoryUsed: %zu bytes\n", eli.PeakJobMemoryUsed);
CPU throttling uses JOBOBJECT_CPU_RATE_CONTROL_INFORMATION.
typedef struct _JOBOBJECT_CPU_RATE_CONTROL_INFORMATION {
    DWORD ControlFlags;
    union {
        DWORD CpuRate;
        DWORD Weight;
        struct { WORD MinRate; WORD MaxRate; } DUMMYSTRUCTNAME;
    } DUMMYUNIONNAME;
} JOBOBJECT_CPU_RATE_CONTROL_INFORMATION;
| Control Flag | Value | Behaviour |
|---|---|---|
| JOB_OBJECT_CPU_RATE_CONTROL_ENABLE | 0x1 | Enables CPU rate control |
| JOB_OBJECT_CPU_RATE_CONTROL_WEIGHT_BASED | 0x2 | Rate derived from relative weight vs. other jobs |
| JOB_OBJECT_CPU_RATE_CONTROL_HARD_CAP | 0x4 | Hard cap; no job threads run after the budget is spent until next interval |
| JOB_OBJECT_CPU_RATE_CONTROL_NOTIFY | 0x8 | Notifies when the rate limit is exceeded |
JOBOBJECT_CPU_RATE_CONTROL_INFORMATION cpu = { 0 };
cpu.ControlFlags = JOB_OBJECT_CPU_RATE_CONTROL_ENABLE |
                   JOB_OBJECT_CPU_RATE_CONTROL_HARD_CAP;
cpu.CpuRate = 2000; // 20.00% of CPU cycles per scheduling interval (units of 1/100 percent)
// Windows containers (non-Hyper-V) use weight-based control instead:
// cpu.ControlFlags = JOB_OBJECT_CPU_RATE_CONTROL_ENABLE |
//                    JOB_OBJECT_CPU_RATE_CONTROL_WEIGHT_BASED;
// cpu.Weight = 5; // relative scheduling weight
SetInformationJobObject(hJob, JobObjectCpuRateControlInformation, &cpu, sizeof(cpu));
Network bandwidth is bounded with JOBOBJECT_NET_RATE_CONTROL_INFORMATION, which sets MaxBandwidth (outgoing bytes), a DscpTag, and ControlFlags that select which of those two settings are enforced.
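The CpuRate unit used in the rate-control snippet (cycles per 10,000, i.e. a percentage times 100) is easy to get wrong by a factor of 100. The helper below is a small portable sketch of the conversion in both directions. The function names are hypothetical, and the 1 to 10,000 valid range follows the documented constraint that CpuRate must not be 0.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a percentage (0 < pct <= 100) to CpuRate units (1..10000).
 * Returns 0 for an out-of-range input, which callers must treat as an error
 * because 0 is not a valid CpuRate. */
uint32_t cpu_rate_from_percent(double pct)
{
    if (!(pct > 0.0) || pct > 100.0) return 0;
    uint32_t rate = (uint32_t)(pct * 100.0 + 0.5);   /* round to nearest unit */
    return rate == 0 ? 1 : rate;                      /* clamp tiny values to the minimum */
}

/* Convert CpuRate units back to a percentage for display. */
double percent_from_cpu_rate(uint32_t rate)
{
    assert(rate >= 1 && rate <= 10000);
    return rate / 100.0;
}
```

With this helper, the value assigned in the snippet would be written as `cpu.CpuRate = cpu_rate_from_percent(20.0);`, which keeps the intent (20%) visible at the call site.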
5. Notification Limits and I/O Completion Ports
Not every limit should kill. JOBOBJECT_NOTIFICATION_LIMIT_INFORMATION defines soft limits that alert without termination, covering IoReadBytesLimit, IoWriteBytesLimit, per-job user time, and job memory. To receive these alerts, associate an I/O completion port via JOBOBJECT_ASSOCIATE_COMPLETION_PORT.
| Completion Message | Meaning |
|---|---|
| JOB_OBJECT_MSG_NEW_PROCESS | A process was added to the job |
| JOB_OBJECT_MSG_EXIT_PROCESS | A member process exited |
| JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO | Job is now empty |
| JOB_OBJECT_MSG_JOB_MEMORY_LIMIT | Job-wide commit limit was hit |
HANDLE hPort = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);
JOBOBJECT_ASSOCIATE_COMPLETION_PORT acp = { 0 };
acp.CompletionKey = hJob; // echoed back as the key
acp.CompletionPort = hPort;
SetInformationJobObject(hJob, JobObjectAssociateCompletionPortInformation, &acp, sizeof(acp));
DWORD msg; ULONG_PTR key; LPOVERLAPPED ov;
while (GetQueuedCompletionStatus(hPort, &msg, &key, &ov, INFINITE)) {
    switch (msg) {
    case JOB_OBJECT_MSG_NEW_PROCESS: /* child started */ break;
    case JOB_OBJECT_MSG_JOB_MEMORY_LIMIT: /* commit cap hit */ break;
    case JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO: return 0; // job empty
    }
}
6. Nested Jobs
On Windows 8 and later, assigning an already-jobbed process to a second job nests it. The kernel computes the effective limit as the minimum of the chain — a child job can only tighten, never loosen, an ancestor’s constraint.
// Parent job: 200 MB job-wide commit
HANDLE hParent = CreateJobObject(NULL, NULL);
JOBOBJECT_EXTENDED_LIMIT_INFORMATION p = { 0 };
p.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_JOB_MEMORY;
p.JobMemoryLimit = 200 * 1024 * 1024;
SetInformationJobObject(hParent, JobObjectExtendedLimitInformation, &p, sizeof(p));
AssignProcessToJobObject(hParent, hProc);
// Child job nested under parent: 100 MB
HANDLE hChild = CreateJobObject(NULL, NULL);
JOBOBJECT_EXTENDED_LIMIT_INFORMATION c = { 0 };
c.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_JOB_MEMORY;
c.JobMemoryLimit = 100 * 1024 * 1024;
SetInformationJobObject(hChild, JobObjectExtendedLimitInformation, &c, sizeof(c));
AssignProcessToJobObject(hChild, hProc); // Win8+ nests automatically
// Effective limit on hProc = min(200 MB, 100 MB) = 100 MB
For pre-Windows 8 compatibility, test membership first: on those systems, assigning an already-jobbed process to another job fails instead of nesting.
BOOL inJob = FALSE;
IsProcessInJob(hProc, NULL, &inJob); // NULL JobHandle = "any job"
if (inJob) {
    // Windows 7: cannot reassign (no nesting). Windows 8+: assignment nests.
}
AssignProcessToJobObject(hJob, hProc);
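The "minimum of the chain" rule for nested jobs can be written out as a few lines of portable C. This is a model of the arithmetic only, not of how the kernel stores or walks the hierarchy. The function name is hypothetical, and the convention that 0 means "this job sets no JobMemoryLimit" is an assumption made for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* limits[0] is the outermost job, limits[depth-1] the innermost.
 * A value of 0 means that job does not set a job-wide memory limit.
 * Returns the most restrictive limit in the chain, or 0 if none is set. */
size_t effective_job_memory_limit(const size_t *limits, size_t depth)
{
    assert(limits != NULL || depth == 0);
    size_t effective = 0;
    for (size_t i = 0; i < depth; ++i) {
        if (limits[i] == 0) continue;                  /* no limit at this level */
        if (effective == 0 || limits[i] < effective) {
            effective = limits[i];                     /* tighter than anything so far */
        }
    }
    return effective;
}
```

Feeding it the two limits from the nested example (200 MB for the parent, 100 MB for the child) yields 100 MB, and a looser child limit such as 300 MB under a 100 MB parent still yields 100 MB, which is the "tighten, never loosen" behavior described above.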
7. Inspecting Jobs at Runtime
Process Explorer and Process Hacker display a process’s job membership and its limits on a dedicated Job tab. WinObj reveals named job objects in the Object Manager namespace. In kernel debugging, walk and dump jobs directly.
0: kd> !process 0 0 notepad.exe ; find the EPROCESS
0: kd> dt nt!_EPROCESS Job <EPROCESS> ; read the Job pointer
0: kd> !job <EJOB-address> ; dump limits and member list
0: kd> dt nt!_EJOB JobFlags ; locate the silo/flags field
These are observation tools, not attack tooling: they let an analyst confirm exactly which processes share a job and what limits are in force.
8. Silos: From Jobs to Containers
Jobs alone do not isolate the namespace — they constrain resources but not what a process can name or see. Microsoft solved this with silos, effectively “super jobs.” A silo is a job object with the Silo flag set in the EJOB.JobFlags field.
There are two silo types:
| Silo Type | Use | Privilege |
|---|---|---|
| Application silo | Desktop Bridge / MSIX app isolation | Standard |
| Server silo | Windows (Docker) container support | Administrator |
When a silo is created, the kernel builds it its own root directory object, distinct from the host root — giving the silo a private object namespace. A server silo further owns an _ESERVERSILO_GLOBALS structure holding container-specific state, and is backed by a virtual disk, a registry hive, and a virtual network adapter.
| Kernel Function | Purpose |
|---|---|
| PsCreateSilo / PsCreateServerSilo | Create silo / server silo objects |
| PsAttachSiloToCurrentThread / PsDetachSiloFromCurrentThread | Bind/unbind a thread to a silo context |
| PsGetThreadServerSilo | Return the server silo a thread runs in |
| PsIsCurrentThreadInServerSilo | Boolean gate used to restrict syscalls inside a container |
; For understanding only — JobFlags layout is build-specific and undocumented.
0: kd> dt nt!_EJOB JobFlags
+0x0?? JobFlags : Uint4B ; a bit in this field marks the job as a silo
The _EJOB, _ESERVERSILO_GLOBALS, and JobFlags offsets are undocumented and shift between OS builds. Validate them against your target build with WinDbg dt before treating any offset as authoritative.

9. Windows Containers and the Host Compute Service
Windows Server containers are built on server silos. The Host Compute Service (HCS) orchestrates their lifecycle, wiring up the silo’s job-object resource controls, registry hive virtualization, and filesystem isolation. The filesystem layer is enforced by wcifs.sys, the Windows Container Isolation Filter Driver, which projects the container’s view over the host volume.
| Mode | Boundary | Notes |
|---|---|---|
| --isolation=process | Server silo, shared host kernel | Lighter, but escapes reach the host kernel |
| --isolation=hyperv | Utility VM + inner job object | VM enforces limits even if the inner job is escaped |
Process isolation shares the host kernel, which makes server-silo escape research directly relevant to defenders. Hyper-V isolation applies controls at both the VM and the inner container job object — a job escape still cannot exceed VM-level limits.

10. Common Attacker Techniques
| Technique | Description |
|---|---|
| Sandbox-aware keying | Payload detects a constrained job (low ActiveProcessLimit, tight memory cap) and alters behaviour to evade analysis |
| Debugger / UI blocking | Setting JOB_OBJECT_UILIMIT_HANDLES or JOB_OBJECT_UILIMIT_EXITWINDOWS to deny security-tool UI/handle access within the job |
| Breakaway abuse | Using JOB_OBJECT_LIMIT_BREAKAWAY_OK so child processes escape a controlling job’s limits and accounting |
| Child-tree concealment | Wrapping persistent processes in a job to manage and hide their descendant trees |
| Container / silo escape | Breaking out of a server silo’s namespace root to reach the host OS |
Adversaries also use the native API directly — CreateJobObject, AssignProcessToJobObject, SetInformationJobObject — to construct their own sandboxes around tooling, or to apply quotas that frustrate dynamic analysis.
11. Defensive Strategies & Detection
There is no dedicated Sysmon event for CreateJobObject or AssignProcessToJobObject as of Sysmon v15 — job manipulation is caught indirectly via process access, process creation, and ETW.
| Sysmon Event ID | Relevance |
|---|---|
| 1 (Process Create) | Children spawned under sandboxed jobs; correlate unusual ParentImage / IntegrityLevel |
| 10 (Process Access) | OpenProcess with PROCESS_SET_QUOTA (0x0100) plus PROCESS_TERMINATE (0x0001), the rights AssignProcessToJobObject requires, or PROCESS_ALL_ACCESS (0x1fffff) preceding job assignment |
| 17 / 18 (Pipe Created/Connected) | Named pipes visible across a silo namespace boundary during lateral movement |
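When triaging Event ID 10, it helps to expand the hexadecimal GrantedAccess value into named rights. The sketch below is illustrative only: the helper names are hypothetical, and the bit values are the PROCESS_* constants from winnt.h (note that PROCESS_SET_QUOTA is 0x0100 and PROCESS_SET_INFORMATION is 0x0200).

```python
# Process-specific access rights (winnt.h); standard rights are omitted.
PROCESS_RIGHTS = {
    0x0001: "PROCESS_TERMINATE",
    0x0002: "PROCESS_CREATE_THREAD",
    0x0008: "PROCESS_VM_OPERATION",
    0x0010: "PROCESS_VM_READ",
    0x0020: "PROCESS_VM_WRITE",
    0x0040: "PROCESS_DUP_HANDLE",
    0x0080: "PROCESS_CREATE_PROCESS",
    0x0100: "PROCESS_SET_QUOTA",
    0x0200: "PROCESS_SET_INFORMATION",
    0x0400: "PROCESS_QUERY_INFORMATION",
    0x0800: "PROCESS_SUSPEND_RESUME",
    0x1000: "PROCESS_QUERY_LIMITED_INFORMATION",
}

# AssignProcessToJobObject needs both of these on the process handle.
JOB_ASSIGN_MASK = 0x0100 | 0x0001

def decode_granted_access(value: str) -> list[str]:
    """Expand a Sysmon GrantedAccess string such as '0x101' into right names."""
    mask = int(value, 16)
    return [name for bit, name in PROCESS_RIGHTS.items() if mask & bit]

def could_assign_to_job(value: str) -> bool:
    """True when the handle carries every right job assignment requires."""
    return int(value, 16) & JOB_ASSIGN_MASK == JOB_ASSIGN_MASK

print(decode_granted_access("0x101"))
print(could_assign_to_job("0x1fffff"), could_assign_to_job("0x1010"))
```

Decoding the mask rather than substring-matching it avoids confusing, for example, 0x1010 (a common read-only query) with a handle that can actually be assigned to a job.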
| ETW Provider | What It Logs |
|---|---|
| Microsoft-Windows-Kernel-Process | Process/thread lifecycle; job assignments surface as ProcessSetJobObjectInformation events |
| Microsoft-Windows-Security-Auditing | Process creation (Event 4688 with command-line auditing) |
| Microsoft-Windows-Containers-CCG | Container credential guard events in server silos |
| Microsoft-Windows-Hyper-V-Compute | HCS / silo creation and teardown |
Enable Audit Process Creation (auditpol /set /subcategory:"Process Creation" /success:enable) to produce Event 4688 with full command line, and Audit Object Access to capture named job-object handle creation as Events 4656 / 4663.
title: Suspicious Process Access Preceding Job Quota Assignment
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10 # Sysmon ProcessAccess
    GrantedAccess:
      - '0x1fffff' # PROCESS_ALL_ACCESS
      - '0x101' # PROCESS_SET_QUOTA | PROCESS_TERMINATE (job assignment)
    TargetImage|contains: '\lsass.exe'
  condition: selection
level: high
Hardening guidance:
- Apply JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE in every sandbox so process trees are reaped on handle loss.
- Deny JOB_OBJECT_LIMIT_BREAKAWAY_OK unless explicitly required; it is a direct escape vector.
- Combine job limits with Integrity Levels and AppContainer; jobs do not restrict file or registry access.
- For hostile workloads prefer Hyper-V isolation: controls apply to both the VM and the inner job object.
- Monitor wcifs.sys activity in server-silo environments; it enforces filesystem isolation and is a known escape surface.
- Audit named job creation under \Sessions\<N>\BaseNamedObjects\ with WinObj and Sysmon object/pipe events as a proxy.
12. MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Native API | T1106 | ETW Kernel-Process job-assignment events; underpins all job/silo API use |
| Process Injection | T1055 | Sysmon Event ID 10; handle access to constrained process groups |
| Impair Defenses: Disable/Modify Tools | T1562.001 | UI-limit flags blocking security tooling; behavioural EDR telemetry |
| Escape to Host | T1611 | wcifs.sys and Hyper-V-Compute ETW; primary silo/container-escape mapping |
| Create or Modify System Process | T1543 | Sysmon Event ID 1; persistent processes wrapped in jobs |
| Execution Guardrails | T1480 | Behavioural analysis of sandbox-aware payloads keyed to job limits |
Verify current technique versions and sub-techniques at https://attack.mitre.org before publication.
13. Tools for Job and Silo Analysis
| Tool | Description | Link |
|---|---|---|
| Process Explorer | View per-process job membership and limits | sysinternals |
| Process Hacker | Inspect job tab, member processes, and quotas | processhacker.sourceforge.io |
| WinObj | Browse named job objects and silo namespace roots | sysinternals |
| WinDbg | !job, dt nt!_EJOB, _ESERVERSILO_GLOBALS inspection | microsoft.com |
| Process Monitor | Observe wcifs.sys and registry-hive container activity | sysinternals |
| ETW (logman / wevtutil) | Capture Kernel-Process and Hyper-V-Compute events | microsoft.com |
Summary
- Job objects group processes into a single managed unit with enforceable CPU, memory, network, and process-count limits, all anchored on the kernel EJOB object.
- Limits are applied through SetInformationJobObject using JOBOBJECT_BASIC, EXTENDED, CPU_RATE, NET_RATE, and NOTIFICATION structures; nesting (Windows 8+) tightens to the most restrictive ancestor.
- Silos extend jobs via the JobFlags silo bit, adding a private object-namespace root; server silos (_ESERVERSILO_GLOBALS) back Windows containers and share the host kernel.
- Abuse spans sandbox-aware keying, BREAKAWAY_OK escapes, UI-limit tool blocking, and server-silo container escape (T1611).
- Detect via Sysmon Event ID 1/10, Kernel-Process and Hyper-V-Compute ETW, and Event 4688 auditing; prefer Hyper-V isolation plus KILL_ON_JOB_CLOSE for containment.
Windows Scheduler Internals: Priority Levels, Quantum, and Thread Selection
Objective: Understand how the Windows kernel selects, preempts, and rotates threads — the 32-level priority model, dispatcher ready queues, quantum accounting, boost/decay logic, and the multiprocessor dispatch path — so defenders can baseline normal scheduling behavior and detect attacker manipulation of priority and affinity.
1. The Scheduling Contract: Threads, Not Processes
Windows schedules threads, not processes. Every executable unit of work is represented by a KTHREAD (the Thread Control Block embedded in ETHREAD.Tcb), and the scheduler operates exclusively against that structure. A process supplies the address space, the base priority class, the quantum reset value, and the affinity mask — but it never itself runs on a CPU.
Scheduling is preemptive and priority-based with round-robin at the highest priority. Two rules dominate:
- The thread with the highest priority in the Ready state always wins.
- If a running thread has a lower priority than a newly Ready thread, the running thread is immediately preempted at the next dispatch point.
Quantum only matters as a tiebreaker between threads of the same highest priority — it does not arbitrate across priority levels.
2. The 32-Level Priority Model and Priority Classes
Priorities range from 0 (zero-page thread only) to 31 (highest real-time). The space splits into two bands with very different semantics.
| Range | Type | Description |
|---|---|---|
| 0 | Zero-page thread | Reserved for the memory zero-page thread |
| 1–15 | Dynamic (variable) | Normal user-mode threads; subject to boost/decay |
| 16–31 | Real-time | Fixed priorities; no boost, no decay; drivers and RT tasks |
Win32 exposes two functions to set scheduling parameters: SetPriorityClass on the process and SetThreadPriority on the thread. The two combine to produce the thread’s base priority in the kernel.
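| SetPriorityClass constant | Class | Base priority range |
|---|---|---|
| IDLE_PRIORITY_CLASS | Idle | 2–6 |
| BELOW_NORMAL_PRIORITY_CLASS | Below Normal | 4–8 |
| NORMAL_PRIORITY_CLASS | Normal | 6–10 |
| ABOVE_NORMAL_PRIORITY_CLASS | Above Normal | 8–12 |
| HIGH_PRIORITY_CLASS | High | 11–15 |
| REALTIME_PRIORITY_CLASS | Real-time | 16–31 |
Ranges for the five dynamic classes span THREAD_PRIORITY_LOWEST to THREAD_PRIORITY_HIGHEST; THREAD_PRIORITY_IDLE and THREAD_PRIORITY_TIME_CRITICAL map to 1 and 15 in those classes, and to 16 and 31 in the real-time class.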
Crossing into the real-time band (>=16) requires the SeIncreaseBasePriorityPrivilege privilege. NT-native equivalents are NtSetInformationThread (information class ThreadBasePriority = 3) and ZwSetInformationProcess.
// Pin this process and one of its threads to real-time scheduling.
SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS);
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
// Base priority now sits at 31 — preempts essentially everything in user mode.
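How the process class and the thread's relative level combine into a base priority can be sketched as a lookup plus an offset. This is an illustrative model based on the mapping in Microsoft's "Scheduling Priorities" documentation, not kernel code; the dictionary and function names are hypothetical.

```python
# Base priority of a THREAD_PRIORITY_NORMAL thread in each class.
CLASS_NORMAL_BASE = {
    "IDLE": 4,
    "BELOW_NORMAL": 6,
    "NORMAL": 8,
    "ABOVE_NORMAL": 10,
    "HIGH": 13,
    "REALTIME": 24,
}

THREAD_PRIORITY_IDLE = -15
THREAD_PRIORITY_TIME_CRITICAL = 15

def base_priority(priority_class: str, relative: int) -> int:
    """Combine a priority class with a SetThreadPriority level."""
    realtime = priority_class == "REALTIME"
    # The two saturation values ignore the class midpoint entirely.
    if relative == THREAD_PRIORITY_IDLE:
        return 16 if realtime else 1
    if relative == THREAD_PRIORITY_TIME_CRITICAL:
        return 31 if realtime else 15
    low, high = (16, 31) if realtime else (1, 15)
    return max(low, min(high, CLASS_NORMAL_BASE[priority_class] + relative))

print(base_priority("NORMAL", 0))
print(base_priority("REALTIME", THREAD_PRIORITY_TIME_CRITICAL))
```

The second call reproduces the C snippet above: REALTIME_PRIORITY_CLASS with THREAD_PRIORITY_TIME_CRITICAL lands at 31.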
3. Key Kernel Structures
Three structures carry the scheduler’s state: _KTHREAD per thread, _KPROCESS per process, and _KPRCB per logical processor.
_KTHREAD (Thread Control Block)
typedef struct _KTHREAD {
DISPATCHER_HEADER Header; // +0x000 dispatcher object header
// ...
ULONGLONG QuantumTarget; // +0x020 quantum expiration target
PVOID InitialStack; // +0x028 top of kernel stack
// ...
volatile UCHAR State; // Ready/Running/Waiting/...
BOOLEAN Preempted;
UCHAR DeferredProcessor;
SCHAR Priority; // current (dynamic) priority
ULONG WaitTime;
LIST_ENTRY WaitListEntry;
SINGLE_LIST_ENTRY SwapListEntry;
KSPIN_LOCK ThreadLock;
} KTHREAD, *PKTHREAD;
The embedded DISPATCHER_HEADER is the same header found at the top of every waitable kernel object and is what ties the thread into wait queues.
_KPRCB (Kernel Processor Control Block)
Each logical processor has a KPCR; inside it sits a KPRCB carrying that CPU’s scheduling state.
typedef struct _KPRCB {
// ...
PKTHREAD CurrentThread; // executing thread on this CPU
PKTHREAD NextThread; // pending preemption candidate
PKTHREAD IdleThread; // per-CPU idle thread
LIST_ENTRY ReadyListHead[32]; // dispatcher ready queues (per priority); named DispatcherReadyListHead in public symbols
ULONG ReadySummary; // bitmask of non-empty ready queues
// ...
} KPRCB, *PKPRCB;
_KPROCESS (Process Control Block)
Embedded as EPROCESS.Pcb, it provides the per-process scheduling defaults:
| Field | Purpose |
|---|---|
| BasePriority | Process base priority; seeds new threads |
| QuantumReset | Quantum value assigned to new threads |
| ThreadListHead | Doubly-linked list of all _KTHREADs in the process |
| ReadyListHead | Ready-but-swapped-out threads |
4. Dispatcher Ready Queues and ReadySummary
The Dispatcher Ready Queue is the per-CPU array KPRCB.ReadyListHead[32] — one LIST_ENTRY per priority level. Each non-empty entry is a FIFO of KTHREAD structures in the Ready state.
To avoid scanning all 32 queues, the kernel maintains a 32-bit ReadySummary bitmask: bit n is set when ReadyListHead[n] is non-empty. The dispatcher then selects the next thread in O(1):
// Conceptual scheduler inner loop (pseudo-code; not a real symbol).
ULONG mask = Prcb->ReadySummary;
if (mask) {
ULONG idx;
_BitScanReverse(&idx, mask); // highest set bit = top priority
PKTHREAD next = CONTAINING_RECORD(
RemoveHeadList(&Prcb->ReadyListHead[idx]),
KTHREAD, WaitListEntry);
if (IsListEmpty(&Prcb->ReadyListHead[idx]))
Prcb->ReadySummary &= ~(1u << idx);
return next;
}
return Prcb->IdleThread;
5. Quantum Mechanics
A quantum is the slice of CPU time a thread is allowed to consume before the scheduler considers rotating it. WMI exposes two relevant properties: QuantumLength (clock ticks per quantum) and QuantumType (fixed vs. variable). Windows client SKUs default to variable quantum, giving the foreground process longer slices; server SKUs default to fixed long quantum to favor batch throughput.
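Internally, quantum is tracked in units of 3 per clock tick. With default settings, a foreground thread on a client SKU gets 18 units (6 ticks, versus 6 units for background threads), and threads on a server SKU get 36 units (12 ticks). On current builds the accounting is cycle-based: KTHREAD.QuantumTarget holds the cycle target, the kernel compares the thread’s consumed cycles against it at each clock tick, and on expiry transfers control to KiQuantumEnd().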
The foreground boost is governed by the registry value:
HKLM\SYSTEM\CurrentControlSet\Control\PriorityControl\Win32PrioritySeparation
The lowest six bits encode foreground-vs-background quantum behavior; bits 0–1 specifically choose the foreground boost level (0 none, 1 medium, 2 high). The kernel mirrors this into the global PsPrioritySeparation.
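The six-bit layout can be unpacked with a few shifts. The sketch below is a hedged illustration: the field encodings (bits 4–5 for short/long, bits 2–3 for variable/fixed, 0 or 3 meaning "SKU default") and the quantum-unit table follow the description in Windows Internals as best recalled, the clamp of a separation value of 3 down to 2 is an assumption, and `decode_priority_separation` is a hypothetical helper. Verify against your build before relying on the numbers.

```python
# Quantum units (3 units = 1 clock tick), indexed by separation 0..2.
QUANTUM_TABLE = {
    ("short", "variable"): (6, 12, 18),
    ("long", "variable"): (12, 24, 36),
    ("short", "fixed"): (18, 18, 18),
    ("long", "fixed"): (36, 36, 36),
}

def decode_priority_separation(value: int, server: bool = False) -> dict:
    """Split a Win32PrioritySeparation value into its three 2-bit fields."""
    length_bits = (value >> 4) & 0b11      # 1 = long, 2 = short, else default
    type_bits = (value >> 2) & 0b11        # 1 = variable, 2 = fixed, else default
    separation = min(value & 0b11, 2)      # assumption: 3 behaves like 2

    length = {1: "long", 2: "short"}.get(length_bits, "long" if server else "short")
    kind = {1: "variable", 2: "fixed"}.get(type_bits, "fixed" if server else "variable")
    table = QUANTUM_TABLE[(length, kind)]
    return {
        "length": length,
        "type": kind,
        "separation": separation,
        "background_units": table[0],
        "foreground_units": table[separation],
    }

print(decode_priority_separation(0x26))   # the "Programs" setting
print(decode_priority_separation(0x18))   # the "Background services" setting
```

A defender reviewing a registry-change alert can run the old and new values through a decoder like this to see whether the edit actually shortens foreground quanta or merely restates the default.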
Internal scheduler routines you will see in symbols:
| Function | Purpose |
|---|---|
| KiQuantumEnd | Invoked at clock interrupt when quantum expires |
| KiSelectNextThread | Selects next Ready thread for the current CPU |
| KiDeferredReadyThread | Places a thread in DeferredReady before final dispatch |
| KxQueueReadyThread | Inserts a thread into the per-CPU ready queue |
| KiReadyThread | Transitions a thread to the Ready state |
| KiSwapThread / KiSwapContext | Performs the actual context switch |
6. Thread Selection: The Dispatch Path
A typical preemption follows this path:
- Clock interrupt fires on the local CPU.
- KiQuantumEnd() decrements KTHREAD.Quantum; if it has reached zero, the thread is moved out of Running.
- KiSelectNextThread() consults KPRCB.ReadySummary to find the highest non-empty queue.
- The chosen thread is removed from ReadyListHead[idx] and routed through KiDeferredReadyThread().
- KxQueueReadyThread() places the preempted thread back into ReadyListHead[oldPrio] (FIFO tail) so round-robin holds within its level.
- KiSwapThread() → KiSwapContext() saves outgoing register state, loads the incoming thread’s stack and registers, and returns into the new thread.
If a wake event makes a higher-priority thread Ready while another thread is Running, the dispatcher instead writes the candidate into KPRCB.NextThread, raises an IPI on the target CPU, and the preemption fires on return-from-interrupt — without waiting for quantum expiry.
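The selection and requeue steps of this path can be exercised in a small user-mode model. This is a sketch under simplifying assumptions (one CPU, no affinity, no boosts); `ReadyQueues` and its methods are hypothetical names, not kernel symbols.

```python
from collections import deque

class ReadyQueues:
    """Toy model of one CPU's 32 ready queues plus the summary bitmask."""

    def __init__(self):
        self.queues = [deque() for _ in range(32)]
        self.summary = 0  # bit n set <=> queues[n] is non-empty

    def make_ready(self, thread, priority):
        # Insert at the FIFO tail so equal-priority threads round-robin.
        self.queues[priority].append(thread)
        self.summary |= 1 << priority

    def select_next(self):
        # Highest set bit = highest non-empty priority level.
        if not self.summary:
            return None  # caller would run the idle thread
        idx = self.summary.bit_length() - 1
        thread = self.queues[idx].popleft()
        if not self.queues[idx]:
            self.summary &= ~(1 << idx)
        return thread, idx

rq = ReadyQueues()
rq.make_ready("A", 8)
rq.make_ready("B", 8)
rq.make_ready("C", 13)
print(rq.select_next())   # the priority-13 thread wins
print(rq.select_next())   # then the first priority-8 thread
rq.make_ready("A", 8)     # quantum end: A goes back to the tail
print(rq.select_next())   # so B runs before A gets another turn
```

The model shows both rules from section 1 at once: the higher-priority thread is always chosen first, and the quantum only rotates threads within a single level.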

7. Priority Boosts and Decay
Dynamic-band threads (1–15) do not stay at their base priority. The kernel temporarily boosts them in response to events and decays the boost as they consume CPU.
| Event | Boost |
|---|---|
| I/O completion (keyboard / mouse) | +6 |
| I/O completion (disk / network) | +1 |
| Foreground window activation | controlled by PsPrioritySeparation |
| Wait satisfied on executive event | +1 |
| Starvation avoidance (Balance Set Manager) | up to 15 for one quantum |
| Decay (CPU-bound thread at quantum end) | −1 toward base |
The Balance Set Manager (KeBalanceSetManager) periodically scans ready queues and elevates threads that have been Ready but never running for ~4 seconds to priority 15 for a single quantum — preventing indefinite starvation by higher-priority work. Real-time threads (16–31) are never boosted or decayed; their priority is exactly what was set.
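The boost-then-decay behaviour in the table can be traced with simple arithmetic. This sketch assumes the boost is added to the base priority, capped at 15, and then decays by one level per expired quantum; both helper names are hypothetical.

```python
def apply_boost(base: int, boost: int) -> int:
    """Return the boosted priority; real-time threads (16-31) are untouched."""
    if base >= 16:
        return base
    return min(15, base + boost)

def decay_steps(current: int, base: int) -> list[int]:
    """Priority after each successive quantum end until base is reached."""
    steps = []
    while current > base:
        current -= 1
        steps.append(current)
    return steps

boosted = apply_boost(8, 6)          # keyboard/mouse I/O completion
print(boosted, decay_steps(boosted, 8))
print(apply_boost(24, 6))            # real-time thread: no change
```

For a base-8 thread woken by keyboard input, the trace runs from 14 back down to 8 over six quanta, which is why a CPU-bound thread cannot hold a boost for long.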
8. Multiprocessor Scheduling, Affinity, and NUMA
Each CPU has its own ready queues, so dispatch decisions are mostly local. To preserve cache and NUMA locality, the scheduler picks an ideal processor per thread and prefers to dispatch on that CPU, falling back to other CPUs in the thread’s affinity mask when the ideal is busy.
// Pin a worker thread to CPU 2, with CPU 2 as its ideal processor.
DWORD_PTR mask = (DWORD_PTR)1 << 2;
SetThreadAffinityMask(hThread, mask);
SetThreadIdealProcessor(hThread, 2);
For >64 logical processors, threads belong to processor groups, set via SetThreadGroupAffinity. Kernel-mode equivalents are KeSetSystemAffinityThread and KeSetIdealProcessorThread. Misconfigured affinity is a real performance and detection hazard: a thread pinned off-node walks remote memory and pollutes another CPU’s cache.
9. Thread States: The Full State Machine
The KTHREAD_STATE enum tracks every transition. The values you will see in KTHREAD.State:
| State | Meaning |
|---|---|
| Initialized | Thread structure created, not yet schedulable |
| Ready | Schedulable; sitting on ReadyListHead[priority] |
| Standby | Selected as KPRCB.NextThread, about to run |
| Running | Currently executing on a CPU |
| Waiting | Blocked on a dispatcher object |
| Transition | Wait satisfied, but kernel stack is paged out |
| DeferredReady | Will be made Ready on a specific CPU |
| Terminated | Final state before structure teardown |
A normal cycle looks like Initialized → Ready → Standby → Running → Waiting → Ready …. KPRCB.NextThread is non-NULL exactly while a target CPU has a Standby thread queued.
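When reading a raw State byte out of a debugger or memory dump, a lookup table saves a trip to the symbol file. The numeric values below are the commonly published ones for the _KTHREAD_STATE enumeration and should be treated as an assumption; confirm them with dt nt!_KTHREAD_STATE on the target build. `KThreadState` and `describe_state` are hypothetical names.

```python
from enum import IntEnum

class KThreadState(IntEnum):
    """Commonly published _KTHREAD_STATE values (verify per build)."""
    Initialized = 0
    Ready = 1
    Running = 2
    Standby = 3
    Terminated = 4
    Waiting = 5
    Transition = 6
    DeferredReady = 7

def describe_state(raw: int) -> str:
    """Map a raw KTHREAD.State byte to a name, tolerating unknown values."""
    try:
        return KThreadState(raw).name
    except ValueError:
        return f"Unknown({raw})"

print(describe_state(5))
print(describe_state(42))
```

Note that the numeric order differs from the logical cycle: Running (2) sorts before Standby (3), so never infer a transition from the integer alone.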

10. Observing the Scheduler with WinDbg and ETW
Live kernel inspection in WinDbg:
0: kd> !pcr ; current processor control region
0: kd> !prcb ; current processor control block
0: kd> dt nt!_KPRCB CurrentThread NextThread ReadySummary @$prcb
0: kd> dt nt!_KTHREAD Priority Quantum State Preempted @$thread
0: kd> !ready ; all ready threads, sorted by priority
0: kd> !thread <addr> 1f ; full thread state including stack
To walk the ready list for a single priority level (public symbols name the KPRCB array DispatcherReadyListHead):
0: kd> dx -r1 ((nt!_KPRCB*)@$prcb)->DispatcherReadyListHead
0: kd> !list "-t nt!_KTHREAD.WaitListEntry.Flink -e -x \"dt nt!_KTHREAD @$extret Priority\" \
((nt!_KPRCB*)@$prcb)->DispatcherReadyListHead[15].Flink"
For live system-wide capture, use ETW:
xperf -on PROC_THREAD+LOADER+CSWITCH+DISPATCHER -stackwalk CSwitch
xperf -d sched.etl
The primary providers carrying scheduler telemetry:
| Provider | GUID | Key events |
|---|---|---|
| Microsoft-Windows-Kernel-Process | {22FB2CD6-0E7B-422B-A0C7-2FAD1FD0E716} | Process and thread start/stop, image load |
| Thread event class (classic MOF, emitted through the NT Kernel Logger) | {3D6FA8D1-FE05-11D0-9DDA-00C04FD7BA7C} | CSwitch (event type 36), ReadyThread (event type 50), thread create/terminate, priority change |
| NT Kernel Logger | {9E814AAD-3204-11D2-9A82-006008A86939} | Session that enables the CSWITCH and DISPATCHER groups |
A user-mode helper to enumerate per-thread priority without OpenProcess:
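import ctypes
from ctypes import wintypes
ntdll = ctypes.WinDLL("ntdll")
# SystemProcessInformation = 5; walks _SYSTEM_PROCESS_INFORMATION entries
# Each entry trails an array of SYSTEM_THREAD_INFORMATION with Priority/BasePriority.
STATUS_INFO_LENGTH_MISMATCH = 0xC0000004
size = 1024 * 1024
while True:
    buf = ctypes.create_string_buffer(size)
    ret_len = wintypes.ULONG()
    status = ntdll.NtQuerySystemInformation(5, buf, size, ctypes.byref(ret_len)) & 0xFFFFFFFF
    if status != STATUS_INFO_LENGTH_MISMATCH:
        break
    size = ret_len.value + 0x10000  # buffer too small: retry with headroom
if status != 0:
    raise OSError(f"NtQuerySystemInformation failed: 0x{status:08X}")
# parse _SYSTEM_PROCESS_INFORMATION + _SYSTEM_THREAD_INFORMATION here
11. Common Attacker Techniques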
Scheduler manipulation is rarely a standalone objective — it is a force multiplier for injection, evasion, and defense impairment.
| Technique | Description |
|---|---|
| Thread execution hijacking | OpenThread → SuspendThread → VirtualAllocEx + WriteProcessMemory → SetThreadContext → ResumeThread. Post-resume, attacker controls priority and CPU affinity of the hijacked thread. |
| Real-time priority abuse | Set malicious thread to THREAD_PRIORITY_TIME_CRITICAL under REALTIME_PRIORITY_CLASS (priority 31) to dominate the CPU and starve EDR scanners. Requires SeIncreaseBasePriorityPrivilege. |
| EDR/AV starvation | Open handles to defender process threads with THREAD_SET_INFORMATION and downgrade them via SetThreadPriority(THREAD_PRIORITY_IDLE) to delay real-time detection. |
| Affinity pinning for evasion | Pin malicious threads to a CPU not covered by an EDR’s per-CPU sampling profiler, or off-NUMA-node, to skew profilers and ETW stack walks. |
| Win32PrioritySeparation tampering | Modify the registry value to alter foreground boost behavior, hurting interactive defensive tooling. |
| Quantum throttling via Job Objects | Apply a CPU rate cap (JOBOBJECT_CPU_RATE_CONTROL_INFORMATION with JOB_OBJECT_CPU_RATE_CONTROL_ENABLE) to constrain a defender process’s CPU budget. |

12. Defensive Strategies & Detection
Scheduler-level abuse is observable through ETW context-switch streams, sensitive-privilege auditing, registry auditing, and process-access telemetry. Sysmon alone is insufficient — pair it with kernel ETW.
| Sysmon Event ID | Name | Relevance |
|---|---|---|
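| 1 | Process Create | Captures command line, integrity level, and parent lineage; priority class is not recorded, so use it to attribute later changes to a process |
| 8 | CreateRemoteThread | Cross-process thread creation; often precedes priority manipulation |
| 10 | ProcessAccess | Process-handle opens (GrantedAccess) that commonly precede thread manipulation; Sysmon records process-level access, so thread-handle rights such as THREAD_SET_INFORMATION need Security events 4656 / 4663 or EDR telemetry |
| 13 | RegistryValueSet | Modification of Win32PrioritySeparation and other PriorityControl values |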
Critical Windows audit events:
- 4673: Sensitive Privilege Use. Catches SeIncreaseBasePriorityPrivilege invocation, required for real-time priority.
- 4656 / 4663: Handle/Object Access. Catches handles opened to thread objects with THREAD_SET_INFORMATION.
- 4657: Registry value modified. Catches Win32PrioritySeparation changes.
- 4688: Process creation (with command-line auditing enabled).
Conceptual Sigma rule for unexpected real-time priority use:
title: Sensitive Privilege Use - SeIncreaseBasePriorityPrivilege from Non-System Process
logsource:
  product: windows
  service: security
detection:
  selection:
    EventID: 4673
    PrivilegeList|contains: 'SeIncreaseBasePriorityPrivilege'
  filter_system:
    SubjectUserSid:
      - 'S-1-5-18' # LocalSystem
      - 'S-1-5-19' # LocalService
      - 'S-1-5-20' # NetworkService
  condition: selection and not filter_system
level: high
Hardening checklist:
- Restrict SeIncreaseBasePriorityPrivilege via Group Policy → User Rights Assignment to only the accounts that require it.
- Audit Win32PrioritySeparation with Sysmon Event ID 13 or registry SACL → Event ID 4657.
- Baseline CSwitch priority distributions via ETW; alert on sustained user-mode threads scheduled at priority ≥ 16 outside an allowlist.
- Deploy EDR that registers PsSetCreateThreadNotifyRoutine and ObRegisterCallbacks to observe thread creation, handle stripping, and priority changes in kernel.
- Enclose untrusted code in Job Objects with JobObjectCpuRateControlInformation and basic UI restrictions to prevent it from starving other processes.
13. Tools for Scheduler Analysis
| Tool | Description | Link |
|---|---|---|
| WinDbg (kernel) | !ready, !thread, !pcr, !prcb, dt nt!_KTHREAD/_KPRCB for live scheduler inspection | learn.microsoft.com |
| Windows Performance Recorder / xperf | Captures CSwitch, ReadyThread, DISPATCHER ETW events with stack walks | learn.microsoft.com |
| Windows Performance Analyzer | Visualizes CPU usage, context switches, and per-thread priority timelines | learn.microsoft.com |
| Process Hacker / System Informer | Live per-thread state, base priority, dynamic priority, ideal CPU, affinity mask | systeminformer.sourceforge.io |
| Process Explorer | Per-thread CPU, priority class, kernel/user stacks | sysinternals.com |
| Process Monitor | Captures Process Create, registry writes (Win32PrioritySeparation) | sysinternals.com |
| Sysmon | Events 1, 8, 10, 13 for thread creation, cross-process access, registry edits | sysinternals.com |
| Volatility 3 | Offline thread enumeration (windows.threads) and priority analysis from memory dumps | volatilityfoundation.org |
14. MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Process Injection | T1055 | Sysmon 10 (ProcessAccess), ETW thread create with foreign-process parentage |
| Thread Execution Hijacking | T1055.003 | Sysmon 10 with THREAD_SET_INFORMATION / THREAD_SET_CONTEXT access; SuspendThread/ResumeThread pairs in EDR telemetry |
| Scheduled Task / Job | T1053 | Audit 4698 for task creation; monitor Job Object CPU-rate limits applied to defensive processes |
| Impair Defenses: Disable or Modify Tools | T1562.001 | Sysmon 10 against AV/EDR lsass.exe, MsMpEng.exe with THREAD_SET_INFORMATION; priority drops via ETW Microsoft-Windows-Kernel-Thread |
Note: ATT&CK does not currently track “Thread Priority Manipulation” as a standalone technique. Treat priority abuse as a sub-mechanism of T1055.003 and T1562.001.
15. Summary
- Windows is a preemptive, priority-based thread scheduler with 32 levels and per-CPU ready queues: priority always wins, quantum only rotates equal-priority threads.
- The dispatcher uses KPRCB.ReadySummary plus ReadyListHead[32] to pick the next thread in O(1) via highest-set-bit scan.
- Quantum is tracked in 3-unit-per-tick increments on KTHREAD.QuantumTarget, with foreground boost governed by Win32PrioritySeparation / PsPrioritySeparation.
- Dynamic threads (1–15) are subject to I/O, foreground, and starvation boosts plus decay; real-time threads (16–31) are not.
- Attackers abuse the scheduler via thread hijacking, real-time priority escalation, EDR starvation, and affinity pinning. Detect via ETW CSwitch, Sysmon 8/10/13, and Event ID 4673 for SeIncreaseBasePriorityPrivilege.
APCs: Asynchronous Procedure Calls and Thread Hijacking Surface
Objective: Understand the Windows Asynchronous Procedure Call mechanism from the kernel up (the KAPC / KAPC_STATE structures, the dispatch path through KiInsertQueueApc and KiDeliverApc, the alertable-wait requirement, and the three abuse variants used for thread hijacking and process injection: classic, early-bird, and special user APC) and detect them with Sysmon, ETW-TI, and audit policy.
1. APC Fundamentals — What the OS Actually Uses APCs For
An Asynchronous Procedure Call is a function that executes asynchronously in the context of a specific thread. When the kernel queues an APC, it raises a software interrupt and arranges for the routine to run the next time that thread is dispatched. Every thread has its own APC queue — APCs are inherently thread-targeted, which is exactly why offensive tooling loves them.
The OS itself relies on APCs for normal work:
- I/O completion: ReadFileEx, WriteFileEx, and SetWaitableTimer deliver their completion callback via a user-mode APC queued back to the issuing thread.
- File-system filter callbacks: normal kernel APCs are widely used by file systems and minifilters.
- Wait abortion: queuing a user APC against a thread in an alertable wait satisfies the wait with STATUS_USER_APC.
Understanding APCs means understanding three things in sequence: who can queue them, when they fire, and what the thread looks like at the moment they fire.
2. The Three Flavours of APCs
APCs differ by IRQL and by who is allowed to queue them. The kernel maintains distinct semantics for each.
| Type | IRQL | Notes |
|---|---|---|
| Special Kernel APC | APC_LEVEL | Runs in kernel mode at IRQL APC_LEVEL; preempts user-mode code and kernel-mode code executing at PASSIVE_LEVEL. Used by the OS for operations such as I/O request completion. |
| Normal Kernel APC | PASSIVE_LEVEL | Runs in kernel mode at PASSIVE_LEVEL; preempts all user-mode code, including user APCs. Generally used by file systems and file-system filter drivers. |
| User-mode APC | PASSIVE_LEVEL | Generated by an application. The target thread must be in an alertable state for a user-mode APC to run. |
Unlike deferred procedure calls (DPCs), which run in arbitrary thread context, an APC always executes inside a specific thread’s context — that property is what makes APCs both useful for I/O completion and dangerous as an injection primitive.

3. Kernel Structures: KAPC, KAPC_STATE, KTHREAD
A queued APC is represented in the kernel by a KAPC object. The thread tracks its pending APCs via a KAPC_STATE embedded in KTHREAD.
// Conceptual layout — field names are illustrative; confirm against the
// target Windows build with `dt nt!_KAPC` / `dt nt!_KAPC_STATE` in WinDbg.
typedef struct _KAPC {
UCHAR Type;
UCHAR SpareByte0;
UCHAR Size;
UCHAR SpareByte1;
ULONG SpareLong0;
struct _KTHREAD *Thread;
LIST_ENTRY ApcListEntry;
PKKERNEL_ROUTINE KernelRoutine;
PKRUNDOWN_ROUTINE RundownRoutine;
PKNORMAL_ROUTINE NormalRoutine;
PVOID NormalContext;
PVOID SystemArgument1;
PVOID SystemArgument2;
CCHAR ApcStateIndex;
KPROCESSOR_MODE ApcMode;
BOOLEAN Inserted;
} KAPC, *PKAPC;
typedef struct _KAPC_STATE {
LIST_ENTRY ApcListHead[2]; // [0] = kernel APCs, [1] = user APCs
struct _KPROCESS *Process;
BOOLEAN KernelApcInProgress;
BOOLEAN KernelApcPending;
BOOLEAN UserApcPending;
// SpecialUserApcPending was added later for RS5+ Special User APCs.
} KAPC_STATE, *PKAPC_STATE;
Key fields the dispatcher and attackers both care about:
- KAPC.NormalRoutine: the function the thread will eventually execute.
- KAPC.NormalContext, SystemArgument1, SystemArgument2: arguments passed to NormalRoutine.
- KAPC.ApcMode: KernelMode vs UserMode, controls which queue and which delivery path.
- KAPC_STATE.ApcListHead[2]: two doubly-linked lists; index 0 holds kernel-mode APCs, index 1 holds user-mode APCs.
- KAPC_STATE.UserApcPending: set to TRUE when a user APC is queued and the thread is in an alertable wait; this is the signal that breaks the wait with STATUS_USER_APC.
4. The Alertable Wait Requirement
A user-mode APC does not fire whenever the kernel wants — it fires only when the target thread is willing to be interrupted. A thread enters an alertable state by calling one of:
- SleepEx()
- SignalObjectAndWait()
- MsgWaitForMultipleObjectsEx()
- WaitForMultipleObjectsEx()
- WaitForSingleObjectEx()
with the bAlertable parameter set to TRUE (MsgWaitForMultipleObjectsEx takes the MWMO_ALERTABLE flag instead). Additionally, ReadFileEx, WriteFileEx, and SetWaitableTimer are themselves implemented using APCs as their completion-notification mechanism, so threads driving overlapped I/O routinely sit in alertable waits.
This alertable-state requirement is the single most important property to understand offensively and defensively:
- Offensively, it dictates target selection. Long-lived service threads in svchost.exe or explorer.exe that pump I/O are reliable targets; threads that never enter an alertable wait will never run a queued user APC.
- Defensively, it explains why the classic injection works against some processes and not others, and why attackers eventually moved to Special User APCs to remove the dependency entirely (§9).
5. Win32 → Native → Kernel Call Chain
Queuing a user APC traverses three layers.
| API / Symbol | Layer | Description |
|---|---|---|
| QueueUserAPC | Win32 (kernel32.dll) | Queues a user-mode APC to a target thread. |
| NtQueueApcThread | NT native (ntdll.dll) | Syscall used internally by QueueUserAPC to deliver the APC. |
| NtQueueApcThreadEx | NT native | Extended form; RS5 introduced Special User APCs queued by passing 1 as the reserve handle. |
| NtQueueApcThreadEx2 | NT native | Newer variant exposing both UserApcFlags and MemoryReserveHandle. |
| QueueUserAPC2 | kernelbase.dll | Wrapper that exposes Special User APCs to user code. |
| KeInsertQueueApc | Kernel | Attaches the initialized KAPC to the target thread’s queue. |
| KiDeliverApc | Kernel | Dispatches pending APCs at the kernel→user transition. |
| ntdll!RtlDispatchAPC | ntdll | Trampoline in user mode that calls the caller-supplied APCProc. |
An important internal detail: when you call QueueUserAPC(pfn, hThread, dwData), the function pointer ntdll actually hands to NtQueueApcThread is not your pfn — it is ntdll!RtlDispatchAPC, and your pfn is passed as a parameter. This is why call-stack-aware EDRs frequently see RtlDispatchAPC as the immediate caller of the suspicious user-mode routine.
The dispatch sequence for a user-mode APC:
- Caller obtains a thread handle with THREAD_SET_CONTEXT access.
- QueueUserAPC → NtQueueApcThread → kernel enters KiInsertQueueApc.
- KiInsertQueueApc checks whether the target is in an alertable wait with WaitMode == UserMode. If yes, it sets UserApcPending = TRUE and completes the wait with STATUS_USER_APC.
- On the kernel→user transition, KiDeliverApc redirects execution to ntdll!RtlDispatchAPC, which invokes the original APCProc.

6. Inspecting APC State in WinDbg
Read-only kernel introspection lets defenders and learners watch the structures the dispatcher mutates.
0: kd> !process 0 0 lsass.exe
0: kd> .process /r /p <EPROCESS>
0: kd> !thread <ETHREAD>
0: kd> dt nt!_KTHREAD <addr> ApcState
0: kd> dt nt!_KAPC_STATE <addr+offset>
+0x000 ApcListHead : [2] _LIST_ENTRY
+0x020 Process : Ptr64 _KPROCESS
+0x028 KernelApcInProgress : UChar
+0x029 KernelApcPending : UChar
+0x02a UserApcPending : UChar
0: kd> !list "-t nt!_KAPC.ApcListEntry.Flink -e -x \"dt nt!_KAPC @$extret\" <ApcListHead[1]>"
Walking ApcListHead[1] for any thread reveals every pending user APC: its NormalRoutine, NormalContext, and ApcMode. On a healthy thread you typically see nothing; finding NormalRoutine pointing into a private RX region inside a system process is a classic incident-response artifact.
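To make the list walk concrete, here is a minimal user-space sketch of what a debugger does when it follows Flink pointers and recovers each record from its embedded link. The types (list_entry, toy_kapc) and the address-range check are illustrative stand-ins, not the real kernel layouts; the "image range" is a hypothetical placeholder for whatever module ranges an analyst considers legitimate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal stand-ins for the kernel's list types (layout is illustrative). */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

typedef struct toy_kapc {
    int apc_mode;              /* 0 = kernel, 1 = user (illustrative)   */
    list_entry apc_list_entry; /* embedded link, like KAPC.ApcListEntry */
    uintptr_t normal_routine;  /* address the APC would run             */
} toy_kapc;

/* Recover the containing record from a pointer to its embedded link. */
#define TOY_CONTAINING_RECORD(ptr, type, field) \
    ((const type *)((const char *)(ptr) - offsetof(type, field)))

void toy_list_init(list_entry *head)
{
    head->flink = head;
    head->blink = head;
}

void toy_list_insert_tail(list_entry *head, list_entry *e)
{
    e->flink = head;
    e->blink = head->blink;
    head->blink->flink = e;
    head->blink = e;
}

/* Walk until the list wraps back to the head; count entries whose routine
 * lies outside the half-open range [image_lo, image_hi). */
size_t toy_count_suspicious(const list_entry *head,
                            uintptr_t image_lo, uintptr_t image_hi)
{
    assert(head != NULL);
    size_t hits = 0;
    for (const list_entry *p = head->flink; p != head; p = p->flink) {
        const toy_kapc *apc =
            TOY_CONTAINING_RECORD(p, toy_kapc, apc_list_entry);
        if (apc->normal_routine < image_lo || apc->normal_routine >= image_hi)
            hits++;
    }
    return hits;
}
```

The subtraction inside TOY_CONTAINING_RECORD matters because the link sits partway into the record, so the link address and the record address differ.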
7. Classic APC Injection
The textbook variant. Every API call below is observable; the technique relies entirely on existing, documented APIs.
// Educational illustration of the API call chain only.
// No payload is included; `payload` is a placeholder used by defenders to
// recognize the pattern. Authorized testing only.
#include <windows.h>
#include <tlhelp32.h>
BOOL InjectViaAPC(DWORD pid, DWORD tid, const BYTE *payload, SIZE_T cb) {
HANDLE hProc = OpenProcess(
PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_QUERY_INFORMATION,
FALSE, pid);
if (!hProc) return FALSE;
HANDLE hThread = OpenThread(THREAD_SET_CONTEXT, FALSE, tid);
if (!hThread) { CloseHandle(hProc); return FALSE; }
LPVOID remote = VirtualAllocEx(hProc, NULL, cb,
MEM_COMMIT | MEM_RESERVE,
PAGE_EXECUTE_READWRITE);
WriteProcessMemory(hProc, remote, payload, cb, NULL);
// QueueUserAPC schedules execution; it fires only when the target
// thread enters an alertable wait.
QueueUserAPC((PAPCFUNC)remote, hThread, 0);
CloseHandle(hThread);
CloseHandle(hProc);
return TRUE;
}
Trigger conditions:
- The target thread (tid) must enter an alertable wait. In long-lived service hosts this happens routinely.
- The handle to the thread must carry THREAD_SET_CONTEXT. This is the most reliable single indicator: Sysmon EID 10 with a GrantedAccess mask covering THREAD_SET_CONTEXT against a high-value target image is the canonical detection (§12).
Notably, no new thread is created in the victim process — CreateRemoteThread is not called. This is exactly why APC injection evades Sysmon EID 8.
8. Early-Bird APC Injection
Classic injection has one weakness: you cannot predict when the victim thread will next become alertable. Early-bird removes the guesswork by injecting into a process you create yourself in a suspended state, then queuing the APC against the main thread before it has executed a single instruction.
// Educational pseudocode — illustrates API sequence, not payload.
STARTUPINFOA si = { sizeof(si) };
PROCESS_INFORMATION pi = { 0 };
CreateProcessA(NULL, "C:\\Windows\\System32\\notepad.exe", NULL, NULL,
FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi);
LPVOID remote = VirtualAllocEx(pi.hProcess, NULL, cb,
MEM_COMMIT | MEM_RESERVE,
PAGE_EXECUTE_READWRITE);
WriteProcessMemory(pi.hProcess, remote, payload, cb, NULL);
QueueUserAPC((PAPCFUNC)remote, pi.hThread, 0);
// Thread services its APC queue as part of initialization, *before*
// running the original entry point.
ResumeThread(pi.hThread);
Why it works: when a newly created thread starts, the kernel transitions into user mode through ntdll!LdrInitializeThunk, which drains the thread's pending user-APC queue as part of loader initialization (commonly attributed to an NtTestAlert call at the end of that path). Any user APC queued before ResumeThread is delivered during that early window, before the legitimate entry point runs.
This variant straddles two ATT&CK sub-techniques: it is APC injection (T1055.004) but it also resembles Thread Execution Hijacking (T1055.003) because the suspended-thread-then-redirect pattern is structurally the same primitive.

9. Special User APCs (RS5+): Bypassing the Alertable Requirement
Starting with Windows 10 RS5, the kernel introduced Special User APCs. The key behavioural change: these APCs are delivered with Mode == KernelMode to force a thread signal. The thread is interrupted mid-execution to run the special APC — the alertable-state requirement is gone.
They are queued via NtQueueApcThreadEx (passing 1 as the reserve handle) or through NtQueueApcThreadEx2, which exposes a flags field. kernelbase!QueueUserAPC2 is the documented Win32 wrapper.
// Conceptual signatures — confirm flag values and syscall semantics
// against the target SDK / Windows build before relying on them.
typedef NTSTATUS (NTAPI *pNtQueueApcThreadEx2)(
HANDLE ThreadHandle,
HANDLE UserApcReserveHandle, // optional reserve object
ULONG ApcFlags, // e.g. QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC
PVOID ApcRoutine,
PVOID SystemArgument1,
PVOID SystemArgument2,
PVOID SystemArgument3);
// Pseudocode dispatch — `Special User APC` interrupts a running thread
// without requiring it to be in SleepEx / WaitForSingleObjectEx.
pNtQueueApcThreadEx2 fn = (pNtQueueApcThreadEx2)
GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "NtQueueApcThreadEx2");
fn(hThread,
NULL,
QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC, // forces in-execution delivery
remote_routine,
NULL, NULL, NULL);
Internally the kernel sets SpecialUserApcPending (added to KAPC_STATE for this purpose) and arranges delivery at the next return-to-user-mode opportunity regardless of wait state. This is a meaningful escalation of the primitive: it converts APC injection from “wait until the thread cooperates” to “interrupt the thread now.”
10. Real-World Threat Actor Usage
APC injection is documented at the technique level rather than the family level here; defenders should treat it as a primitive that recurs across many tradecraft variants:
- DOUBLEPULSAR used kernel-mode APC injection to redirect user-mode threads from a kernel implant.
- Multiple commodity and APT families catalogued under MITRE T1055.004 employ classic user-APC injection against svchost.exe, explorer.exe, and other long-running hosts.
- The AtomBombing family of injection variants combines GlobalAddAtom / NtQueueApcThread to stage code through atom tables, then dispatch via APC.
- Recent research (Check Point’s Thread Name-Calling) chains thread-name primitives with APC dispatch to evade EDR userland hooks.
11. Common Attacker Techniques
| Technique | Description |
|---|---|
| Classic APC Injection | OpenProcess → OpenThread(THREAD_SET_CONTEXT) → VirtualAllocEx → WriteProcessMemory → QueueUserAPC. Fires when the target thread next enters an alertable wait. |
| Early-Bird APC | CreateProcess(CREATE_SUSPENDED) → write payload → QueueUserAPC → ResumeThread. APC fires during loader init, before the entry point. |
| Special User APC | NtQueueApcThreadEx / NtQueueApcThreadEx2 with QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC — interrupts the thread mid-execution; no alertable wait required. |
| Kernel APC injection from a driver | Malicious driver calls KeInsertQueueApc directly against a user thread (DOUBLEPULSAR class). Mitigated by HVCI / driver signing. |
| Atom-table staged APC (AtomBombing) | Payload bytes shuttled into target via atom tables, then dispatched with NtQueueApcThread. Evades naive memory-write detections. |
| Self-APC for unhooking / staging | Queue an APC to the current thread + SleepEx(0, TRUE) to execute code outside hooked call paths. |
12. Defensive Strategies & Detection
APC injection is deliberately quiet — it does not create a remote thread and so does not emit Sysmon EID 8. Detection therefore pivots on the handle-acquisition and memory-staging stages, plus dedicated ETW.
12.1 Sysmon
| Event ID | Name | Why It Matters Here |
|---|---|---|
| EID 10 | ProcessAccess | Captures the OpenThread/OpenProcess step. GrantedAccess masks covering THREAD_SET_CONTEXT (0x0010; 0x0018 when combined with THREAD_GET_CONTEXT) and PROCESS_VM_WRITE (0x0020) against high-value images are the strongest signal. |
| EID 8 | CreateRemoteThread | Will not fire for pure APC injection — but does fire for hybrid variants and is useful as a negative signal. |
| EID 1 | ProcessCreate | Detects CREATE_SUSPENDED parent/child pairs typical of Early-Bird. Combine with short process lifetimes. |
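Because GrantedAccess is a bit mask, substring matching on its hex text can miss supersets of the rights you care about. The following sketch shows a bitwise test instead. The constants are the values from winnt.h; the helper names parse_granted_access and mask_covers are hypothetical, and note that the same numeric bit means different things on a thread handle and on a process handle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Values as defined in winnt.h. 0x0008 is THREAD_GET_CONTEXT on a thread
 * handle but PROCESS_VM_OPERATION on a process handle. */
#define THREAD_GET_CONTEXT   0x0008u
#define THREAD_SET_CONTEXT   0x0010u
#define PROCESS_VM_OPERATION 0x0008u
#define PROCESS_VM_WRITE     0x0020u

/* Parse a GrantedAccess string such as "0x1fffff".
 * Returns 0 when the text is not a clean hexadecimal number. */
uint32_t parse_granted_access(const char *text)
{
    assert(text != NULL);
    char *end = NULL;
    unsigned long value = strtoul(text, &end, 16);
    if (end == text || *end != '\0')
        return 0;
    return (uint32_t)value;
}

/* True when every bit in `required` is present in `granted`. */
bool mask_covers(uint32_t granted, uint32_t required)
{
    return (granted & required) == required;
}
```

With this approach a mask such as 0x1fffff is recognized as covering PROCESS_VM_WRITE even though the text "0x0020" never appears in it.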
12.2 ETW — Microsoft-Windows-Threat-Intelligence
The Threat Intelligence ETW provider exposes a dedicated APC-injection sensor:
THREATINT_QUEUEUSERAPC_REMOTE_KERNEL_CALLER: logged by EtwTiLogInsertQueueUserApc / EtwTiLogQueueApcThread, invoked from inside KeInsertQueueApc. Introduced in Windows 10 version 1809.
Consumption requires a signed ELAM driver; the provider is reserved for AntiMalware-protected processes. In practice you receive this telemetry through your EDR vendor’s sensor.
12.3 Audit Policy
- Enable Detailed Tracking → Audit Process Access → Security log EIDs 4656 / 4663 on handle requests. Filter for Object Type = Thread with access masks including THREAD_SET_CONTEXT.
- Enable Audit Process Creation → EID 4688 with full command-line logging. Pair with CREATE_SUSPENDED heuristics where parent process behaviour permits inference.
12.4 Sigma Detection (Conceptual)
title: Suspicious Cross-Process Handle Acquisition Consistent With APC Injection
id: 00000000-0000-0000-0000-000000000000
status: experimental
logsource:
    product: windows
    service: sysmon
detection:
    selection_thread_ctx:
        EventID: 10
        GrantedAccess|contains:
            - '0x0018' # THREAD_SET_CONTEXT | THREAD_GET_CONTEXT
            - '0x1fffff' # PROCESS_ALL_ACCESS
        TargetImage|endswith:
            - '\lsass.exe'
            - '\svchost.exe'
            - '\explorer.exe'
            - '\winlogon.exe'
    selection_vm_write:
        EventID: 10
        GrantedAccess|contains: '0x0020' # PROCESS_VM_WRITE
    timeframe: 5s
    condition: selection_thread_ctx and selection_vm_write
falsepositives:
    - Endpoint security products and legitimate debuggers
level: high
12.5 Behavioural Heuristics
The fingerprint that hunts well: VirtualAllocEx (RWX) → WriteProcessMemory → NtQueueApcThread issued by the same source process within a short window. Even when individual calls are noisy, the ordering is rare in benign software.
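The ordering heuristic can be expressed as a small scan over time-sorted telemetry. This is a sketch under stated assumptions: the event record (toy_event), the event kinds, and the function name toy_has_apc_sequence are all hypothetical, standing in for whatever normalized events your pipeline produces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical normalized event kinds for this sketch. */
typedef enum {
    EV_ALLOC_RWX,  /* remote RWX allocation  */
    EV_WRITE_MEM,  /* remote memory write    */
    EV_QUEUE_APC,  /* APC queued to a thread */
    EV_OTHER
} toy_event_kind;

typedef struct toy_event {
    uint64_t time_ms;    /* timestamp in milliseconds */
    uint32_t source_pid; /* process issuing the call  */
    uint32_t target_pid; /* process being acted on    */
    toy_event_kind kind;
} toy_event;

/* Events must be sorted by time_ms. Returns true if one source performs
 * alloc -> write -> queue against the same target within window_ms. */
bool toy_has_apc_sequence(const toy_event *ev, size_t n, uint64_t window_ms)
{
    assert(ev != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ev[i].kind != EV_ALLOC_RWX)
            continue;
        int stage = 1; /* 1 = saw alloc, 2 = saw alloc then write */
        for (size_t j = i + 1; j < n; j++) {
            if (ev[j].time_ms - ev[i].time_ms > window_ms)
                break; /* outside the correlation window */
            if (ev[j].source_pid != ev[i].source_pid ||
                ev[j].target_pid != ev[i].target_pid)
                continue; /* unrelated actor or target */
            if (stage == 1 && ev[j].kind == EV_WRITE_MEM)
                stage = 2;
            else if (stage == 2 && ev[j].kind == EV_QUEUE_APC)
                return true;
        }
    }
    return false;
}
```

Requiring the order, not just the presence of the three calls, is what keeps the false-positive rate manageable; the window length is a tuning choice.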
12.6 PowerShell — Hunt for Suspicious ProcessAccess Masks
Get-WinEvent -LogName 'Microsoft-Windows-Sysmon/Operational' -FilterXPath @"
*[System[EventID=10]]
"@ |
Where-Object {
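# Property indexes assume the current Sysmon EID 10 schema
# (5 = SourceImage, 8 = TargetImage, 9 = GrantedAccess); verify against your Sysmon version.
$_.Properties[9].Value -match '0x0018|0x001f|0x1fffff' -and
$_.Properties[8].Value -match 'lsass\.exe|svchost\.exe|winlogon\.exe'
} |
Select-Object TimeCreated,
@{n='Source'; e={$_.Properties[5].Value}},
@{n='Target'; e={$_.Properties[8].Value}},
@{n='Access';e={$_.Properties[9].Value}}
12.7 Hardening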
| Mitigation | Description |
|---|---|
| Protected Process Light (PPL) | LSASS running as a protected process (RunAsPPL) blocks OpenThread(THREAD_SET_CONTEXT) from non-protected callers. |
| Credential Guard | Moves LSASS secrets into a VSM-isolated process, removing it as an APC target entirely. |
| HVCI / Code Integrity | Prevents unsigned kernel drivers from calling KeInsertQueueApc against arbitrary threads. |
| ASR rule 9e6c4e1f-7d60-472f-ba1a-a39ef669e4b2 | Blocks credential theft from LSASS; complements but does not directly block APC injection. |
| Minimize alertable waits in sensitive code | Avoid SleepEx(n, TRUE) and other alertable waits in privileged service threads unless required. |
| ETW-TI via EDR | Deploy AV/EDR with an ELAM driver to consume Microsoft-Windows-Threat-Intelligence events in real time. |

13. Tools for APC Analysis
| Tool | Description | Link |
|---|---|---|
| WinDbg | Walk KTHREAD.ApcState, dump KAPC entries via !list, inspect UserApcPending. | microsoft.com |
| Process Hacker | Per-thread inspection, including private RX allocations and thread call stacks indicative of injected code. | processhacker.sourceforge.io |
| Sysmon | EID 10 / 8 / 1 telemetry for the handle-open and process-creation halves of the chain. | sysinternals.com |
| Sysinternals handle.exe | Enumerate handles a suspect process holds (look for foreign Thread / Process handles). | sysinternals.com |
| Volatility 3 | Memory forensics: walk thread APC queues post-incident; identify injected RX regions. | volatilityfoundation.org |
| ETW Explorer / SilkETW | Inspect or subscribe to ETW providers (ETW-TI requires signed ELAM). | github.com |
| x64dbg | User-mode dynamic analysis of QueueUserAPC / RtlDispatchAPC call chains. | x64dbg.com |
14. MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Process Injection | T1055 | Behavioural sequence: cross-process handle with VM-write rights followed by APC queuing. |
| Process Injection: Asynchronous Procedure Call | T1055.004 | Sysmon EID 10 with THREAD_SET_CONTEXT; ETW-TI THREATINT_QUEUEUSERAPC_REMOTE_KERNEL_CALLER. |
| Thread Execution Hijacking | T1055.003 | Early-Bird variant: CREATE_SUSPENDED process + THREAD_SET_CONTEXT handle + early-window APC. |
T1055.004 is the primary mapping for this tutorial. The Early-Bird variant (§8) overlaps with T1055.003 because the suspended-thread + redirection structure is the same primitive — defenders should detect both.
Summary
- APCs are a legitimate kernel facility for thread-targeted asynchronous work, and that property is exactly what makes them a first-class injection primitive.
- The dispatch chain is QueueUserAPC → NtQueueApcThread → KiInsertQueueApc → KiDeliverApc → ntdll!RtlDispatchAPC → caller routine; every layer is observable.
- User APCs require an alertable wait; Early-Bird sidesteps this via CREATE_SUSPENDED, and Special User APCs (NtQueueApcThreadEx2 + QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC) eliminate the requirement entirely.
- APC injection deliberately evades Sysmon EID 8; detection pivots on EID 10 with THREAD_SET_CONTEXT (0x0010) and PROCESS_VM_WRITE (0x0020) against high-value targets, plus Microsoft-Windows-Threat-Intelligence ETW (EtwTiLogInsertQueueUserApc).
- Map to T1055.004 for classic / special-user APC, and additionally to T1055.003 for the Early-Bird suspended-thread variant; harden with PPL, Credential Guard, HVCI, and ETW-TI-consuming EDR.
DPCs: Deferred Procedure Calls and Interrupt Deferral
Objective: Understand how the Windows kernel uses Deferred Procedure Calls (DPCs) to move work out of high-IRQL interrupt service routines down to DISPATCH_LEVEL, covering the KDPC structure, IRQL mechanics, the full queue-to-callback lifecycle, threaded and timer DPCs, the DPC watchdog, and how defenders detect kernel-mode abuse of the DPC mechanism.
1. The Interrupt Deferral Problem
When a hardware device raises an interrupt, the kernel dispatches to an Interrupt Service Routine (ISR) running at DIRQL — a device IRQL higher than the scheduler itself. At that level the processor cannot wait, cannot touch pageable memory, and blocks all lower-priority interrupts on that CPU. An ISR that lingers degrades the entire system; the guidance is that ISRs should not run longer than 25 microseconds.
Windows therefore uses a two-phase interrupt model. The ISR does the minimum work needed to quiesce the device (acknowledge the interrupt, snapshot status), then schedules a Deferred Procedure Call to perform the heavier processing later, at a gentler IRQL. The DPC executes at DISPATCH_LEVEL, which is still too high for anything that touches pageable memory — but it is low enough to run the bulk of device servicing without starving other interrupts.
The essence of the DPC is deferring execution to gentler circumstances. It is the kernel’s primary tool for keeping ISRs short.
2. IRQL Levels: A Precise Map
The Interrupt Request Level (IRQL) is a per-processor priority that determines what code may run and what it may do. Any routine running at DISPATCH_LEVEL or above is not preemptable, runs to completion, and must reside in non-paged memory.
| IRQL Name | Value | Notes |
|---|---|---|
| PASSIVE_LEVEL | 0 | Normal user/kernel thread execution; paging and waiting allowed |
| APC_LEVEL | 1 | Asynchronous Procedure Calls |
| DISPATCH_LEVEL | 2 | DPC execution, scheduler, spin locks; no paging, no waiting |
| DIRQL | 3–11 (device-dependent) | Hardware ISRs run here |
An ISR at DIRQL cannot call functions that require PASSIVE_LEVEL. It instead schedules a DPC, which the kernel later runs at DISPATCH_LEVEL. Because DISPATCH_LEVEL still forbids page faults and blocking waits, a DPC routine and all data it touches must be non-paged.

3. The KDPC Structure Dissected
The KDPC is the structure in which the kernel keeps the state of a Deferred Procedure Call. It has always been explicitly undocumented — Microsoft labels it an opaque structure and warns drivers not to set members directly. The published layout from WDK/OSR headers is:
typedef struct _KDPC {
UCHAR Type; // DpcObject or ThreadedDpcObject
UCHAR Importance; // Low / Medium / High
USHORT Number; // target processor (directed DPCs)
LIST_ENTRY DpcListEntry; // links into per-processor DPC queue
PKDEFERRED_ROUTINE DeferredRoutine; // pointer to the callback function
PVOID DeferredContext; // driver-supplied context value
PVOID SystemArgument1; // extra arg passed to callback
PVOID SystemArgument2; // extra arg passed to callback
__volatile PVOID DpcData; // internal; pointer to KDPC_DATA
} KDPC, *PKDPC, *PRKDPC;
| Field | Purpose |
|---|---|
| Type | Distinguishes a normal DpcObject from a ThreadedDpcObject |
| Importance | Controls queue insertion: MediumImportance = tail, HighImportance = head |
| Number | Target logical processor, set via KeSetTargetProcessorDpc |
| DeferredRoutine | Pointer to the KDEFERRED_ROUTINE callback |
| DeferredContext | Opaque context the driver receives back in the callback |
| SystemArgument1/2 | Caller-supplied arguments passed through to the callback |
| DpcData | Volatile internal pointer to the per-processor KDPC_DATA; non-NULL while queued |
The DpcData field is the kernel’s bookkeeping anchor: before Windows 8.1 it pointed directly at a KDPC_DATA structure, and its non-NULL state indicates the DPC is currently queued. Because DeferredRoutine is a raw function pointer inside a writable structure, it is also a corruption target — covered in §10.
4. The DPC Lifecycle: From ISR to Callback
A DPC moves through four stages: allocate → initialize → queue → drain.
| API Function | Purpose |
|---|---|
| KeInitializeDpc | Initializes a KDPC, binding a DeferredRoutine and DeferredContext |
| KeInsertQueueDpc | Inserts the KDPC into the per-processor queue; returns FALSE if already queued |
| IoRequestDpc | Convenience wrapper called from ISR context for the DpcForIsr pattern |
| KeRemoveQueueDpc | Removes a pending (not-yet-fired) DPC from the queue |
Kernel code first allocates a KDPC in non-paged pool (or the device extension) so the object is resident when referenced from the ISR.
// C1 — allocate and initialize a DPC object
PKDPC pDpc = ExAllocatePool2(POOL_FLAG_NON_PAGED, sizeof(KDPC), 'cpDD');
if (pDpc) {
KeInitializeDpc(pDpc, MyCustomDpc, DeviceContext); // routine + context
}
The callback must match the KDEFERRED_ROUTINE signature and runs at DISPATCH_LEVEL:
// C2 — DPC callback stub
VOID MyCustomDpc(
_In_ PKDPC Dpc,
_In_opt_ PVOID DeferredContext,
_In_opt_ PVOID SystemArgument1,
_In_opt_ PVOID SystemArgument2)
{
UNREFERENCED_PARAMETER(Dpc);
ASSERT(KeGetCurrentIrql() == DISPATCH_LEVEL); // invariant
// Non-paged, bounded work only — no waits, no page faults.
}
The ISR queues the DPC. The return value of KeInsertQueueDpc enforces the single-instantiation guarantee: only one instance of a given KDPC can be queued at a time, so queuing it twice before it fires runs the routine once.
// C3 — queue from a mock ISR
BOOLEAN queued = KeInsertQueueDpc(pDpc, Arg1, Arg2);
if (!queued) {
// Already pending on a queue — the earlier request still stands.
}
Device drivers commonly use the wrapper from inside their InterruptService routine:
// C4 — DpcForIsr pattern
BOOLEAN MyIsr(_In_ PKINTERRUPT Interrupt, _In_ PVOID Context) {
PDEVICE_OBJECT devObj = (PDEVICE_OBJECT)Context;
// ...acknowledge hardware quickly...
IoRequestDpc(devObj, devObj->CurrentIrp, NULL); // schedules DpcForIsr
return TRUE;
}
When the processor returns from the interrupt, it checks its DPC queue; if entries are pending, the kernel raises IRQL to DISPATCH_LEVEL, drains the queue by invoking each DeferredRoutine, then lowers IRQL back down.
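The queue-then-drain lifecycle and the single-instantiation guarantee can be illustrated with a user-space toy model. Nothing here is kernel code: toy_dpc, toy_dpc_queue, and the toy_ functions are hypothetical names, the fixed-size array replaces the kernel's linked list, and the queued flag stands in for the "DpcData is non-NULL while queued" idea described earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_QUEUE_MAX 16

typedef struct toy_dpc {
    void (*routine)(struct toy_dpc *dpc, void *context);
    void *context;
    bool queued; /* plays the role of "DpcData is non-NULL" */
} toy_dpc;

typedef struct toy_dpc_queue {
    toy_dpc *slots[TOY_QUEUE_MAX];
    size_t depth;
} toy_dpc_queue;

/* Returns false when the object is already pending (or the toy queue is
 * full), mirroring the FALSE return described for KeInsertQueueDpc. */
bool toy_insert_queue_dpc(toy_dpc_queue *q, toy_dpc *dpc)
{
    assert(q != NULL && dpc != NULL);
    if (dpc->queued || q->depth == TOY_QUEUE_MAX)
        return false;
    dpc->queued = true;
    q->slots[q->depth++] = dpc;
    return true;
}

/* Drain in insertion order; returns how many routines ran. */
size_t toy_drain(toy_dpc_queue *q)
{
    assert(q != NULL);
    size_t ran = 0;
    for (size_t i = 0; i < q->depth; i++) {
        toy_dpc *dpc = q->slots[i];
        dpc->queued = false; /* no longer pending once it runs */
        dpc->routine(dpc, dpc->context);
        ran++;
    }
    q->depth = 0;
    return ran;
}

/* Example routine: counts invocations through its context pointer. */
void toy_count_routine(toy_dpc *dpc, void *context)
{
    (void)dpc;
    ++*(int *)context;
}
```

Queuing the same object twice before a drain leaves one pending entry, so the routine runs once; after the drain the object can be queued again.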

5. Per-Processor DPC Queues and KPRCB
Each logical processor owns a separate DPC queue, stored as a KDPC_DATA structure inside the processor’s KPRCB (Kernel Processor Control Block). This avoids cross-CPU locking on the common path.
KDPC_DATA carries the queue head, depth, count, and a spin lock:
typedef struct _KDPC_DATA {
LIST_ENTRY DpcListHead; // queued KDPC objects
ULONG DpcLock; // spin lock protecting the list
volatile ULONG DpcQueueDepth; // pending DPCs
ULONG DpcCount; // running total
} KDPC_DATA, *PKDPC_DATA;
Exact KDPC_DATA field names vary by kernel build; confirm against a live PDB with dt nt!_KDPC_DATA before relying on offsets.
Because each queue is per-processor, the target processor of a DPC determines which CPU drains it. By default a DPC runs on the CPU that queued it, but it can be pinned elsewhere (§6) — a property attackers exploit to manipulate specific cores.

6. Controlling DPC Behaviour
| API Function | Purpose |
|---|---|
| KeSetImportanceDpc | Sets Importance; HighImportance inserts at the queue head |
| KeSetTargetProcessorDpc | Pins the DPC to a specific logical processor (directed DPC) |
| KeRemoveQueueDpc | Dequeues a pending DPC; fails once the routine is already running |
DPCs have three priority levels — low, medium, high. Importance influences KeInsertQueueDpc: high-importance DPCs go to the head of the queue and are serviced first.
A directed DPC is created by binding it to a CPU before queuing. The pattern below — iterating over KeNumberProcessors and targeting each core — is the same primitive a rootkit weaponizes for CPU lockdown, so treat it as an educational illustration only:
// C5 — directed DPC setup (educational pattern)
for (CCHAR cpu = 0; cpu < KeNumberProcessors; cpu++) {
KeInitializeDpc(&pDpcArray[cpu], MyCustomDpc, NULL);
KeSetTargetProcessorDpc(&pDpcArray[cpu], cpu); // pin to logical CPU
KeSetImportanceDpc(&pDpcArray[cpu], HighImportance);
}
Once a DPC begins executing it cannot be removed; KeRemoveQueueDpc only rescinds a still-pending entry.
7. Threaded DPCs
Since Windows Server 2003, a KDPC can represent either a normal DPC or a threaded DPC. In the threaded variant, the kernel — if it can arrange it — calls the routine back at PASSIVE_LEVEL from a highest-priority thread, allowing more flexible work. Support can be disabled, in which case the threaded DPC falls back to running at DISPATCH_LEVEL exactly like a normal DPC.
You initialize one with KeInitializeThreadedDpc and a CustomThreadedDpc routine. Because that routine can run at either PASSIVE_LEVEL or DISPATCH_LEVEL, it must synchronize correctly at both IRQLs:
// C7 — threaded DPC with dual-IRQL guard
KeInitializeThreadedDpc(&g_ThreadedDpc, MyThreadedDpc, NULL);
VOID MyThreadedDpc(_In_ PKDPC Dpc, _In_opt_ PVOID Ctx,
_In_opt_ PVOID A1, _In_opt_ PVOID A2) {
ASSERT(KeGetCurrentIrql() <= DISPATCH_LEVEL); // may be PASSIVE or DISPATCH
// Use locks valid at both levels.
}
Threaded DPCs should be preferred over ordinary DPCs unless a particular DPC must never be preempted, not even by another DPC.
8. Timer DPCs and KTIMER
A DPC is also the callback mechanism for kernel timers. You associate a KDPC with a KTIMER and arm it; on expiry the kernel queues the DPC. KeSetTimerEx supports both one-shot and periodic timers.
// C6 — periodic timer DPC
KeInitializeTimerEx(&g_Timer, NotificationTimer);
KeInitializeDpc(&g_TimerDpc, MyCustomDpc, NULL);
LARGE_INTEGER due;
due.QuadPart = -10LL * 1000 * 1000; // 1 second, relative
KeSetTimerEx(&g_Timer, due, 1000 /* ms period */, &g_TimerDpc);
Windows uses special timer DPCs internally for timer expiration and context switching. The same primitive, a recurring timer pointed at a non-paged callback, is the cleanest way a driver schedules background work, and the cleanest way a malicious driver re-enters its payload (§10).
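The due-time arithmetic in the timer example is easy to get wrong, so here is a small portable helper that reproduces it. It assumes the convention used above: due times count 100-nanosecond units and a negative value means "relative to now". The names toy_relative_due_time_ms and TOY_100NS_PER_MS are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 1 millisecond = 10,000 units of 100 nanoseconds. */
#define TOY_100NS_PER_MS 10000LL

/* Convert a delay in milliseconds to a relative due time.
 * The result is negative (or zero) because negative means "relative". */
int64_t toy_relative_due_time_ms(uint32_t milliseconds)
{
    int64_t due = -(int64_t)milliseconds * TOY_100NS_PER_MS;
    assert(due <= 0);
    return due;
}
```

For a one-second delay this yields the same value as the expression -10LL * 1000 * 1000 in the snippet above.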
9. The DPC Watchdog and Debugging
The kernel runs a DPC watchdog. Bug Check 0x00000133 (DPC_WATCHDOG_VIOLATION) fires when the watchdog detects either a single long-running DPC or a prolonged time spent at DISPATCH_LEVEL or above. Microsoft's driver guidance is that a DPC should run no longer than 100 microseconds and an ISR no longer than 25 microseconds; these are design budgets, and the watchdog's own timeouts are considerably longer. A malicious DPC spin-loop can therefore inadvertently trip the watchdog and crash the host.
Inspect live DPC state in the kernel debugger:
kd> !dpcs ; list pending DPCs per processor
kd> dt nt!_KDPC ; KDPC layout for this build
kd> dt nt!_KDPC_DATA ; per-processor queue structure
kd> !prcb ; processor control block (contains DpcData)
kd> !pcr ; processor control region
!dpcs reveals each queued DPC’s DeferredRoutine address, the single most useful artifact, since an unknown or non-image-backed routine address is a strong anomaly.
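Checking whether a routine address is image-backed amounts to a range lookup against the loaded-module list. The sketch below assumes you have already exported that list (for example from a debugger or memory-forensics tool) into an array; toy_module and toy_find_module are hypothetical names, and the addresses in any usage are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct toy_module {
    const char *name; /* image name, e.g. a driver file name */
    uint64_t base;    /* load address                         */
    uint64_t size;    /* image size in bytes                  */
} toy_module;

/* Returns the module whose [base, base + size) range contains addr,
 * or NULL when the address is not backed by any listed image. */
const toy_module *toy_find_module(const toy_module *mods, size_t count,
                                  uint64_t addr)
{
    assert(mods != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        /* Written as a subtraction so base + size cannot overflow. */
        if (addr >= mods[i].base && addr - mods[i].base < mods[i].size)
            return &mods[i];
    }
    return NULL;
}
```

A NULL result for a DeferredRoutine address is the anomaly worth triaging first; a hit still needs the module itself to be trusted.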
10. Common Attacker Techniques
DPCs give kernel-mode malware a high-IRQL execution surface. Because code at DISPATCH_LEVEL is non-preemptable and runs to completion, it is ideal cover for Direct Kernel Object Manipulation (DKOM).
| Technique | Description |
|---|---|
| CPU lockdown / freeze-other-CPUs | Queue a directed KDPC to every non-current CPU via KeSetTargetProcessorDpc and spin, raising all secondary cores to DISPATCH_LEVEL to block interruption during a DKOM patch |
| Timer DPC payload | Arm a KTIMER whose DeferredRoutine points at attacker-controlled non-paged code, for recurring stealth execution |
| KDPC hijacking | Overwrite DeferredRoutine in a legitimate queued KDPC to redirect execution to a payload |
| Driver-based persistence | Load a malicious signed/BYOVD driver that registers a recurring timer DPC at load time |
The CPU-lockdown pattern is especially relevant to defenders: by parking every other core at DISPATCH_LEVEL, the rootkit can unlink processes, patch EDR callbacks, or hide drivers while no scheduler or AV thread can run.

11. Defensive Strategies & Detection
DPC objects live entirely in kernel memory and are not directly observable from user mode, so detection focuses on the driver that installs them and on kernel ETW timing telemetry.
Sysmon and Windows event telemetry:
| Event ID | Source | Relevance |
|---|---|---|
| 6 | Sysmon: Driver Loaded | Fires on every driver load; primary signal for kernel modules that register DPC routines |
| 7 | Sysmon: Image Loaded | Catches unsigned/anomalous modules entering kernel space |
| 7045 | Service Control Manager | New kernel-mode driver, especially from a non-standard path |
| 7040 | Service Control Manager | Service start-type change; driver persistence |
ETW providers: The NT Kernel Logger session with EVENT_TRACE_FLAG_DPC and EVENT_TRACE_FLAG_INTERRUPT records per-DPC timing and the routine address, exposing abnormally long-running or unknown-address DPC routines. Microsoft-Windows-Kernel-Processor-Power surfaces IRQL/watchdog events. Verify the exact flag constants against the current WDK evntrace.h.
Sigma anchor — unsigned/expired driver load:
title: Suspicious Kernel Driver Load (Unsigned or Expired)
logsource:
    product: windows
    service: sysmon
detection:
    selection_unsigned:
        EventID: 6
        Signed: 'false'
    selection_expired:
        EventID: 6
        SignatureStatus: 'Expired'
    selection_path:
        EventID: 6
        ImageLoaded|contains: '\Temp\'
    condition: selection_unsigned or selection_expired or selection_path
level: high
Hunt additionally for EventID 6 where ImageLoaded resolves outside \SystemRoot\System32\drivers\.
Hardening:
| Mitigation | Description |
|---|---|
| Driver Signature Enforcement (DSE) | Default on 64-bit Windows; blocks unsigned drivers that would install DPC routines |
| HVCI | Protects kernel code pages, raising the bar for DPC shellcode and DeferredRoutine overwrite |
| Kernel CET | Hardware shadow stack mitigates ROP-based DPC hijacking |
| DPC Watchdog | Built-in; Bug Check 0x133 catches long-running DPC loops, including malicious spin-locks |
| Vulnerable Driver Blocklist | HKLM\SYSTEM\CurrentControlSet\Control\CI\Config\VulnerableDriverBlocklistEnable blocks known BYOVD primitives |
| WDAC / Memory Integrity | Restrict which drivers may load, shrinking the DPC-abuse attack surface |
12. Tools for DPC Analysis
| Tool | Description | Link |
|---|---|---|
| WinDbg | !dpcs, dt nt!_KDPC, !prcb, !pcr live queue inspection | microsoft.com |
| Process Hacker | Driver/service enumeration and kernel module listing | processhacker.sourceforge.io |
| Windows Performance Recorder / xperf | Captures DPC/ISR ETW timing and routine addresses | microsoft.com |
| Sysmon | Driver-load (EID 6) and image-load (EID 7) telemetry | sysinternals.com |
| Volatility | Memory-forensic enumeration of drivers and kernel callbacks | volatilityfoundation.org |
| Ghidra | Static analysis of suspect drivers for KeInsertQueueDpc usage | ghidra-sre.org |
13. MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Rootkit | T1014 | ETW DPC routine-address anomalies; !dpcs unknown routines |
| Boot/Logon Autostart: Kernel Modules | T1547.006 | Sysmon EID 6 / Event 7045 driver loads |
| Exploitation for Privilege Escalation | T1068 | HVCI/CET violations; KDPC.DeferredRoutine corruption |
| Impair Defenses: Disable/Modify Tools | T1562.001 | CPU-freeze DPC pattern halting EDR threads; watchdog 0x133 |
| Native API | T1106 | Driver use of KeInitializeDpc / KeInsertQueueDpc |
No dedicated ATT&CK sub-technique exists for DPC abuse as of ATT&CK v15; the techniques above are the parents. Verify current IDs at attack.mitre.org before publishing.
Summary
- DPCs are the kernel’s mechanism for deferring interrupt work from high-IRQL ISRs down to DISPATCH_LEVEL, keeping ISRs under their 25 µs budget.
- The opaque KDPC structure carries the DeferredRoutine, context, arguments, and a DpcData pointer that marks whether it is queued on a per-processor KDPC_DATA list in the KPRCB.
- The lifecycle runs allocate → KeInitializeDpc → KeInsertQueueDpc / IoRequestDpc → per-CPU drain at DISPATCH_LEVEL, with a single-instantiation guarantee per object.
- Rootkits abuse directed DPCs for CPU lockdown, timer DPCs for stealth re-entry, and DeferredRoutine corruption for hijacking, mapping to T1014, T1547.006, and T1562.001.
- Detect via Sysmon Event ID 6 driver loads, NT Kernel Logger DPC timing telemetry, and the DPC watchdog (0x133); harden with DSE, HVCI, Kernel CET, and the vulnerable driver blocklist.
Related Tutorials
- APCs: Asynchronous Procedure Calls and Thread Hijacking Surface
- IRQL Levels: Interrupt Request Priorities Explained
- System Calls and SSDT: How User Mode Reaches the Kernel
- Access Tokens and Privileges: The Kernel’s Security Context
- SIDs and Security Descriptors: Identity in Windows Security
IRQL Levels: Interrupt Request Priorities Explained
Objective: Understand the Windows kernel’s Interrupt Request Level (IRQL) priority system — what each level means numerically and symbolically, how the HAL arbitrates hardware and software interrupts, which APIs query and change the IRQL, what kernel operations are legal at each level, and how malicious kernel code abuses IRQL semantics to evade defenders.
1. What Is an IRQL?
An Interrupt Request Level (IRQL) is a per-processor priority value that determines which kernel-mode support routines the currently executing code may legally call. It is an integer in the range 0–31, stored as type KIRQL (a typedef for UCHAR). Three levels — PASSIVE_LEVEL, APC_LEVEL, and DISPATCH_LEVEL — are referred to symbolically; the rest are usually named by value.
IRQL is per-processor, not per-thread. On x86 it lives in the Irql field of the _KPCR (Kernel Processor Control Region); on x64 it is mapped to the CR8 register (Task Priority Register). When the processor raises its IRQL, all interrupts at or below that level are masked. Higher-numbered interrupts preempt all lower-IRQL processing; once handled, the processor returns to the previous level. Raising and lowering must follow strict stack discipline — you only lower back to a level you previously raised from.
2. The IRQL Hierarchy
The Hardware Abstraction Layer (HAL) maps physical interrupt vectors to software IRQLs. The count of levels is architecture-dependent: x64 and Itanium expose 16 IRQLs; x86 exposes 32, owing to differences in interrupt-controller hardware. The canonical wdm.h symbolic definitions differ across architectures.
| Symbolic Name | x64 Value | x86 Value | Description |
|---|---|---|---|
| PASSIVE_LEVEL / LOW_LEVEL | 0 | 0 | Normal thread execution; nothing masked |
| APC_LEVEL | 1 | 1 | APC delivery and page-fault handling |
| DISPATCH_LEVEL | 2 | 2 | Thread scheduler / DPC queue |
| Device IRQLs (DIRQL) | 3–11 | 3–26 | Hardware device interrupts |
| CMCI_LEVEL | 5 | 5 | Corrected Machine Check Interrupt (falls inside the device range) |
| CLOCK_LEVEL | 13 | 28 | System clock timer |
| IPI_LEVEL / DRS_LEVEL | 14 | 29 | Inter-Processor Interrupt |
| POWER_LEVEL | 14 | 30 | Power failure |
| PROFILE_LEVEL | 15 | 27 | Profiling timer |
| HIGH_LEVEL | 15 | 31 | Highest level; all maskable interrupts blocked |
Higher value = higher priority. A device interrupt at DIRQL 8 preempts a DPC at DISPATCH_LEVEL (2), which itself preempts ordinary thread code at PASSIVE_LEVEL (0).
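The preemption rule can be modelled in a few lines of Python. This is a user-mode sketch, not kernel code: the function name `next_interrupt` and the idea of passing pending interrupts as a plain list of IRQL values are assumptions made for illustration, and the numeric levels are the x64 values from the table.

```python
# Toy model of per-processor interrupt masking (illustrative, not kernel code).
PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL = 0, 1, 2


def next_interrupt(current_irql, pending):
    """Return the IRQL of the pending interrupt that would preempt, or None.

    Interrupts at or below the current IRQL stay masked; among the rest,
    the highest level is serviced first.
    """
    deliverable = [irql for irql in pending if irql > current_irql]
    return max(deliverable) if deliverable else None


# A device interrupt at DIRQL 8 preempts a DPC running at DISPATCH_LEVEL.
preempting = next_interrupt(DISPATCH_LEVEL, [APC_LEVEL, 8])
# At DIRQL 8, a pending DPC (2) and an equal-level interrupt (8) stay masked.
masked = next_interrupt(8, [DISPATCH_LEVEL, 8])
```

Here `preempting` is 8 and `masked` is `None`. The model leaves out everything the HAL does to map vectors to levels; it only captures the "strictly higher wins" comparison.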

3. Software IRQLs: PASSIVE, APC, and DISPATCH
The lowest three levels are software IRQLs — the kernel raises and lowers them without involving the interrupt controller.
PASSIVE_LEVEL (0) masks nothing. This is where normal kernel-mode thread code runs: DriverEntry, AddDevice, Unload, most dispatch routines, and driver-created worker threads. All blocking, paging, and synchronization primitives are available.
APC_LEVEL (1) masks Asynchronous Procedure Call interrupts only. The sole functional difference from PASSIVE_LEVEL is that APCs cannot interrupt the running code. Both levels imply a valid thread context and both permit access to pageable memory. Page-fault handling itself runs at APC_LEVEL.
DISPATCH_LEVEL (2) masks DISPATCH_LEVEL and APC_LEVEL. Critically, the thread scheduler is disabled — code here owns the processor until it lowers IRQL. Routines such as StartIo, DpcForIsr, IoTimer, Cancel (holding the cancel spin lock), and all DPC callbacks run here. Two hard rules apply: no access to paged memory, and no blocking waits.
| Feature | PASSIVE_LEVEL | APC_LEVEL | DISPATCH_LEVEL |
|---|---|---|---|
| Thread context | Yes | Yes | Not guaranteed |
| Scheduler active | Yes | Yes | No |
| Paged pool access | Yes | Yes | No |
| Blocking waits allowed | Yes | Yes | No |
4. Hardware IRQLs: DIRQL and Above
Levels at or above the device range are hardware IRQLs driven by the interrupt controller. A driver’s Device IRQL (DIRQL) is the SynchronizeIrql stored in its _KINTERRUPT object. When a device fires, the processor raises to that DIRQL and invokes the Interrupt Service Routine (ISR), a KSERVICE_ROUTINE.
At DIRQL, all interrupts at or below the driver’s level are masked, but higher-DIRQL devices, the clock, and power-failure interrupts may still preempt. Because the scheduler and lower-priority interrupts are blocked, ISRs must be minimal — they acknowledge the hardware, capture volatile state, and queue a DPC for the heavy lifting at DISPATCH_LEVEL.
Above DIRQL sit CLOCK_LEVEL, IPI_LEVEL (used by one processor to interrupt another), POWER_LEVEL, and HIGH_LEVEL. The general principle: the higher the IRQL, the shorter the code must run. Sustained work at high IRQL starves the entire processor.
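// KSERVICE_ROUTINE - runs at DIRQL; must be minimal
BOOLEAN MyInterruptServiceRoutine(
    PKINTERRUPT Interrupt, PVOID ServiceContext) {
    // ServiceContext is whatever the driver registered when connecting the
    // interrupt; this sketch assumes it is the device object.
    PDEVICE_OBJECT deviceObject = (PDEVICE_OBJECT)ServiceContext;
    UNREFERENCED_PARAMETER(Interrupt);
    // Acknowledge hardware, then defer heavy work to a DPC.
    // Do NOT touch paged memory here.
    IoRequestDpc(deviceObject, deviceObject->CurrentIrp, NULL);
    return TRUE;
}
5. Kernel APIs for IRQL Management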
Drivers query and adjust IRQL through a small, exported API surface in wdm.h.
| API Function | Purpose |
|---|---|
| KeGetCurrentIrql() | Returns the current processor IRQL; callable at any IRQL |
| KeRaiseIrql(NewIrql, &OldIrql) | Raises to NewIrql; saves prior level. NewIrql must be ≥ current |
| KeLowerIrql(OldIrql) | Restores a previously saved IRQL; only after a matching raise |
| KeRaiseIrqlToDpcLevel() | Raises to DISPATCH_LEVEL, returns old IRQL |
| KeAcquireSpinLock(&Lock, &OldIrql) | Acquires spin lock, raising to DISPATCH_LEVEL |
| KeReleaseSpinLock(&Lock, OldIrql) | Releases lock, restoring saved IRQL |
| KeAcquireSpinLockAtDpcLevel(&Lock) | Acquires lock without raising (caller already at DISPATCH_LEVEL) |
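The raise/lower contract in the table can be prototyped in user-mode Python to make the failure cases concrete. This is only a sketch: `Processor`, `BugCheck`, `raise_irql`, and `lower_irql` are hypothetical names standing in for the per-processor IRQL, a kernel bug check, `KeRaiseIrql`, and `KeLowerIrql`.

```python
# Toy model of the KeRaiseIrql / KeLowerIrql contract (illustrative only).
PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL = 0, 1, 2


class BugCheck(Exception):
    """Stands in for a kernel bug check in this model."""


class Processor:
    def __init__(self):
        self.irql = PASSIVE_LEVEL

    def raise_irql(self, new_irql):
        # Raising to a level below the current one violates the contract.
        if new_irql < self.irql:
            raise BugCheck("raise to %d below current %d" % (new_irql, self.irql))
        old_irql, self.irql = self.irql, new_irql
        return old_irql

    def lower_irql(self, saved_irql):
        # Lowering may only go back down to a previously saved level.
        if saved_irql > self.irql:
            raise BugCheck("lower to %d above current %d" % (saved_irql, self.irql))
        self.irql = saved_irql


cpu = Processor()
old = cpu.raise_irql(DISPATCH_LEVEL)   # like KeRaiseIrql(DISPATCH_LEVEL, &old)
cpu.lower_irql(old)                    # like KeLowerIrql(old)
```

Returning the old level from `raise_irql` mirrors the out-parameter of the real API and makes the save-then-restore pairing hard to forget.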
The exact signatures:
KIRQL KeGetCurrentIrql(void);
void KeRaiseIrql(
    _In_ KIRQL NewIrql,
    _Out_ PKIRQL OldIrql
);
void KeLowerIrql(_In_ KIRQL NewIrql); // restore saved old IRQL
KIRQL KeRaiseIrqlToDpcLevel(void);
The raise/lower discipline is enforced: calling KeRaiseIrql with a value lower than the current IRQL is a fatal error, and KeLowerIrql may only restore the level a prior KeRaiseIrql saved.
// Demonstrates the raise/lower stack discipline
VOID MyFunctionNeedingDispatchLevel(VOID) {
    KIRQL oldIrql;
    KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);
    // --- Critical section: no paged pool access here ---
    KeLowerIrql(oldIrql);
}
Spin locks couple mutual exclusion with IRQL: acquiring one raises to DISPATCH_LEVEL so the holder cannot be preempted by the scheduler on its processor.
KSPIN_LOCK MySpinLock;
KIRQL oldIrql;
KeInitializeSpinLock(&MySpinLock);
// KeAcquireSpinLock raises to DISPATCH_LEVEL internally
KeAcquireSpinLock(&MySpinLock, &oldIrql);
// ... protected shared-data access (non-paged only) ...
KeReleaseSpinLock(&MySpinLock, oldIrql); // restores oldIrql
A driver inspecting its own context queries the level directly:
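// Demonstrates KeGetCurrentIrql() usage and KIRQL type
NTSTATUS DriverDispatchCreate(PDEVICE_OBJECT DeviceObject, PIRP Irp) {
    KIRQL currentIrql = KeGetCurrentIrql();
    UNREFERENCED_PARAMETER(DeviceObject);
    // Expected: PASSIVE_LEVEL (0) in a dispatch routine
    DbgPrint("[MyDriver] Current IRQL: %u\n", (ULONG)currentIrql);
    // Complete the IRP so the function returns a defined status
    Irp->IoStatus.Status = STATUS_SUCCESS;
    Irp->IoStatus.Information = 0;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return STATUS_SUCCESS;
}
6. Memory Access Rules at Each IRQL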
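The single most consequential IRQL rule concerns paged memory. Code running above APC_LEVEL must not touch pageable memory: if the page is not resident, the resulting page fault is fatal. Resolving a page fault means reading the page back from disk through the I/O stack, which requires a wait and therefore a context switch, and that is impossible once the scheduler is disabled at DISPATCH_LEVEL. If the page happens to be resident the access succeeds, which is why these bugs often surface only intermittently.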
| Memory Pool | PASSIVE_LEVEL | APC_LEVEL | DISPATCH_LEVEL+ |
|---|---|---|---|
| Paged pool | Accessible | Accessible | Fatal page fault |
| Non-paged pool | Accessible | Accessible | Accessible |
Code at or above DISPATCH_LEVEL must therefore allocate from non-paged pool and operate only on locked or non-pageable memory (for example, buffers locked with MmProbeAndLockPages). Violating this rule produces the most common driver bug check, IRQL_NOT_LESS_OR_EQUAL (0x0000000A), or its driver-attributed variant DRIVER_IRQL_NOT_LESS_OR_EQUAL (0x000000D1).
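The table and rule above reduce to a small decision function. The Python sketch below is a model, not a kernel interface: `touch` and its `resident` flag are hypothetical, and the flag makes explicit one detail of the rule, namely that the fatal fault only occurs when the pageable page is not currently in memory.

```python
# Toy model of the paged-memory rule (illustrative; names are hypothetical).
DISPATCH_LEVEL = 2
IRQL_NOT_LESS_OR_EQUAL = 0x0000000A


def touch(pool, irql, resident=True):
    """Return "ok" or the bug-check code the access would produce."""
    if pool == "nonpaged":
        return "ok"                      # always resident, safe at any IRQL
    if irql < DISPATCH_LEVEL or resident:
        return "ok"                      # fault can be serviced, or none occurs
    return IRQL_NOT_LESS_OR_EQUAL        # page fault with the scheduler disabled


below_dispatch = touch("paged", irql=1, resident=False)
at_dispatch = touch("paged", irql=DISPATCH_LEVEL, resident=False)
```

`below_dispatch` is `"ok"` because the fault can be resolved, while `at_dispatch` is the 0x0000000A code. A resident paged access at DISPATCH_LEVEL also returns `"ok"` in this model, which is exactly the latent bug that Driver Verifier is designed to flush out.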
7. DPCs: The DISPATCH_LEVEL Workhorses
A Deferred Procedure Call (DPC) moves work out of the time-critical ISR into DISPATCH_LEVEL. The ISR queues a _KDPC object (via IoRequestDpc or KeInsertQueueDpc); the kernel drains the DPC queue as IRQL drops below DISPATCH_LEVEL. DpcForIsr handles per-IRP completion; CustomDpc and CustomTimerDpc serve driver-specific needs.
// KDEFERRED_ROUTINE - runs at DISPATCH_LEVEL
VOID MyDpcRoutine(
    PKDPC Dpc, PVOID DeferredContext,
    PVOID SystemArgument1, PVOID SystemArgument2) {
    // Safe: non-paged pool only.
    // Do NOT call KeWaitForSingleObject with a nonzero timeout.
    DbgPrint("[MyDpc] Running at DISPATCH_LEVEL\n");
}
A DPC that runs too long throttles the whole system and triggers DPC_WATCHDOG_VIOLATION (0x00000133) once sustained execution exceeds the watchdog threshold.
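The queue-then-drain behaviour can be sketched in Python. This is a simplified model under stated assumptions: `Dpc` and `DpcQueue` are hypothetical classes, the queue is strictly FIFO (DPC importance and target-processor selection are ignored), and `insert` imitates the way KeInsertQueueDpc refuses an object that is already queued.

```python
# Toy model of a per-processor DPC queue (illustrative, not kernel code).
from collections import deque


class Dpc:
    def __init__(self, routine):
        self.routine = routine
        self.queued = False


class DpcQueue:
    def __init__(self):
        self._queue = deque()

    def insert(self, dpc):
        """Mimic KeInsertQueueDpc: refuse an object that is already queued."""
        if dpc.queued:
            return False
        dpc.queued = True
        self._queue.append(dpc)
        return True

    def drain(self):
        """Run queued routines in FIFO order, as happens at DISPATCH_LEVEL."""
        results = []
        while self._queue:
            dpc = self._queue.popleft()
            dpc.queued = False           # the object may be re-queued from now on
            results.append(dpc.routine())
        return results


queue = DpcQueue()
dpc = Dpc(lambda: "deferred work done")
first = queue.insert(dpc)    # True: queued by the ISR
second = queue.insert(dpc)   # False: the same object is already in the queue
ran = queue.drain()
```

The second insert returning `False` is why an ISR that fires twice before the drain still produces a single DPC run for that object.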

8. APCs: The APC_LEVEL Mechanism
An Asynchronous Procedure Call (APC) executes a function in the context of a specific thread. Kernel APCs run at APC_LEVEL; user APCs are delivered when a thread returns to PASSIVE_LEVEL in a user-mode alertable wait. Drivers initialize them with KeInitializeApc and queue them with KeInsertQueueApc. Because APC_LEVEL still implies a valid thread context and permits paged access, certain dispatch routines raise to APC_LEVEL to serialize against APC delivery while remaining able to page in data.
9. Debugging IRQL With WinDbg
WinDbg exposes IRQL state on both live kernels and crash dumps.
$$ Show the IRQL the current processor was running at before break-in
!irql
$$ Examine the KPCR for processor 0
!pcr 0
$$ List pending DPCs
!dpcs
$$ Analyze a 0x0000000A bugcheck
!analyze -v
On x64 the IRQL is the CR8 register; you can read it and the _KPCR directly:
$$ dt (display type) shows the _KPCR of the current processor
dt nt!_KPCR @$pcr
$$ On x64, IRQL maps to CR8 (Task Priority Register)
r cr8
The IRQL contract is also expressed statically through SAL annotations in wdm.h, which static-analysis tooling verifies at build time:
// Illustrates IRQL annotation macros from wdm.h
_IRQL_requires_max_(DISPATCH_LEVEL)
VOID MyRoutineSafeAtOrBelowDispatch(VOID);
_IRQL_requires_(PASSIVE_LEVEL)
VOID MyRoutineRequiresPassive(VOID);
_IRQL_raises_(DISPATCH_LEVEL)
_IRQL_saves_
KIRQL MyRaiseRoutine(VOID);
10. IRQL in a Security Context
IRQL semantics become a security concern the moment attacker code reaches ring 0. Code running at DISPATCH_LEVEL owns its processor and is invisible to user-mode EDR hooks — an ideal vantage point for unhooking the SSDT, overwriting kernel callbacks, or hiding objects before defensive software can react. Because paged access above APC_LEVEL is fatal, IRQL violations also serve as a crude denial-of-service primitive: a single bad page touch produces an IRQL_NOT_LESS_OR_EQUAL blue screen.
The dominant delivery vector is Bring Your Own Vulnerable Driver (BYOVD) — loading a legitimately signed but exploitable driver to obtain kernel-IRQL execution without writing a new signed driver. Missing or incorrect IRQL SAL annotations frequently mask the very bugs these attacks exploit.

11. Common Attacker Techniques
| Technique | Description |
|---|---|
| BYOVD kernel execution | Load a signed-but-vulnerable driver (e.g. RTCore64.sys, dbutil_2_3.sys) to run code at kernel IRQL |
| EDR unhooking at DISPATCH_LEVEL | Patch SSDT entries or kernel callbacks while the scheduler is disabled, beating re-hook races |
| Rootkit concealment | Hide processes, files, and connections from DIRQL/DISPATCH_LEVEL, below user-mode visibility |
| Spin-lock starvation | Hold a spin lock at DISPATCH_LEVEL to monopolize a processor — driver-stack DoS |
| Deliberate IRQL fault | Force paged access above APC_LEVEL to bug-check the host (0x0000000A DoS) |
| DSE downgrade | Flip test-signing or pre-release flags to load unsigned kernel code |
12. Defensive Strategies & Detection
Driver loads are the chokepoint. Sysmon Event ID 6 (Driver Loaded) records ImageLoaded, Hashes, Signed, Signature, and SignatureStatus — the fields that expose unsigned or anomalously signed drivers and known-vulnerable BYOVD payloads. Event ID 7045 (and System log 7036/7040) surface drivers registered as services. PatchGuard violations of _KPCR/IDT/SSDT raise bug check 0x00000109 (CRITICAL_STRUCTURE_CORRUPTION); HVCI/Code-Integrity blocks land in Microsoft-Windows-CodeIntegrity/Operational (Event IDs 3001–3089) and Security Event ID 5038.
A starting Sigma rule for vulnerable-driver loads:
title: Suspicious Vulnerable Driver Load (Possible BYOVD)
logsource:
    product: windows
    service: sysmon
detection:
    selection_unsigned:
        EventID: 6
        Signed: 'false'
    selection_known_vuln:
        EventID: 6
        ImageLoaded|endswith:
            - '\RTCore64.sys'
            - '\dbutil_2_3.sys'
    condition: selection_unsigned or selection_known_vuln
level: high
ISR/DPC behavior can be traced through the NT Kernel Logger ETW provider with interrupt and DPC flags enabled:
xperf -on Base+Interrupt+DPC
xperf -d trace.etl
Hardening layers: enforce Driver Signature Enforcement and HVCI (M1048) so unsigned or tampered drivers cannot load even on a compromised kernel; enable the Microsoft Vulnerable Driver Blocklist (HKLM\SYSTEM\CurrentControlSet\Control\CI\Config\VulnerableDriverBlocklistEnable); restrict SeLoadDriverPrivilege to administrators (M1026); and run suspect drivers under Driver Verifier in a VM to force IRQL checks. Monitor bcdedit test-signing changes and the CI\Config registry path for downgrade attempts.
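Before deploying the Sigma rule from this section, its selection logic can be prototyped against exported Sysmon records. The Python sketch below assumes each event has already been parsed into a dictionary keyed by the Sysmon field names (EventID, Signed, ImageLoaded); the function name and the sample paths are hypothetical.

```python
# Prototype of the Sigma selections against Sysmon Event ID 6 records.
KNOWN_VULNERABLE = ("\\rtcore64.sys", "\\dbutil_2_3.sys")


def is_suspicious_driver_load(event):
    """event: dict with the Sysmon fields EventID, Signed, ImageLoaded."""
    if event.get("EventID") != 6:
        return False
    unsigned = str(event.get("Signed", "")).lower() == "false"
    image = str(event.get("ImageLoaded", "")).lower()
    known_vuln = image.endswith(KNOWN_VULNERABLE)
    return unsigned or known_vuln


# Hypothetical sample: a signed driver whose file name is on the watch list.
sample = {"EventID": 6, "Signed": "true", "ImageLoaded": "C:\\Temp\\RTCore64.sys"}
flagged = is_suspicious_driver_load(sample)
```

Matching on file name alone is easy to evade by renaming, so in practice the Hashes field is the more durable key; the name match is kept here only because it is what the rule expresses.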
MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Rootkit | T1014 | Sysmon EID 6 unsigned/anomalous drivers; HVCI logs |
| Create or Modify System Process: Windows Service | T1543.003 | EID 7045 / System 7036 driver-service install |
| Impair Defenses: Disable or Modify Tools | T1562.001 | EDR callback integrity, PatchGuard 0x109 |
| Impair Defenses: Downgrade Attack | T1562.010 | CI\Config registry + bcdedit test-signing audit |
| Exploitation for Priv-Esc | T1068 | BYOVD load (EID 6) preceding kernel-write activity |
| Escape to Host | T1611 | Kernel-IRQL execution from container context |
13. Tools for IRQL Analysis
| Tool | Description | Link |
|---|---|---|
| WinDbg | !irql, !pcr, !dpcs, !analyze -v on bug checks | microsoft.com |
| Driver Verifier | Forces IRQL/pool/deadlock checks on a target driver | microsoft.com |
| Sysmon | Driver-load (EID 6) and service (7045) telemetry | microsoft.com |
| xperf / WPA | ETW interrupt and DPC tracing | microsoft.com |
| Process Hacker | Live driver and kernel-module enumeration | processhacker.sourceforge.io |
| Volatility | Memory-forensic driver and callback inspection | volatilityfoundation.org |
| Ghidra | Static analysis of suspect driver binaries | ghidra-sre.org |
Summary
- IRQL is a per-processor priority register that gates which kernel routines code may legally call and which interrupts are masked.
- The HAL maps hardware vectors onto 16 IRQLs on x64 and 32 on x86; higher value preempts lower, and raising/lowering must follow strict stack discipline.
- Above APC_LEVEL the scheduler is disabled and paged memory is off-limits; touching it triggers IRQL_NOT_LESS_OR_EQUAL (0x0000000A).
- Attackers reach kernel IRQL through BYOVD to unhook EDR, conceal rootkits, or bug-check the host as a DoS, mapped to T1014, T1543.003, T1562.001, and T1068.
- Detect via Sysmon Event ID 6, the vulnerable-driver blocklist, HVCI/DSE enforcement, and SeLoadDriverPrivilege restriction.
Related Tutorials
- Windows Scheduler Internals: Priority Levels, Quantum, and Thread Selection
- DPCs: Deferred Procedure Calls and Interrupt Deferral
- Access Tokens and Privileges: The Kernel’s Security Context
- SIDs and Security Descriptors: Identity in Windows Security
- Fibers: User-Mode Cooperative Threads
System Calls and SSDT: How User Mode Reaches the Kernel
Objective: Understand how Windows user-mode code transitions to ring 0 via the SYSCALL instruction, how the System Service Descriptor Table (SSDT) dispatches those calls, and why SSDT hooking, direct syscalls, and modern kernel hardening (PatchGuard, HVCI, MWTI ETW) are central to both offensive tradecraft and defensive telemetry.
1. Why System Calls Exist
User-mode code runs at CPL 3 (ring 3). The kernel runs at CPL 0 (ring 0). Privileged operations — opening another process, mapping physical pages, accessing the file system, talking to drivers — require ring 0. The CPU enforces this with segment descriptors and page-table permissions; a direct CALL into kernel memory from user mode faults immediately.
The bridge is a controlled transition: the user-mode side specifies what it wants by number, the CPU switches to ring 0 at a fixed, kernel-controlled entry point, and the kernel validates and dispatches. That number is the System Service Number (SSN), and the dispatch table is the SSDT.
This design has two consequences that drive everything in this post:
- The kernel entry point is fixed and well-known, so an attacker who can write to ring 0 memory (a kernel rootkit) can redirect every syscall by patching one table.
- The user-mode side of the syscall (the stub in ntdll.dll) is not privileged, so an EDR can hook it, and a red teamer can bypass that hook by issuing the SYSCALL instruction themselves.
2. The Mechanics of SYSCALL on x64
SYSCALL is a dedicated x86-64 instruction designed for fast ring-3 → ring-0 transitions. It does not use the legacy interrupt gate (int 2Eh); it reads MSRs and jumps.
| MSR | Address | Role |
|---|---|---|
| IA32_LSTAR | 0xC0000082 | Kernel RIP to jump to on SYSCALL from 64-bit user mode. Holds KiSystemCall64 (or KiSystemCall64Shadow with KPTI). |
| IA32_STAR | 0xC0000081 | Encodes the kernel and user CS/SS selectors for SYSCALL/SYSRET. |
| IA32_FMASK | 0xC0000084 | RFLAGS mask: bits cleared on entry (notably IF, masking interrupts during the prologue). |
The x64 Windows syscall ABI:
- EAX holds the SSN (the index into KiServiceTable).
- R10 holds the first argument. The user-mode stub copies RCX into R10 because SYSCALL itself clobbers RCX with the return RIP.
- RDX, R8, R9, then the stack carry the remaining arguments, matching the standard x64 calling convention.
A minimal user-mode stub, exactly as ntdll lays it out:
; NtFooBar: illustrative ntdll-style syscall stub (x64)
NtFooBar:
    mov r10, rcx        ; SYSCALL clobbers RCX; preserve arg0 in R10
    mov eax, 0x????     ; SSN, VERSION-SPECIFIC, resolve at runtime
    syscall             ; ring-3 -> ring-0 via LSTAR
    ret                 ; SYSRET returns here
The 32-bit predecessor was SYSENTER (with its entry point stored in IA32_SYSENTER_EIP). On 64-bit Windows, native code always uses SYSCALL, and 32-bit processes reach the kernel through the Wow64 transition into 64-bit code described in section 7.

3. KiSystemCall64: The Kernel Entry Point
When the CPU executes SYSCALL from user mode:
- It loads RIP from IA32_LSTAR (→ KiSystemCall64).
- It loads CS/SS from IA32_STAR (kernel selectors).
- It saves the old user RIP in RCX and old RFLAGS in R11.
- It clears RFLAGS bits per IA32_FMASK.
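The CPU-side steps above can be written down as a small state transformation. The Python sketch below is a model of the register effects only, not an emulator: the `Cpu` dataclass, the `syscall` function, and every address in the example are hypothetical values chosen for illustration.

```python
# Toy model of the register effects of SYSCALL (illustrative, not an emulator).
from dataclasses import dataclass


@dataclass
class Cpu:
    rip: int
    rflags: int
    rcx: int = 0
    r11: int = 0
    cpl: int = 3


def syscall(cpu, next_rip, lstar, fmask):
    """Apply the architectural steps listed above to a Cpu instance."""
    cpu.rcx = next_rip              # return address saved in RCX
    cpu.r11 = cpu.rflags            # old RFLAGS saved in R11
    cpu.rflags &= ~fmask            # bits set in IA32_FMASK are cleared
    cpu.rip = lstar                 # jump to the kernel entry point
    cpu.cpl = 0                     # ring 0; selectors come from IA32_STAR
    return cpu


# Hypothetical values: IF is bit 9 (0x200) of RFLAGS.
state = syscall(
    Cpu(rip=0x7FF000001000, rflags=0x246),
    next_rip=0x7FF000001002,
    lstar=0xFFFFF80000200000,
    fmask=0x200,
)
```

After the call, `state.rcx` holds the user return address and `state.rflags` is 0x46, with the interrupt flag cleared. This is also why the ntdll stub has to move its first argument out of RCX before executing the instruction.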
KiSystemCall64 then:
- Swaps GS via SWAPGS to access the per-CPU KPCR.
- Switches from the user stack to the kernel stack stored in the KPCR.
- Builds a KTRAP_FRAME capturing the user context.
- Indexes KeServiceDescriptorTable (or the Shadow variant for Win32k GUI calls) using EAX.
- Calls the resolved Nt* function.
- On return, restores the frame and executes SYSRET to drop back to ring 3.
Selected KTRAP_FRAME fields (see WDK wdm.h for the full layout):
| Field | Description |
|---|---|
| Rip | Saved user-mode instruction pointer (from RCX at entry). |
| Rsp | Saved user-mode stack pointer. |
| EFlags | Saved RFLAGS (from R11). |
| ErrCode | Processor error code; 0 for syscalls. |
With Kernel Page-Table Isolation (KPTI) active, IA32_LSTAR points instead at KiSystemCall64Shadow, a thin trampoline that swaps from the user CR3 (which maps only a minimal kernel trampoline) to the full kernel CR3 before falling through into the normal dispatcher. This is the Meltdown mitigation.
4. The SSDT and KSERVICE_TABLE_DESCRIPTOR
The “SSDT” in casual use refers to two related objects:
| Symbol | Description |
|---|---|
| KeServiceDescriptorTable | KSERVICE_TABLE_DESCRIPTOR covering the core Nt* services in ntoskrnl.exe. Exported on 32-bit Windows only; on x64 it is not exported and must be located through symbols or by scanning the syscall dispatcher. |
| KeServiceDescriptorTableShadow | Not exported. Adds a second entry for win32k!W32pServiceTable, the GUI/USER/GDI syscall surface. Rootkits historically located it by pattern scanning around KeAddSystemServiceTable or via debugger symbols. |
| KiServiceTable | The actual function-pointer table referenced by the descriptor. |
| KiArgumentTable | Parallel array of argument byte counts per service. |
Approximate layout from public symbols:
typedef struct _KSERVICE_TABLE_DESCRIPTOR {
    PULONG_PTR ServiceTable;   // -> KiServiceTable (encoded offsets on x64)
    PULONG     CounterTable;   // call counters (typically NULL in retail)
    ULONG      TableSize;      // number of services
    PUCHAR     ArgumentTable;  // bytes of stack args per service
} KSERVICE_TABLE_DESCRIPTOR, *PKSERVICE_TABLE_DESCRIPTOR;
The SSN (EAX) is split: the low 12 bits index the table, and bit 12 selects which descriptor, 0 for KeServiceDescriptorTable and 1 for the Win32k shadow table. This is how GUI syscalls (NtUserCreateWindowEx, NtGdiBitBlt, …) coexist with kernel-proper syscalls in the same SSN space.
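The SSN split is plain bit arithmetic, shown here as a Python sketch. The function name `split_ssn` and the two sample numbers are hypothetical; real SSNs vary by build and should not be read off this example.

```python
def split_ssn(eax):
    """Split a system service number into (table_selector, index)."""
    index = eax & 0xFFF              # low 12 bits: index into the service table
    table = (eax >> 12) & 0x1        # bit 12: 0 = core table, 1 = Win32k shadow
    return table, index


core_call = split_ssn(0x0026)   # hypothetical core Nt* service number
gui_call = split_ssn(0x1005)    # hypothetical Win32k service number
```

`core_call` decodes to table 0, index 0x26, and `gui_call` to table 1, index 5. A scanner that forgets the mask will index far past the end of the core table for any GUI syscall.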

5. The x64 Encoded-Offset Format
A critical detail anyone writing an SSDT scanner gets wrong the first time: on x64 Windows, KiServiceTable entries are not function pointers. Each entry is a 32-bit value encoding a signed offset from the base of KiServiceTable itself, with the low 4 bits used to communicate the argument-count category to the dispatcher.
The decode is:
// Recover the real Nt* function address from KiServiceTable[i]
ULONG_PTR DecodeSsdtEntry(PULONG ServiceTable, ULONG index)
{
    LONG encoded = (LONG)ServiceTable[index];    // signed 32-bit
    LONG offset  = encoded >> 4;                 // arithmetic shift
    return (ULONG_PTR)ServiceTable + offset;     // base + offset
}
The arithmetic right shift matters: it preserves the sign, allowing functions located before KiServiceTable in memory to be addressed. A naive unsigned >> 4 turns those negative offsets into large positive ones, so the scanner silently reports wrong addresses for them.
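The difference between the two shifts is easy to see offline. The Python sketch below re-implements the decode for a raw 32-bit entry and puts the naive unsigned version beside it; the table base and the encoded value are made-up numbers, and both function names are hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def decode_entry(table_base, raw):
    """Return (target_address, low_4_bits) for one 32-bit KiServiceTable entry."""
    signed = raw - (1 << 32) if raw & 0x80000000 else raw   # reinterpret as int32
    target = (table_base + (signed >> 4)) & MASK64          # >> is arithmetic here
    return target, raw & 0xF


def decode_entry_naive(table_base, raw):
    """Same decode with an unsigned shift: wrong for negative offsets."""
    return (table_base + (raw >> 4)) & MASK64


BASE = 0xFFFFF80000100000          # hypothetical KiServiceTable address
RAW = 0xFFFF0002                   # encodes offset -0x1000, low bits = 2

good, low_bits = decode_entry(BASE, RAW)
bad = decode_entry_naive(BASE, RAW)
```

`good` lands 0x1000 bytes below the table base, while `bad` points roughly 256 MB above it. Python integers are unbounded, so the explicit int32 reinterpretation is what stands in for the C cast to LONG.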
6. Tracing a Syscall End-to-End: NtOpenProcess
Following an OpenProcess call from a user-mode debugger target:
kernel32!OpenProcess
  └─> kernelbase!OpenProcess
        └─> ntdll!NtOpenProcess          ; the syscall stub
              mov r10, rcx
              mov eax, <SSN>             ; version-specific
              syscall
              ret
─────────── ring 3 / ring 0 boundary ───────────
CPU: RIP <- LSTAR (KiSystemCall64[Shadow])
nt!KiSystemCall64
  ├─ SWAPGS, switch to kernel stack
  ├─ build KTRAP_FRAME
  ├─ idx  = EAX & 0xFFF
  ├─ desc = (EAX & 0x1000) ? Shadow : KeServiceDescriptorTable
  ├─ fn   = desc->ServiceTable + (desc->ServiceTable[idx] >> 4)
  └─ call nt!NtOpenProcess
nt!NtOpenProcess
  ├─ PsLookupProcessByProcessId (ClientId -> EPROCESS)
  ├─ SeAccessCheck (DesiredAccess vs token)
  └─ ObOpenObjectByPointer -> HANDLE
SYSRET back to user-mode RIP saved in RCX
The SSN for NtOpenProcess changes between Windows builds; never hardcode it. Tooling either resolves it from the on-disk ntdll.dll, parses the in-memory stub, or consults a versioned table such as j00ru’s syscall reference.
A practical SSN extractor parses the Nt* export’s first instructions and reads the MOV EAX, imm32 (B8 xx xx xx xx) byte pattern:
# Parse SSNs from a clean on-disk ntdll.dll (illustrative)
import struct

import pefile

pe = pefile.PE(r"C:\Windows\System32\ntdll.dll", fast_load=False)
pe.parse_data_directories()
image = pe.get_memory_mapped_image()
for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
    name = exp.name.decode() if exp.name else ""
    if not name.startswith("Nt"):
        continue
    stub = image[exp.address: exp.address + 24]
    # Classic stub: 4C 8B D1 B8 ss ss 00 00 F6 04 25 ... 0F 05 C3
    # The length check skips truncated or forwarded exports.
    if len(stub) >= 8 and stub[0:4] == b"\x4c\x8b\xd1\xb8":
        ssn = struct.unpack("<I", stub[4:8])[0]
        print(f"{name:40s} SSN=0x{ssn:04x}")
Red-team loaders use the same idea at runtime, sometimes against a fresh copy of ntdll read from disk to defeat in-memory EDR hooks (the “Perun’s Fart” / fresh-copy pattern).
7. Wow64 and Heaven’s Gate
A 32-bit process on 64-bit Windows still ultimately issues a 64-bit SYSCALL, because the only kernel entry the CPU honors from a 64-bit process is KiSystemCall64. The Wow64 layer bridges this:
32-bit app -> wow64cpu!CpupReturnFromSimulatedCode
    -> far jmp 0x33:<addr>       ; CS=0x23 (32-bit) -> CS=0x33 (64-bit)
    -> wow64.dll / 64-bit ntdll
    -> SYSCALL
The 0x33 / 0x23 CS selector switch is the so-called Heaven’s Gate (community label, not an official Microsoft term). Malware abuses it to:
- Execute 64-bit shellcode from a process that defenders are monitoring as a 32-bit target.
- Issue syscalls that bypass 32-bit ntdll hooks if the EDR only instruments the Wow64 layer.
Analysts should treat any unexpected far jmp to CS=0x33 in 32-bit code as a strong IOC.
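A first-pass hunt for that far jump can be done with a raw byte scan. The Python sketch below looks for the direct encoding `EA <imm32> 33 00` (jmp far 0x33:imm32) in a buffer of 32-bit code; the function name and the sample bytes are hypothetical. It is deliberately naive: it does not disassemble, so it can match data bytes by accident, and it does not cover the push/retf variant of the same transition.

```python
def find_far_jumps_to_cs33(code):
    """Return offsets of `jmp far 0x33:imm32` (EA imm32 33 00) in 32-bit code."""
    hits = []
    for offset in range(len(code) - 6):
        # Opcode EA, four offset bytes, then the 16-bit selector 0x0033.
        if code[offset] == 0xEA and code[offset + 5:offset + 7] == b"\x33\x00":
            hits.append(offset)
    return hits


# Hypothetical buffer: two NOPs, a far jump to 0x33:0x12345678, then RET.
sample = b"\x90\x90" + b"\xea\x78\x56\x34\x12\x33\x00" + b"\xc3"
offsets = find_far_jumps_to_cs33(sample)
```

`offsets` contains the single hit at offset 2. Any hit inside wow64cpu.dll itself is expected; the interesting ones are in other modules or in private memory.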
8. SSDT Hooking: The Classic Rootkit Technique
On 32-bit Windows, which has no PatchGuard, kernel rootkits manipulated KiServiceTable directly:
- Locate the descriptor (KeServiceDescriptorTable is exported on x86; the Shadow descriptor was pattern-scanned).
- Disable write protection (clear CR0.WP) or remap the page as writable.
- Save the original entry for the target SSN (e.g., NtQueryDirectoryFile, NtEnumerateValueKey).
- Overwrite the entry with a pointer to attacker code.
- The hook calls the original after filtering results, hiding files, registry keys, processes, or network connections.
An illustrative read-only inspection (do not modify) inside a signed test driver:
// On x64 KeServiceDescriptorTable is not exported, so this sketch takes the
// descriptor address as a parameter (for example, resolved from symbols in a lab).
VOID DumpSsdtSizeAndSample(PKSERVICE_TABLE_DESCRIPTOR d)
{
    PULONG table = (PULONG)d->ServiceTable;
    DbgPrint("[SSDT] TableSize = %lu\n", d->TableSize);
    for (ULONG i = 0; i < 4 && i < d->TableSize; i++) {
        LONG enc = (LONG)table[i];                        // signed 32-bit entry
        ULONG_PTR addr = (ULONG_PTR)table + (enc >> 4);   // base + signed offset
        DbgPrint("[SSDT] [%lu] encoded=0x%08x -> 0x%p\n", i, enc, (PVOID)addr);
    }
}
// Reading LSTAR to confirm KiSystemCall64[Shadow]
VOID DumpLstar(VOID)
{
    ULONG64 lstar = __readmsr(0xC0000082);
    DbgPrint("[MSR] IA32_LSTAR = 0x%llx (KiSystemCall64[Shadow])\n", lstar);
}
Live inspection from WinDbg on a kernel-debugged target:
0: kd> dt nt!_KSERVICE_TABLE_DESCRIPTOR nt!KeServiceDescriptorTable
0: kd> dq nt!KeServiceDescriptorTable L4
0: kd> dd nt!KiServiceTable L20
0: kd> u nt!KiServiceTable + (dwo(nt!KiServiceTable) >>> 4) L5
0: kd> rdmsr c0000082
9. PatchGuard (KPP) and Why SSDT Hooking Died
Since the x64 editions of Windows XP and Windows Server 2003 SP1, Kernel Patch Protection periodically validates a set of protected structures, including KiServiceTable, IDT, GDT, MSR_LSTAR, kernel image code sections, and several driver objects. On mismatch, KPP issues bugcheck 0x109 (CRITICAL_STRUCTURE_CORRUPTION). The checks run from randomized timers and contexts to resist disablement.
The practical result:
- SSDT hooking is no longer a viable persistence or hiding primitive on supported 64-bit Windows. Any survival window is short and ends in a BSOD.
- Modern kernel-mode attackers use driver callbacks (PsSetCreateProcessNotifyRoutine, ObRegisterCallbacks, minifilters) rather than SSDT patching, because those are the supported extension points and are not policed by KPP.
- With HVCI/Memory Integrity enabled, even loading the malicious driver is gated: kernel pages cannot be both writable and executable, and unsigned kernel code cannot enter ring 0 at all. The hypervisor enforces this at the EPT level; PatchGuard becomes a second line, not the first.
10. Direct and Indirect Syscalls (Modern Red Team TTPs)
Because KPP closed the kernel-side door, evasion moved into user mode. Many EDRs hook the Nt* stubs in ntdll.dll by overwriting the first bytes with a JMP into their inspection DLL. Two techniques bypass that:
- Direct syscalls. The loader embeds its own mov eax, ssn; syscall; ret stub in attacker memory and calls it instead of ntdll!NtXxx. The hooked ntdll is never touched. SSNs are resolved at runtime (parsing ntdll, sorting Nt* exports by address: the “Hell’s Gate” / “Halo’s Gate” patterns).
- Indirect syscalls. The mov eax, ssn happens in attacker memory, but the syscall instruction itself is reached by jumping to the syscall byte sequence inside ntdll.dll. The kernel-side return address therefore points back into ntdll, matching what legitimate code looks like in stack-walk telemetry.
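From the defender's side, the same stub layout gives a quick way to see which Nt* exports in a process have been hooked. The Python sketch below classifies the first bytes of a stub; `classify_stub` is a hypothetical helper, the clean prologue is the `4C 8B D1 B8` pattern quoted earlier in this post, and the JMP opcodes checked (E9, EB, FF 25) are common inline-hook forms rather than an exhaustive list.

```python
CLEAN_PROLOGUE = b"\x4c\x8b\xd1\xb8"   # mov r10, rcx ; mov eax, imm32


def classify_stub(stub):
    """Classify the first bytes of an Nt* export as clean, hooked, or unknown."""
    if stub[:4] == CLEAN_PROLOGUE:
        ssn = int.from_bytes(stub[4:8], "little")
        return "clean", ssn
    if stub[:1] in (b"\xe9", b"\xeb") or stub[:2] == b"\xff\x25":
        return "hooked", None           # starts with a JMP: likely an inline hook
    return "unknown", None


# Hypothetical stubs: one untouched, one overwritten with a rel32 JMP.
clean = classify_stub(bytes.fromhex("4c8bd1b826000000"))
hooked = classify_stub(bytes.fromhex("e900100000909090"))
```

`clean` comes back as `("clean", 0x26)` and `hooked` as `("hooked", None)`. Running this over every Nt* export of the loaded ntdll and comparing against the on-disk copy shows both where the EDR has placed hooks and whether something has removed them.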
The detection signal flips between the two:
| Technique | What it bypasses | What still sees it |
|---|---|---|
| Direct syscall | ntdll user-mode hooks | Stack walk shows syscall from unbacked / private memory. |
| Indirect syscall | ntdll hooks and naive stack-walk checks | Kernel ETW (Microsoft-Windows-Threat-Intelligence) sees the syscall regardless of where it was issued from. |
ETW-TI is the answer to indirect syscalls: it fires from inside the kernel dispatcher, after the SYSCALL has already landed in KiSystemCall64, so the user-mode evasion is irrelevant.

11. Common Attacker Techniques
| Technique | Description |
|---|---|
| SSDT hook (legacy) | Overwrite KiServiceTable[SSN] to filter results for hiding rootkit artifacts; killed by PatchGuard on x64. |
| Shadow SSDT hook | Same against W32pServiceTable to intercept GUI/keyboard/clipboard syscalls. |
| Direct syscall stub | Embedded mov eax, ssn; syscall in attacker memory to bypass ntdll hooks. |
| Indirect syscall | Jump to the syscall gadget inside ntdll so call stacks look legitimate. |
| Hell’s Gate / Halo’s Gate | Runtime SSN resolution by parsing/sorting Nt* exports in mapped ntdll. |
| Fresh-copy ntdll | Read clean ntdll.dll from disk to re-derive unhooked stubs and SSNs. |
| Heaven’s Gate | Far jump from 32-bit (CS=0x23) to 64-bit (CS=0x33) to execute 64-bit syscalls from a Wow64 process. |
| Driver-based hooking | Where HVCI is off, signed-but-vulnerable drivers (“BYOVD”) are used to write to MSRs or protected pages. |
12. Defensive Strategies & Detection
The detection model has shifted from “watch the SSDT” (PatchGuard already does that) to watch how syscalls are issued from user mode and consume kernel ETW.
Sysmon
| Event ID | Field | Why it matters |
|---|---|---|
| 1 | ParentImage, CommandLine | Baseline; correlates injection target lineage. |
| 10 | GrantedAccess, CallTrace | The CallTrace field is the primary direct-syscall tell: legitimate stacks contain ntdll.dll; direct syscalls show UNKNOWN(...) or RWX private memory regions. |
| 25 | n/a | Process image tampering / hollowing. |
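The CallTrace check is straightforward to prototype. The Python sketch below assumes the usual Sysmon layout, frames separated by `|` with each frame written either as `module+offset` or as `UNKNOWN(address)`; the function name and the sample trace are hypothetical.

```python
def unbacked_frames(call_trace):
    """Return the frames of a Sysmon CallTrace that are not module-backed."""
    frames = [frame.strip() for frame in call_trace.split("|") if frame.strip()]
    return [frame for frame in frames if frame.upper().startswith("UNKNOWN")]


# Hypothetical Event ID 10 CallTrace value.
trace = (
    r"C:\Windows\SYSTEM32\ntdll.dll+9d4c4"
    r"|C:\Windows\System32\KERNELBASE.dll+2c13e"
    r"|UNKNOWN(00000256E8D0F1A3)"
)
suspicious = unbacked_frames(trace)
```

`suspicious` holds the single `UNKNOWN(...)` frame. JIT engines such as .NET and browser script runtimes also produce unbacked frames, so this is a triage signal to combine with TargetImage and GrantedAccess, not a verdict on its own.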
Sigma — direct-syscall NtOpenProcess against LSASS
title: Process Access to LSASS via Direct Syscall (Unbacked Call Stack)
id: 8d0c2a4e-syscall-lsass-unbacked
status: experimental
logsource:
    product: windows
    service: sysmon
detection:
    selection:
        EventID: 10
        TargetImage|endswith: '\lsass.exe'
        GrantedAccess:
            - '0x1010'
            - '0x1410'
            - '0x1fffff'
    unbacked:
        CallTrace|contains:
            - 'UNKNOWN'
            - 'UNKNOWN('
    filter_legit:
        SourceImage|endswith:
            - '\MsMpEng.exe'
            - '\MsSense.exe'
    condition: selection and unbacked and not filter_legit
level: high
tags:
    - attack.credential_access
    - attack.t1003.001
    - attack.t1106
ETW Providers Worth Subscribing To
| Provider | Use |
|---|---|
| Microsoft-Windows-Threat-Intelligence | Kernel ETW provider exposing AllocVm, ProtectVm, MapViewOfSection, ReadVm/WriteVm events. Fires from inside the kernel dispatcher, so direct and indirect syscalls are still visible. Consumer must run as PPL. |
Microsoft-Windows-Kernel-Process | Process and thread creation, image loads. |
Microsoft-Windows-Kernel-Audit-API-Calls | Audits selected Nt API calls (verify against current SDK). |
Audit Policy
- Audit Sensitive Privilege Use — catches SeDebugPrivilege enabling, a near-universal precursor to syscall-based cross-process injection.
- Audit Process Creation with command-line capture.
- Audit Handle Manipulation with object SACLs on lsass.exe.
Hardening
- HVCI / Memory Integrity — single highest-value control. Blocks unsigned and W^X-violating kernel code; defeats BYOVD primitives that try to disable PatchGuard, patch the SSDT, or clear CR0.WP.
- VBS + Credential Guard — keeps LSASS secrets off the path even if a syscall reaches NtOpenProcess.
- KPTI — Meltdown mitigation; also implies KiSystemCall64Shadow is the LSTAR target.
- Driver Signature Enforcement + Microsoft vulnerable-driver blocklist — limits BYOVD options.
- EDR ntdll instrumentation — still valuable as a low-cost filter against commodity malware; layer with kernel ETW for the sophisticated cases.
13. Tools for Syscall and SSDT Analysis
| Tool | Description | Link |
|---|---|---|
| WinDbg | Kernel debugger; resolves nt!KeServiceDescriptorTable, nt!KiServiceTable, reads MSRs via rdmsr. | learn.microsoft.com |
| Process Hacker | Live handle, thread, and module inspection; surfaces RWX private memory regions. | processhacker.sourceforge.io |
| Process Monitor | Boot-time and runtime Nt* activity captured via minifilter. | learn.microsoft.com |
| SysmonView / Sysmon | EID 10 CallTrace, EID 25 telemetry. | learn.microsoft.com |
| HollowsHunter / pe-sieve | Detects unbacked / hollowed / patched modules — strong correlator for direct-syscall loaders. | github.com/hasherezade |
| SwishDbgExt | WinDbg extension with SSDT dumping and decode of the encoded-offset format. | github.com |
| Volatility 3 | Memory forensics; windows.ssdt plugin walks the descriptor and decodes entries. | volatilityfoundation.org |
| j00ru syscall tables | Authoritative per-version SSN reference. | j00ru.vexillium.org |
| SilkETW / SealighterTI | User-friendly consumers for ETW providers including Microsoft-Windows-Threat-Intelligence. | github.com |
14. MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Native API | T1106 | EID 10 CallTrace containing UNKNOWN; ETW-TI AllocVm/ProtectVm from unbacked memory. |
| Process Injection | T1055 | Cross-process NtAllocateVirtualMemory + NtWriteVirtualMemory + NtCreateThreadEx chain via ETW-TI. |
| DLL Injection | T1055.001 | EID 7/8 plus ETW-TI write/protect events into a remote PID. |
| PE Injection | T1055.002 | RWX private allocations followed by remote thread creation. |
| Process Hollowing | T1055.012 | NtUnmapViewOfSection followed by NtWriteVirtualMemory into the primary image base. |
| Rootkit | T1014 | PatchGuard 0x109 bugchecks; SSDT integrity scans in memory forensics. |
| Impair Defenses: Disable/Modify Tools | T1562.001 | Driver loads with revoked or vulnerable signatures; HVCI/DSE violations. |
Summary
- Every Windows syscall is a SYSCALL instruction that lands at KiSystemCall64 via MSR_LSTAR and is dispatched through KiServiceTable using the EAX SSN.
- The SSDT on x64 stores encoded offsets, not raw pointers — base + (entry >> 4) — and the EAX bit 12 selects between the core and Win32k Shadow tables.
- PatchGuard killed SSDT hooking on x64; modern offense has moved to direct and indirect syscalls in user mode and to BYOVD when ring 0 is required.
- HVCI/VBS is the strongest defense against the kernel half; kernel ETW (Microsoft-Windows-Threat-Intelligence) is the strongest defense against direct/indirect syscalls because it fires after the transition.
- Detect with Sysmon EID 10 CallTrace (unbacked memory in the stack), enrich with ETW-TI, and map to MITRE T1106 / T1055 for response.
HAL and Ntoskrnl: The Kernel Core Components
Objective: Understand the architecture and division of labor between hal.dll (the Hardware Abstraction Layer) and ntoskrnl.exe (the NT kernel and Executive), how they are loaded during boot, the structures and routines each exposes, and how defenders inspect, detect tampering against, and harden these Ring 0 core components.
1. HAL and Ntoskrnl Overview
Two binaries sit at the bottom of Windows kernel mode and everything else builds on them. ntoskrnl.exe is the NT kernel plus the Executive — the policy and service layer of the OS. hal.dll is the Hardware Abstraction Layer — a thin platform shim that hides interrupt controllers, bus topology, timers, and DMA behind a uniform interface so the rest of the kernel stays hardware-independent.
| Binary | Full name | Loaded by | Ring |
|---|---|---|---|
| ntoskrnl.exe | NT OS Kernel + Executive | winload.efi | Ring 0 |
| hal.dll | Hardware Abstraction Layer | winload.efi | Ring 0 |
Both reside in %SystemRoot%\System32\. On multiprocessor systems the SMP-aware image ntkrnlmp.exe is selected by the loader and presented as ntoskrnl.exe; modern Windows 10/11 ships only the SMP variant. Verify image identity and signature on a live host with sigcheck, dumpbin /headers, or the WinDbg lm command. The separation exists for portability (HAL absorbs platform differences) and layering (the kernel implements scheduling and policy, not chipset quirks).
2. Boot Handoff: From Bootloader to KiSystemStartup
winload.efi loads ntoskrnl.exe and hal.dll into memory, then transfers control to the kernel entry point KiSystemStartup, passing a pointer to a LOADER_PARAMETER_BLOCK. That structure carries the memory descriptor list, the ARC hardware tree, NLS data, and other boot-time state the kernel needs before it can manage its own memory.
winload.efi
└─ loads ntoskrnl.exe + hal.dll
   └─ ntoskrnl!KiSystemStartup(PLOADER_PARAMETER_BLOCK)
      ├─ HalInitializeProcessor()    ; HAL brings up per-CPU hardware
      ├─ KiInitializeKernel()        ; KPCR/KPRCB, IDT, GDT
      ├─ Executive phase init:
      │    Mm/Ob/Se/Io/Cm/Ps InitSystem()
      └─ PsInitialSystemProcess      ; System process (PID 4)
         └─ Phase 1: smss.exe launched

HAL initializes the processor before the Executive runs a single line of policy code. Secure Boot anchors the chain in firmware: UEFI validates the boot manager, the boot manager validates winload.efi, and winload.efi validates ntoskrnl.exe and hal.dll, so tampering with either binary on disk breaks the boot chain on a properly configured machine.

3. The HAL: Abstracting the Hardware
The HAL translates abstract requests into platform-specific operations: programming the APIC, translating bus-relative addresses, allocating DMA-coherent buffers, and calibrating the stall timer. Drivers and the kernel call HAL routines instead of touching hardware registers directly.
| Routine | Purpose |
|---|---|
| HalGetInterruptVector | Translate a bus IRQ to a system interrupt vector and required IRQL |
| HalTranslateBusAddress | Convert a bus-relative address to a logical address |
| HalAllocateCommonBuffer | Allocate DMA-coherent memory visible to CPU and device |
| KeStallExecutionProcessor | Calibrated busy-wait (HAL-implemented on most platforms) |
| HalRequestSoftwareInterrupt | Request a software interrupt at a given IRQL to trigger DPC delivery |
On modern ACPI systems the HAL is far thinner than in the NT 4 era. Many classic Hal* exports such as HalGetInterruptVector are deprecated; the PnP/ACPI stack and IoConnectInterruptEx now handle interrupt wiring. Since Windows 8, HAL Extensions (halextpcat.dll, halextintc.dll, and similar PE images loaded by HAL itself) carry SoC- and OEM-specific code without replacing the whole HAL.
4. IRQL: The Kernel’s Preemption Ladder
Interrupt Request Level (IRQL) is the central arbitration mechanism shared by HAL and the kernel. The HAL programs the interrupt controller to enforce IRQL in hardware; running at an IRQL masks all interrupts at or below that level on the current CPU.
| IRQL (x64) | Symbolic name | Used for |
|---|---|---|
| 0 | PASSIVE_LEVEL | Normal thread execution |
| 1 | APC_LEVEL | APC delivery; paging allowed |
| 2 | DISPATCH_LEVEL | Scheduler, spin locks; no paging, no blocking |
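| 3–11 | Device IRQLs (DIRQL) | Hardware ISRs |
| 12 | SYNCH_LEVEL | Scheduler synchronization across processors |
| 13 | CLOCK_LEVEL | Clock interrupt |
| 14 | IPI_LEVEL | Inter-processor interrupts |
| 15 | HIGH_LEVEL | NMI, machine check; PROFILE_LEVEL shares this value on x64 |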
The cardinal rule: at DISPATCH_LEVEL or above you may not touch pageable memory or block, because the scheduler and page fault handler cannot run. A driver that dereferences paged-out memory at elevated IRQL produces the classic IRQL_NOT_LESS_OR_EQUAL bug check. Query the current level with KeGetCurrentIrql(). IRQL numeric values are architecture-specific; the table above is the canonical x64 mapping.
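The two rules in this section (masking at or below the current level, and no pageable memory at DISPATCH_LEVEL or above) can be written down as a toy model. This is a sketch for reasoning only; the `Irql` enum and both function names are illustrative and are not kernel APIs.

```python
from enum import IntEnum


class Irql(IntEnum):
    """A few x64 IRQL values taken from the table above."""
    PASSIVE_LEVEL = 0
    APC_LEVEL = 1
    DISPATCH_LEVEL = 2
    CLOCK_LEVEL = 13
    HIGH_LEVEL = 15


def is_masked(current: int, incoming: int) -> bool:
    """An interrupt stays pending while its IRQL is at or below the CPU's."""
    return incoming <= current


def may_touch_pageable(current: int) -> bool:
    """Page faults can only be serviced below DISPATCH_LEVEL."""
    return current < Irql.DISPATCH_LEVEL
```

For example, a CPU holding a spin lock sits at `DISPATCH_LEVEL`: `is_masked(Irql.DISPATCH_LEVEL, Irql.APC_LEVEL)` is true, a device interrupt at level 5 still preempts it, and `may_touch_pageable(Irql.DISPATCH_LEVEL)` is false.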

5. The Kernel Layer (Ke): Scheduling and Synchronization
The Ke layer sits directly above HAL and implements thread scheduling, interrupt and exception dispatch, and the low-level synchronization primitives the rest of the system depends on.
| Routine | What it does |
|---|---|
| KeInitializeSpinLock | Initialize a spin-lock object |
| KeAcquireSpinLock | Raise IRQL to DISPATCH_LEVEL and acquire the lock |
| KeReleaseSpinLock | Release the lock and restore the saved IRQL |
| KeInsertQueueDpc | Queue a Deferred Procedure Call |
| KeWaitForSingleObject | Wait on a dispatcher object (event, mutex, timer, thread) |
| KeSetEvent | Set a kernel event to the signaled state |
Dispatcher objects — events, mutexes, semaphores, timers, threads — share a common DISPATCHER_HEADER carrying Type, SignalState, and WaitListHead. The wait machinery keys off that header. The synchronization pattern below runs at PASSIVE_LEVEL, where blocking is legal:
KEVENT readyEvent;
KeInitializeEvent(&readyEvent, NotificationEvent, FALSE);
// ... another thread eventually calls KeSetEvent(&readyEvent, IO_NO_INCREMENT, FALSE);
NTSTATUS status = KeWaitForSingleObject(
    &readyEvent,   // dispatcher object
    Executive,     // wait reason
    KernelMode,    // processor mode
    FALSE,         // non-alertable
    NULL);         // no timeout

Per-CPU scheduler state lives in the KPCR (Kernel Processor Control Region), reachable via gs:[0] on x64, with an embedded KPRCB holding CurrentThread, NextThread, IdleThread, and the DPC queue.
6. The Executive Layer (Ex and Friends)
The Executive comprises the higher-level managers, each identified by a two-letter prefix. They build on Ke primitives and HAL services.
| Manager | Prefix | Responsibilities |
|---|---|---|
| Object Manager | Ob | Object lifecycle, handles, reference counting |
| Process/Thread Manager | Ps | EPROCESS/ETHREAD creation and teardown |
| Memory Manager | Mm | VAD trees, PTEs, page faults, pool |
| I/O Manager | Io | IRP lifecycle, driver loading |
| Security Reference Monitor | Se | Access checks, tokens, privileges |
| Configuration Manager | Cm | Registry hive management |
| Executive Support | Ex | Pool allocation, lookaside lists, callbacks |
Correct pool usage on modern Windows uses ExAllocatePool2 (the successor to ExAllocatePoolWithTag, deprecated starting Windows 10 build 19041) paired with ExFreePoolWithTag:
// Allocate non-paged pool with a 4-byte tag (read in WinDbg as 'XgAT').
PVOID buffer = ExAllocatePool2(POOL_FLAG_NON_PAGED, 0x1000, 'TAgX');
if (buffer != NULL) {
    // ... use buffer at IRQL <= DISPATCH_LEVEL ...
    ExFreePoolWithTag(buffer, 'TAgX');
}

The Object Manager exposes ObReferenceObjectByHandle to convert a handle into a referenced kernel object pointer — the gateway every component crosses when validating access.
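The tag comment in the pool allocation above ('TAgX' in source, 'XgAT' in the debugger) comes down to byte order. The sketch below reproduces it in user space; `tag_as_stored` is a hypothetical helper, and it assumes the compiler packs the first character of the four-character constant into the most significant byte, as MSVC does.

```python
import struct


def tag_as_stored(literal: str) -> str:
    """Return the four tag bytes in the order they sit in memory.

    The first character of the constant lands in the most significant
    byte; a little-endian CPU then stores the least significant byte
    first, so the characters read back reversed.
    """
    if len(literal) != 4:
        raise ValueError("pool tags are exactly four ASCII characters")
    value = int.from_bytes(literal.encode("ascii"), "big")
    return struct.pack("<I", value).decode("ascii")
```

`tag_as_stored("TAgX")` returns `"XgAT"`, which is why tags are conventionally written backwards in driver source so they read forwards in pool dumps.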
7. Key Kernel Structures
A handful of structures are the backbone of process, thread, and CPU state. Defenders and rootkit authors alike walk these every day.
| Structure | Key fields |
|---|---|
| EPROCESS | UniqueProcessId, ActiveProcessLinks, Token, VadRoot, Peb, ImageFileName[15], ThreadListHead |
| ETHREAD | Cid (CLIENT_ID), ThreadListEntry, Win32StartAddress, embedded KTHREAD |
| KTHREAD | Header (DISPATCHER_HEADER), KernelStack, State, WaitIrql, Teb |
| KPCR | Per-CPU; IRQL, IDT/GDT pointers, pointer to KPRCB |
| KPRCB | CurrentThread, NextThread, IdleThread, DPC queue |
| KDPC | DeferredRoutine, DeferredContext, DpcListEntry |
ActiveProcessLinks is a doubly linked LIST_ENTRY chaining every EPROCESS. The Task Manager view of “all processes” is, at bottom, a walk of this list. That makes it a prime DKOM target: unlinking an EPROCESS hides the process from list-based enumeration while it continues to run and be scheduled — covered in Section 10.
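To see why unlinking hides a process from list walkers without stopping it, the list can be modeled in a few lines. This is a toy simulation with plain Python objects, not kernel code: `ListEntry` stands in for LIST_ENTRY, `Process` for EPROCESS, and the `owner` attribute plays the role of the CONTAINING_RECORD macro. All names are illustrative.

```python
class ListEntry:
    """Stand-in for LIST_ENTRY: forward and backward links only."""

    def __init__(self):
        self.flink = self
        self.blink = self


class Process:
    """Toy EPROCESS: a PID plus its ActiveProcessLinks entry."""

    def __init__(self, pid: int):
        self.pid = pid
        self.links = ListEntry()
        self.links.owner = self  # plays the role of CONTAINING_RECORD


def insert_tail(head: ListEntry, entry: ListEntry) -> None:
    """Append an entry just before the list head."""
    entry.flink, entry.blink = head, head.blink
    head.blink.flink = entry
    head.blink = entry


def unlink(entry: ListEntry) -> None:
    """Make both neighbours skip the entry; the object itself survives."""
    entry.blink.flink = entry.flink
    entry.flink.blink = entry.blink
    entry.flink = entry.blink = entry  # point at itself, no dangling links


def walk(head: ListEntry) -> list[int]:
    """Enumerate PIDs the way a list-based tool would."""
    pids, cur = [], head.flink
    while cur is not head:
        pids.append(cur.owner.pid)
        cur = cur.flink
    return pids
```

After `unlink`, `walk` no longer returns the PID, yet the `Process` object is untouched. That is the gap a scan over raw memory closes, since it finds the object without following the links.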
8. The SSDT and System Call Dispatch
A user-mode SYSCALL instruction transfers Ring 3 → Ring 0 and lands in ntoskrnl!KiSystemCall64. The dispatcher indexes the System Service Dispatch Table via KeServiceDescriptorTable, which points at KiServiceTable (an array of service routine offsets) and KiArgumentTable (argument byte counts). GUI calls into win32k.sys route through the shadow table KeServiceDescriptorTableShadow.
Patching KiServiceTable so a service index points at attacker code is the classic SSDT hook, historically used by rootkits to intercept NtQuerySystemInformation, NtOpenProcess, and similar. On x64 this is exactly the kind of structure modification PatchGuard validates, so SSDT hooking is loud and largely obsolete on modern systems — but understanding the dispatch path is essential for reading both live disassembly and integrity-check telemetry.
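When reading a dumped KiServiceTable it helps to decode entries by hand. The sketch below assumes the x64 encoding, in which each entry is a signed 32-bit value whose upper 28 bits are an offset from the table base and whose low 4 bits count stack-passed arguments, and it assumes bit 12 of the service number selects the shadow table. The function names are illustrative, and any base address fed to it is a placeholder for the value WinDbg reports on your build.

```python
def split_ssn(eax: int) -> tuple[int, int]:
    """Split a system service number into (table selector, table index).

    Selector 0 is the core table, 1 is the Win32k shadow table.
    """
    return (eax >> 12) & 1, eax & 0xFFF


def decode_entry(table_base: int, entry: int) -> tuple[int, int]:
    """Decode one 32-bit table entry into (routine address, stack args)."""
    entry &= 0xFFFFFFFF
    # Reinterpret as signed so routines below the table base decode correctly.
    signed = entry - (1 << 32) if entry & 0x80000000 else entry
    address = (table_base + (signed >> 4)) & 0xFFFFFFFFFFFFFFFF
    return address, entry & 0xF
```

A decoded address that falls outside the ntoskrnl image range reported by `lm m nt` is the signal an integrity scan looks for.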

9. Live Analysis with WinDbg and Volatility
Load Microsoft symbols and the entire layout becomes navigable. List the core modules and dump structures directly:
0: kd> lm m nt                          ; ntoskrnl base, range, symbols
0: kd> lm m hal                         ; hal.dll base and range
0: kd> dt nt!_EPROCESS                  ; full EPROCESS field layout
0: kd> !process 0 0                     ; enumerate processes via ActiveProcessLinks
0: kd> !pcr 0                           ; KPCR for CPU 0
0: kd> !prcb 0                          ; KPRCB: CurrentThread / IdleThread
0: kd> dps nt!KeServiceDescriptorTable  ; SSDT pointer + service count
0: kd> !idt                             ; IDT vectors (HAL-programmed interrupt routing)

For dead-box memory forensics, Volatility 3 reconstructs the same view from a dump and is the natural cross-check against a possibly compromised live host:
# Enumerate processes and loaded kernel modules from a memory image.
vol -f memory.dmp windows.pslist
vol -f memory.dmp windows.modules
# psscan walks pool tags instead of ActiveProcessLinks; a process that
# appears in psscan but NOT in pslist is a candidate DKOM-unlinked process.
vol -f memory.dmp windows.psscan

A delta between windows.pslist (list-based) and windows.psscan (pool-scan-based) is a high-fidelity indicator of ActiveProcessLinks tampering.
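Once both plugin outputs are exported as rows, the delta is a set difference. The sketch below is one way to compute it, under stated assumptions: each row is a dictionary, and the column names `PID` and `ExitTime` are assumed here and should be checked against your Volatility version. `hidden_candidates` is a hypothetical helper name.

```python
def hidden_candidates(pslist: list[dict], psscan: list[dict]) -> list[dict]:
    """Return rows found by pool scanning but absent from the list walk.

    Rows with an exit time are skipped: a pool scan also recovers
    terminated processes, which are legitimately off the active list.
    """
    listed = {row["PID"] for row in pslist}
    return [
        row
        for row in psscan
        if row["PID"] not in listed and not row.get("ExitTime")
    ]
```

Treat the result as a list of leads, not verdicts. Processes that start or exit while the image is being acquired can also produce a mismatch, so confirm each hit against thread and handle evidence.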
10. Common Attacker Techniques
Kernel-core abuse turns on either modifying ntoskrnl structures from a loaded driver or exploiting a vulnerability to reach Ring 0 in the first place.
| Technique | Description |
|---|---|
| SSDT hooking | Patch KiServiceTable entries to intercept syscalls |
| DKOM unlinking | Splice an EPROCESS out of ActiveProcessLinks to hide a process |
| Kernel callback removal | Strip PsSetCreateProcessNotifyRoutine entries to blind EDR |
| BYOVD | Load a vulnerable signed driver to gain a Ring 0 primitive |
| Kernel exploitation | Abuse an ntoskrnl/HAL bug to escalate Ring 3 → Ring 0 |
| In-memory image patch | Patch ntoskrnl.exe code pages at runtime |
A malicious driver is still loaded through the documented path — a Services registry key of Type = 1 followed by a load — which is exactly where detection begins. Bring-Your-Own-Vulnerable-Driver remains popular precisely because it sidesteps the need to find a fresh kernel bug.
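Because the load path runs through the Services key, a first-pass hunt can work on an exported snapshot of that key. The sketch below is a minimal filter, assuming the snapshot has already been read into a dictionary of service name to value dictionary. The input shape and function names are hypothetical, and the two "expected" directories are only a starting baseline.

```python
SERVICE_KERNEL_DRIVER = 0x1  # documented service Type value for kernel drivers

# Lower-cased path fragments treated as expected driver locations.
EXPECTED_DIRS = ("system32\\drivers\\", "system32\\driverstore\\")


def is_kernel_driver(values: dict) -> bool:
    """True when the service entry declares Type = 1."""
    return values.get("Type") == SERVICE_KERNEL_DRIVER


def outside_expected_dirs(image_path: str) -> bool:
    """True when ImagePath does not point into a usual driver directory."""
    path = image_path.lower().replace("/", "\\")
    return not any(fragment in path for fragment in EXPECTED_DIRS)


def suspicious_driver_services(services: dict[str, dict]) -> list[str]:
    """Names of Type = 1 services whose image lives somewhere unusual.

    Entries with no ImagePath are not flagged here; review them by hand.
    """
    return sorted(
        name
        for name, values in services.items()
        if is_kernel_driver(values)
        and values.get("ImagePath")
        and outside_expected_dirs(values["ImagePath"])
    )
```

A path filter like this only narrows the search: a vulnerable signed driver dropped into `System32\drivers` passes it, so pair the result with signature and hash checks.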

11. Defensive Strategies & Detection
Detection centers on driver loads, integrity events, and kernel structure cross-checks.
| Sysmon Event ID | Name | Relevance |
|---|---|---|
| 6 | Driver Loaded | Kernel driver load with Signed, Hashes, Signature fields |
| 7 | Image Loaded | Module loads in unusual contexts |
| 13 | Registry Value Set | New Services driver entries |
Pair Sysmon with Windows event sources: System Event ID 7045 (a new service was installed; filter on the kernel-mode driver service type), Security Event ID 5038 (image hash invalid — DSE failure), and Event ID 6281 (page hash mismatch). The Microsoft-Windows-Kernel-Memory ETW provider surfaces pool allocations useful for hunting pool-based implants.
title: Suspicious Unsigned Kernel Driver Load
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 6
    Signed: 'false'
  filter_legit:
    ImageLoaded|startswith:
      - 'C:\Windows\System32\drivers\'
      - 'C:\Windows\System32\DriverStore\'
  condition: selection and not filter_legit
level: high

| Mechanism | Description |
|---|---|
| PatchGuard (KPP) | Validates SSDT, IDT, GDT, KPCR, and kernel code; bug check 0x109 on tampering |
| Driver Signature Enforcement | ci.dll requires Authenticode-signed drivers |
| HVCI | VTL1 enforces signed Ring 0 code; blunts BYOVD and runtime patching |
| Secure Boot | Firmware-anchored signature chain: boot manager → winload → ntoskrnl/hal |
Operational hardening: enable HVCI (Core Isolation → Memory Integrity), confirm Secure Boot in msinfo32, audit SeLoadDriverPrivilege use, deploy the Microsoft Vulnerable Driver Blocklist (DriverSiPolicy.p7b), monitor HKLM\SYSTEM\CurrentControlSet\Services\ for new Type = 1 entries, and baseline loaded-module hashes against periodic WinPmem/Volatility snapshots.
12. MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Rootkit | T1014 | Volatility pslist/psscan delta; PatchGuard bug check 0x109 |
| Kernel Modules and Extensions | T1547.006 | Sysmon EID 6; Event ID 7045; Services key writes |
| Exploitation for Privilege Escalation | T1068 | Crash telemetry, anomalous Ring 0 transitions |
| Impair Defenses: Disable or Modify Tools | T1562.001 | Missing kernel callbacks; EDR self-protection alerts |
| Process Injection | T1055 | Kernel KeStackAttachProcess/MmCopyVirtualMemory use |
| Modify System Image: Patch System Image | T1601.001 | Code integrity Event ID 5038/6281; PatchGuard |
13. Tools for Kernel Analysis
| Tool | Description | Link |
|---|---|---|
| WinDbg | Live and dump kernel debugging, structure walks | microsoft.com |
| Volatility 3 | Memory forensics, pslist/psscan/modules | volatilityfoundation.org |
| WinPmem | Live memory acquisition | github.com |
| Process Hacker | Driver and handle inspection | processhacker.sourceforge.io |
| Sysmon | Driver-load and registry telemetry | sysinternals.com |
| sigcheck | Image signature and hash verification | sysinternals.com |
| Ghidra | Static analysis of drivers and ntoskrnl | ghidra-sre.org |
14. Summary
- HAL and ntoskrnl are the two Ring 0 binaries every other Windows component is built on — HAL abstracts hardware, ntoskrnl implements the kernel and Executive policy layers.
- The kernel layer (Ke) supplies scheduling and synchronization; the Executive (Ob, Ps, Mm, Io, Se, Cm, Ex) builds managers on top, all arbitrated by IRQL that the HAL enforces in hardware.
- Core structures — EPROCESS, ETHREAD, KPCR, the SSDT — are the backbone of process and CPU state and the prime targets for SSDT hooks, DKOM unlinking, and callback removal.
- Detect kernel tampering via Sysmon Event ID 6, Event IDs 7045/5038/6281, and Volatility pslist-vs-psscan deltas; prevent it with HVCI, DSE, Secure Boot, and the vulnerable-driver blocklist.