Access Tokens and Privileges: The Kernel’s Security Context

Run whoami /priv in an elevated admin shell. You’ll see a column labeled State, and most of the entries, SeDebugPrivilege among them, read Disabled. They aren’t missing. They’re sitting in the token, dormant, waiting for a single flag flip. (A few, such as SeChangeNotifyPrivilege and SeImpersonatePrivilege, are typically enabled by default.) That one column captures most Windows post-exploitation privilege tradecraft: not forging anything, just enabling what was already issued.

Objective: Understand how Windows builds and enforces a per-process security context through the access token, how the Security Reference Monitor uses that token on every object access, and which token operations defenders need to see to catch impersonation, theft, and privilege enablement.

1. Why Tokens Exist

When you authenticate, LSASS (lsass.exe) creates a logon session, derives a primary access token from that session, and hands it to whatever process is being started for you — userinit.exe, then explorer.exe. From that point forward, every kernel object you touch — files, registry keys, named pipes, processes, threads — is evaluated against that token by the Security Reference Monitor (SRM).

The SRM lives in the kernel and does one job: when a thread asks for access to an object, compare the thread’s effective token to the object’s security descriptor and return a yes/no. That comparison happens in SeAccessCheck (kernel) and is surfaced to user mode as AccessCheck. The order matters — Integrity Level check → DACL check → Privilege check.

Without a token, the kernel has no answer to “who is this thread, and what is it allowed to do?” Tokens aren’t a wrapper around credentials. They are the runtime identity.

Flow diagram showing LSASS authentication creating a logon session, deriving a primary token, attaching it to a process, and the Security Reference Monitor performing SeAccessCheck in order: Integrity Level, DACL, Privilege. — From authentication to access decision: the primary token is the runtime identity the SRM consults on every object request.

2. Inside `nt!_TOKEN`

The kernel object is nt!_TOKEN. It’s undocumented — Microsoft exposes Win32 wrappers, not field layouts — but you can inspect it on your own build:

0: kd> dt nt!_TOKEN

The layout shifts between Windows versions, so never hardcode offsets. The fields that matter conceptually are stable:

Field	Purpose
`TokenId`	`LUID` uniquely identifying this token instance
`AuthenticationId`	`LUID` of the originating logon session
`TokenType`	`TokenPrimary` (1) or `TokenImpersonation` (2)
`ImpersonationLevel`	Only meaningful for impersonation tokens
`UserAndGroups`	Array of `SID_AND_ATTRIBUTES` — user SID plus group SIDs
`Privileges`	`SEP_TOKEN_PRIVILEGES` — three 64-bit privilege bitmasks
`IntegrityLevelIndex`	Index into `UserAndGroups` pointing at the mandatory label
`LogonSession`	Pointer to `SEP_LOGON_SESSION_REFERENCES`
`DefaultDacl`	DACL applied to objects this token creates
`SessionId`	RDP / Terminal Services session ID

The Privileges member is worth dwelling on. SEP_TOKEN_PRIVILEGES carries three 64-bit bitmasks — Present, Enabled, and EnabledByDefault — and that three-state design is the entire reason “privilege escalation” can be a one-API-call affair (covered in §6). This layout is community-observed via WinDbg and ReactOS source; treat it as undocumented and verify on your target build.

Hierarchy diagram of the nt!_TOKEN kernel structure, branching into Identity fields, Type and Impersonation Level, UserAndGroups SID array, SEP_TOKEN_PRIVILEGES with three bitmasks, Integrity Level index, and Logon Session pointer. — The nt!_TOKEN structure: the three-bitmask SEP_TOKEN_PRIVILEGES field (Present, Enabled, EnabledByDefault) is the mechanism behind most privilege-escalation tradecraft.

3. Primary vs. Impersonation Tokens

Every process has exactly one primary token, assigned at process creation and, for practical purposes, fixed once the process is running. (Replacing the token of a process that has not started executing is what SeAssignPrimaryTokenPrivilege gates; see §6.) To run code under a different identity, you normally start a new process with a different token (CreateProcessAsUser, CreateProcessWithTokenW).

Threads are different. A thread can carry an impersonation token that temporarily overrides the process’s primary token for that thread only. This is how RPC servers, named-pipe servers, and IIS worker threads handle requests on behalf of multiple callers without spawning a process each time. The kernel tracks it per thread in _ETHREAD (the ClientSecurity field on recent builds, ImpersonationInfo on older ones; confirm with dt nt!_ETHREAD on your build); SeAccessCheck prefers the thread token over the process token if one is present.

The distinction matters at detection time too. OpenProcessToken returns the primary token; OpenThreadToken returns the impersonation token, if any. A thread calling OpenThreadToken and getting ERROR_NO_TOKEN is normal — most threads aren’t impersonating. A thread calling it and getting SYSTEM is not.

Graph diagram contrasting a process primary token stored in _EPROCESS with a per-thread impersonation token tracked in _ETHREAD, showing the SRM preferring the thread token when present. — The SRM always prefers a thread’s impersonation token over the process primary token, making per-thread identity the key primitive for RPC and pipe servers.

4. Integrity Levels and Mandatory Integrity Control

Mandatory Integrity Control (MIC) added a sideband label to the token and a corresponding mandatory label ACE in object SACLs. Five well-known integrity SIDs cover the practical range:

SID	Level	Typical Use
`S-1-16-0`	Untrusted	Heavily sandboxed code
`S-1-16-4096`	Low	Browser renderers, AppContainer
`S-1-16-8192`	Medium	Default for interactive user processes
`S-1-16-12288`	High	Elevated (post-UAC) admin processes
`S-1-16-16384`	System	`SYSTEM`-account services and kernel components

The label sits in UserAndGroups at index IntegrityLevelIndex, retrievable from user mode via GetTokenInformation(..., TokenIntegrityLevel, ...) into a TOKEN_MANDATORY_LABEL. MIC’s default enforcement rule (no-write-up) is simple: a process at a lower integrity level cannot write to or modify a higher-integrity object, even when the DACL would otherwise allow it, which is exactly the situation for two processes running as the same user. No DLL injection, no token impersonation up the chain. That single rule is what stops a Medium-IL Word process from injecting into a High-IL elevated PowerShell.
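As a rough illustration of that rule, the sketch below models the no-write-up comparison in portable C. It is a simplified model, not the kernel's logic: the names integrity_name and mic_allows_write are made up for this example, RIDs that fall between two well-known levels are reported as the lower level, and the per-object policy bits in the mandatory label ACE are ignored.

```c
#include <stdint.h>

/* Well-known integrity RIDs (last sub-authority of the S-1-16-* SIDs above). */
#define IL_UNTRUSTED 0x0000u
#define IL_LOW       0x1000u
#define IL_MEDIUM    0x2000u
#define IL_HIGH      0x3000u
#define IL_SYSTEM    0x4000u

/* Map an integrity RID to a coarse level name (hypothetical helper). */
const char *integrity_name(uint32_t rid) {
    if (rid >= IL_SYSTEM) return "System";
    if (rid >= IL_HIGH)   return "High";
    if (rid >= IL_MEDIUM) return "Medium";
    if (rid >= IL_LOW)    return "Low";
    return "Untrusted";
}

/* No-write-up model: a write is allowed only if the caller's level
 * is at least the object's level. Returns 1 for allowed, 0 for blocked. */
int mic_allows_write(uint32_t caller_rid, uint32_t object_rid) {
    return caller_rid >= object_rid;
}
```

With these definitions, mic_allows_write(IL_MEDIUM, IL_HIGH) returns 0, which is the Word-into-elevated-PowerShell case from the paragraph above.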

5. Reading a Token from User Mode

The minimum useful query: open the token, ask for the user SID, print it.

// Requires <windows.h>, <sddl.h>, <stdio.h>; link against advapi32.
HANDLE hToken = NULL;
if (!OpenProcessToken(GetCurrentProcess(), TOKEN_QUERY, &hToken)) {
    return GetLastError();
}

// Sizing call: expected to fail with ERROR_INSUFFICIENT_BUFFER and fill in cbUser.
DWORD cbUser = 0;
GetTokenInformation(hToken, TokenUser, NULL, 0, &cbUser);
PTOKEN_USER pUser = (PTOKEN_USER)LocalAlloc(LPTR, cbUser);

if (pUser && GetTokenInformation(hToken, TokenUser, pUser, cbUser, &cbUser)) {
    LPWSTR sidStr = NULL;
    if (ConvertSidToStringSidW(pUser->User.Sid, &sidStr)) {
        wprintf(L"User SID: %ls\n", sidStr);
        LocalFree(sidStr);
    }
}

LocalFree(pUser);      // LocalFree(NULL) is a harmless no-op
CloseHandle(hToken);

The same GetTokenInformation call with TokenGroups returns a TOKEN_GROUPS you can walk to see which groups are SE_GROUP_ENABLED, SE_GROUP_MANDATORY, or SE_GROUP_INTEGRITY (that last flag is how you find the IL label without parsing the index). TokenPrivileges returns a TOKEN_PRIVILEGES and feeds the next section.

For integrity level specifically:

DWORD cb = 0;
GetTokenInformation(hToken, TokenIntegrityLevel, NULL, 0, &cb);   // sizing call
PTOKEN_MANDATORY_LABEL pLabel = (PTOKEN_MANDATORY_LABEL)LocalAlloc(LPTR, cb);

DWORD rid = 0;
if (pLabel && GetTokenInformation(hToken, TokenIntegrityLevel, pLabel, cb, &cb)) {
    // The integrity RID is the last sub-authority of the label SID.
    rid = *GetSidSubAuthority(
        pLabel->Label.Sid,
        (DWORD)(UCHAR)(*GetSidSubAuthorityCount(pLabel->Label.Sid) - 1));
}
LocalFree(pLabel);

// rid == 0x1000 (4096)  -> Low
// rid == 0x2000 (8192)  -> Medium
// rid == 0x3000 (12288) -> High
// rid == 0x4000 (16384) -> System

6. Privileges: Present, Enabled, Removed

A privilege has three independent states inside the token:

Present — the privilege exists in the token. Cannot be added at runtime by user mode.
Enabled — the privilege is currently active for access checks.
Removed — once a privilege is removed via SE_PRIVILEGE_REMOVED, it’s gone for the life of the token.

AdjustTokenPrivileges can enable, disable, or permanently remove privileges that are already in the token. It cannot grant a privilege the token never had. So when a tool “enables SeDebugPrivilege,” it isn’t gaining authority; that authority was issued at logon and was waiting in the Present bitmask. The enable is purely a flag flip.

HANDLE hToken = NULL;
LUID  luid;
TOKEN_PRIVILEGES tp = {0};

if (!OpenProcessToken(GetCurrentProcess(),
                      TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY,
                      &hToken)) {
    return GetLastError();
}

if (LookupPrivilegeValueW(NULL, SE_DEBUG_NAME, &luid)) {
    tp.PrivilegeCount           = 1;
    tp.Privileges[0].Luid       = luid;
    tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;

    // A TRUE return only means the call ran; GetLastError() carries the real outcome.
    if (AdjustTokenPrivileges(hToken, FALSE, &tp, sizeof(tp), NULL, NULL) &&
        GetLastError() == ERROR_NOT_ALL_ASSIGNED) {
        // Privilege wasn't Present in the token -> not actually enabled.
    }
}

CloseHandle(hToken);

That ERROR_NOT_ALL_ASSIGNED check is the gotcha most first-timers miss: AdjustTokenPrivileges returns TRUE even when the privilege isn’t in Present. The real outcome is only visible through GetLastError. I’ve burned a solid afternoon staring at a “successful” call that did nothing because the calling process was unelevated and SeDebugPrivilege was never issued in the first place.
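To make the flag-flip idea concrete, here is a small portable C model of the three bitmasks. It is a sketch under assumptions, not the kernel's code: the struct and function names are invented for this example, each privilege is assumed to occupy the bit given by the low part of its LUID, and the two numeric values (20 for SeDebugPrivilege, 29 for SeImpersonatePrivilege) are the ones commonly listed in the WDK headers, so verify them before relying on them.

```c
#include <stdint.h>

/* Assumed LUID low parts; used here only as bit positions (must be < 64). */
#define SE_DEBUG_PRIVILEGE       20
#define SE_IMPERSONATE_PRIVILEGE 29

/* Toy stand-in for the Present / Enabled / EnabledByDefault bitmasks. */
typedef struct {
    uint64_t present;
    uint64_t enabled;
    uint64_t enabled_by_default;
} TokenPrivileges;

static uint64_t priv_bit(unsigned luid_low) {
    return (uint64_t)1 << luid_low;
}

int privilege_enabled(const TokenPrivileges *t, unsigned luid_low) {
    return (t->enabled & priv_bit(luid_low)) != 0;
}

/* Enable succeeds only if the bit is already Present.
 * Returns 1 on success, 0 otherwise (the ERROR_NOT_ALL_ASSIGNED case). */
int enable_privilege(TokenPrivileges *t, unsigned luid_low) {
    uint64_t bit = priv_bit(luid_low);
    if ((t->present & bit) == 0) {
        return 0;               /* never issued: nothing to flip */
    }
    t->enabled |= bit;
    return 1;
}

/* Removal clears the bit everywhere; it cannot be re-enabled afterwards. */
void remove_privilege(TokenPrivileges *t, unsigned luid_low) {
    uint64_t keep = ~priv_bit(luid_low);
    t->present            &= keep;
    t->enabled            &= keep;
    t->enabled_by_default &= keep;
}
```

Note that the model has no function that sets a bit in present. That omission is the point: user mode gets enable, disable, and remove, but never grant.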

The privileges worth keeping at the top of a defender’s list:

Privilege	Why It Matters
`SeDebugPrivilege`	Open any process, including LSASS, for read/write
`SeImpersonatePrivilege`	Precondition for the Potato family of escalations
`SeAssignPrimaryTokenPrivilege`	Replace a process’s primary token
`SeTcbPrivilege`	“Act as part of the OS” — essentially unrestricted
`SeLoadDriverPrivilege`	Load arbitrary kernel drivers → BYOVD
`SeBackupPrivilege` / `SeRestorePrivilege`	Read/write any file regardless of DACL
`SeTakeOwnershipPrivilege`	Seize ownership of any object
`SeCreateTokenPrivilege`	Forge tokens directly — held only by `SYSTEM`

7. Impersonation in Depth

SECURITY_IMPERSONATION_LEVEL defines how far the impersonating thread can act on behalf of the original principal:

Level	Meaning
`SecurityAnonymous`	Server cannot identify or impersonate the client
`SecurityIdentification`	Server can identify but not act as the client
`SecurityImpersonation`	Server can act as the client on the local machine
`SecurityDelegation`	Server can act as the client on local and remote systems

The canonical sequence for a service impersonating a caller:

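// hSourceToken: a token handle already opened with TOKEN_DUPLICATE access.
HANDLE hClient = NULL;
if (!DuplicateTokenEx(hSourceToken,
                      TOKEN_QUERY | TOKEN_IMPERSONATE,   // only what SetThreadToken needs
                      NULL,
                      SecurityImpersonation,
                      TokenImpersonation,
                      &hClient)) {
    return GetLastError();
}

// Never do the client's work unless impersonation actually succeeded.
if (SetThreadToken(NULL, hClient)) {   // current thread now runs as the client
    // ... perform the work that requires the client's identity ...
    RevertToSelf();                    // back to the process's primary token
}
CloseHandle(hClient);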

SECURITY_QUALITY_OF_SERVICE, supplied by the client when it connects, controls whether impersonation tracks the client’s context statically or dynamically, and whether only the enabled privileges follow (EffectiveOnly). That last flag is one of the more interesting defensive levers: a client that connects with EffectiveOnly = TRUE keeps its dormant privileges out of the server’s impersonation context entirely.

8. Duplication, `LogonUser`, and Process Creation Under a Token

Three primitives cover most of the “run something as someone else” surface:

DuplicateTokenEx — clone an existing token, optionally upgrading from impersonation to primary type. Requires TOKEN_DUPLICATE on the source.
LogonUser — authenticate a username/password and receive a fresh primary token tied to a new logon session.
CreateProcessWithTokenW — start a new process whose primary token is the one you pass in. Requires SeImpersonatePrivilege on the caller.

The MITRE taxonomy splits the abuse cleanly along these primitives:

T1134.001 — Token Impersonation/Theft. OpenProcessToken against a higher-privileged process, DuplicateTokenEx, then ImpersonateLoggedOnUser or SetThreadToken. No credentials needed; you steal what’s already running.
T1134.002 — Create Process with Token. Same theft, but you go straight to CreateProcessWithTokenW to start a new process under the stolen identity rather than impersonating on a thread.
T1134.003 — Make and Impersonate Token. LogonUser with credentials in hand, then SetThreadToken. Quieter than theft because the resulting logon looks legitimate — but it generates a 4624 you can see.

Flow diagram mapping token abuse primitives: OpenProcessToken feeding DuplicateTokenEx which branches to thread impersonation (T1134.001) or CreateProcessWithTokenW (T1134.002), and LogonUser feeding SetThreadToken (T1134.003). — The three MITRE T1134 sub-techniques map directly onto three token API primitives — theft via duplication, new process under stolen token, or fresh token from explicit credentials.

9. `_EPROCESS.Token` and Kernel-Mode Abuse

The kernel’s view of a process’s primary token is the Token field in _EPROCESS, an EX_FAST_REF — a pointer with reference-count bits packed into the low bits. A kernel exploit with arbitrary write can overwrite that field with a pointer to the SYSTEM process’s token, instantly upgrading the attacker’s process to SYSTEM without touching any user-mode API.

Walking it in WinDbg looks like this:

0: kd> !process 0 0 explorer.exe
PROCESS ffffba0c1a5f6080 ...
0: kd> dt nt!_EPROCESS ffffba0c1a5f6080 Token
   +0x4b8 Token : _EX_FAST_REF
0: kd> dt nt!_TOKEN (poi(ffffba0c1a5f6080+0x4b8) & 0xfffffffffffffff0)

The offset will not be 0x4b8 on your build. Use dt to find it on the system you’re analyzing.
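The masking step in that last command is easy to get wrong, so here it is as a tiny portable C sketch. It assumes an x64 target, where the low 4 bits of an EX_FAST_REF hold the cached reference count; the function names and the sample value in the usage note are made up for illustration.

```c
#include <stdint.h>

/* On x64 the low 4 bits of an EX_FAST_REF are reference-count bits (assumption:
 * 32-bit builds use fewer bits, so this mask is x64-only). */
#define FAST_REF_MASK 0xFull

/* Recover the real object pointer by clearing the reference-count bits. */
uint64_t fast_ref_object(uint64_t raw) {
    return raw & ~FAST_REF_MASK;
}

/* Extract the cached reference count packed into the low bits. */
unsigned fast_ref_count(uint64_t raw) {
    return (unsigned)(raw & FAST_REF_MASK);
}
```

For a raw field value of 0xffffa30f5c2d906f (an invented example), fast_ref_object yields 0xffffa30f5c2d9060 and fast_ref_count yields 15. Dereferencing the raw value without masking lands a few bytes into the token and produces garbage fields.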

For defenders, the operational takeaway is that kernel-mode token swapping leaves no user-mode footprint — no AdjustTokenPrivileges, no OpenProcessToken, no 4703. The detection has to shift earlier: catch the driver load (SeLoadDriverPrivilege use, signed-driver loader events) or the exploit’s user-mode loader, because by the time the swap happens your audit pipeline is blind to it.

10. Detection and Defense

Token abuse leaves observable traces across the Security log, Sysmon, and ETW. Pick the events that match the primitive you’re hunting.

Windows Security Audit Events

Event ID	Name	What It Tells You
`4624`	Successful logon	New logon session and primary token; check `LogonType`
`4648`	Logon with explicit credentials	`runas`, `CreateProcessWithLogonW`, lateral movement
`4672`	Special privileges assigned to new logon	Sensitive privileges granted at session start
`4673`	Privileged service called	Use of sensitive privilege
`4688`	New process created	Includes `TokenElevationType` (`%%1936` default, `%%1937` full/elevated, `%%1938` limited)
`4703`	User right adjusted	`AdjustTokenPrivileges` calls — the core privilege-enable signal

4672 is high-value: it fires once per privileged logon and lists the sensitive privileges assigned. Filter out the well-known principals (LOCAL SYSTEM, NETWORK SERVICE, LOCAL SERVICE) and expected admins. What’s left is worth a look — that’s where Mimikatz-style pass-the-hash and elevation activity surfaces.

Sysmon

EID 1 (Process Create) — IntegrityLevel and User fields directly show the process’s effective token. A child of a Medium-IL process suddenly running at System integrity is a hard signal.
EID 10 (ProcessAccess) — OpenProcess against LSASS or other high-value targets. Watch GrantedAccess masks like 0x1400 (PROCESS_QUERY_INFORMATION | PROCESS_QUERY_LIMITED_INFORMATION) and 0x40 (PROCESS_DUP_HANDLE).
EID 8 (CreateRemoteThread) — cross-process injection that frequently follows token theft.
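GrantedAccess values in EID 10 are raw bitmasks, so triage usually starts by decoding them. The following portable C sketch does that for the process-specific rights; the table values are the PROCESS_* constants from winnt.h, while the function name decode_process_access and the pipe-separated output format are choices made for this example. Standard rights such as SYNCHRONIZE or READ_CONTROL are deliberately not decoded.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

static const struct { uint32_t bit; const char *name; } kProcessRights[] = {
    { 0x0001, "PROCESS_TERMINATE" },
    { 0x0002, "PROCESS_CREATE_THREAD" },
    { 0x0008, "PROCESS_VM_OPERATION" },
    { 0x0010, "PROCESS_VM_READ" },
    { 0x0020, "PROCESS_VM_WRITE" },
    { 0x0040, "PROCESS_DUP_HANDLE" },
    { 0x0080, "PROCESS_CREATE_PROCESS" },
    { 0x0100, "PROCESS_SET_QUOTA" },
    { 0x0200, "PROCESS_SET_INFORMATION" },
    { 0x0400, "PROCESS_QUERY_INFORMATION" },
    { 0x0800, "PROCESS_SUSPEND_RESUME" },
    { 0x1000, "PROCESS_QUERY_LIMITED_INFORMATION" },
};

/* Writes the matching right names into buf, separated by '|'.
 * Returns how many names were written; stops early if buf is too small. */
size_t decode_process_access(uint32_t mask, char *buf, size_t buflen) {
    size_t hits = 0, used = 0;
    if (buf == NULL || buflen == 0) return 0;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof kProcessRights / sizeof kProcessRights[0]; i++) {
        if ((mask & kProcessRights[i].bit) == 0) continue;
        int n = snprintf(buf + used, buflen - used, "%s%s",
                         hits ? "|" : "", kProcessRights[i].name);
        if (n < 0 || (size_t)n >= buflen - used) {
            buf[used] = '\0';   /* drop the partially written name */
            break;
        }
        used += (size_t)n;
        hits++;
    }
    return hits;
}
```

Decoding 0x1400 with this table gives PROCESS_QUERY_INFORMATION|PROCESS_QUERY_LIMITED_INFORMATION, matching the mask called out in the list above.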

Sigma Sketch: Privilege Enable on a Sensitive Right

title: Sensitive Privilege Adjusted via AdjustTokenPrivileges
logsource:
  product: windows
  service: security
detection:
  selection:
    EventID: 4703
    EnabledPrivilegeList|contains:
      - 'SeDebugPrivilege'
      - 'SeImpersonatePrivilege'
      - 'SeTcbPrivilege'
      - 'SeLoadDriverPrivilege'
  filter_known:
    SubjectUserSid:
      - 'S-1-5-18'   # LOCAL SYSTEM
      - 'S-1-5-19'   # LOCAL SERVICE
      - 'S-1-5-20'   # NETWORK SERVICE
  condition: selection and not filter_known
level: high

To produce 4703, the Audit Token Right Adjusted subcategory has to be enabled — it isn’t by default on most builds. Same goes for Audit Sensitive Privilege Use for 4673/4674, and command-line logging in 4688 (Group Policy: System → Audit Process Creation → Include command line).

ETW Providers

Provider	What It Carries
`Microsoft-Windows-Security-Auditing`	All audit events above
`Microsoft-Windows-Kernel-Process`	Process/thread lifecycle including token assignment
`Microsoft-Windows-Threat-Intelligence`	High-fidelity process-access telemetry; PPL consumer only (Defender/EDR)

Hardening

SeCreateTokenPrivilege → SYSTEM only. Nothing else needs it.
SeAssignPrimaryTokenPrivilege → local/network service accounts only. Audit anything else holding it.
Strip SeImpersonatePrivilege from service accounts that don’t host RPC or named-pipe endpoints. Its presence is the precondition for the Potato family.
PPL for critical services — blocks OpenProcess with token-access rights from unprotected callers.
Credential Guard isolates logon-session secrets in VSM.

SIDs and Security Descriptors: Identity in Windows Security

A thread opens a handle to a file. Before a single byte is read, the kernel has already answered a question nobody typed: is the caller’s identity allowed to do this? That answer lives at the intersection of two structures — the SID that names who you are, and the security descriptor that says who gets in. Get the relationship between them wrong and you ship a world-writable service. Understand it, and most “weird permission” incidents stop being mysterious.

Objective: Understand how Windows represents identity with Security Identifiers, how Security Descriptors bind owners, DACLs, and SACLs to every securable object, and how attackers abuse — and defenders detect — manipulation of both.

1. Identity Before Access

Windows authenticates security principals — anything the OS can prove an identity for: users, groups, computers, and service accounts. Authentication is the LSA’s job; the SAM (local) or the domain’s NTDS.dit (Active Directory) stores the account records. But authentication only proves who you are. Authorization — what you may touch — is a separate decision made against a different value: the SID.

A SID is the canonical, machine-readable name for a principal. Display names change. SAM account names get reused. SIDs do not. Once the system mints a SID at account-creation time, that value is never reused to identify another principal, even after the account is deleted. Every authorization check in the OS compares SIDs, never names.

2. Anatomy of a SID

A SID is a variable-length binary structure, defined as SID in winnt.h. Three logical parts: a revision, the issuing authority, and a chain of sub-authorities ending in a Relative Identifier (RID).

Field	Type	Meaning
`Revision`	`BYTE`	SID structure version — always `1`
`SubAuthorityCount`	`BYTE`	Number of sub-authority values (max 15)
`IdentifierAuthority`	`SID_IDENTIFIER_AUTHORITY`	6-byte top-level authority that issued the SID
`SubAuthority[]`	`DWORD[]`	Sub-authority values; the last element is the RID

The string notation everyone recognizes is just those fields, hyphenated. Take S-1-5-21-<d1>-<d2>-<d3>-513:

S-1 — a revision-1 SID.
5 — SECURITY_NT_AUTHORITY, marking it a Windows NT SID.
21 — SECURITY_NT_NON_UNIQUE, signaling that a domain identifier follows.
<d1>-<d2>-<d3> — three 32-bit values randomly generated to uniquely identify the domain.
513 — the RID; here, the well-known RID for Domain Users.
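The breakdown above can be reproduced with a few lines of portable C. This is a minimal sketch that works on the string form only, without any Windows API; the ParsedSid type and both function names are invented for this example, and it only handles decimal authorities (the hex form used for very large authority values is rejected).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SUBAUTH 15   /* SubAuthorityCount limit from the table above */

typedef struct {
    unsigned revision;
    uint64_t authority;            /* 48-bit identifier authority */
    unsigned sub_count;
    uint32_t sub[MAX_SUBAUTH];     /* last entry is the RID */
} ParsedSid;

static int is_digit(char c) { return c >= '0' && c <= '9'; }

/* Parse "S-R-A-S1-S2-..." into its fields. Returns 1 on success, 0 if malformed. */
int parse_sid_string(const char *s, ParsedSid *out) {
    char *end;
    if (s == NULL || out == NULL || strncmp(s, "S-", 2) != 0) return 0;
    memset(out, 0, sizeof *out);

    s += 2;
    if (!is_digit(*s)) return 0;
    out->revision = (unsigned)strtoul(s, &end, 10);
    if (*end != '-') return 0;

    s = end + 1;
    if (!is_digit(*s)) return 0;
    out->authority = strtoull(s, &end, 10);

    while (*end == '-') {
        s = end + 1;
        if (!is_digit(*s) || out->sub_count == MAX_SUBAUTH) return 0;
        unsigned long v = strtoul(s, &end, 10);
        if (v > 0xFFFFFFFFul) return 0;      /* sub-authorities are 32-bit */
        out->sub[out->sub_count++] = (uint32_t)v;
    }
    return *end == '\0';
}

/* The RID is simply the last sub-authority (0 if there are none). */
uint32_t sid_rid(const ParsedSid *sid) {
    return sid->sub_count ? sid->sub[sid->sub_count - 1] : 0;
}
```

Feeding it a Domain Users SID with made-up domain values, such as S-1-5-21-1004336348-1177238915-682003330-513, yields revision 1, authority 5, five sub-authorities starting with 21, and a RID of 513.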

You rarely build SIDs by hand. You parse them. Here’s the field-level walk in C — note that the documented accessors (GetSidSubAuthority, GetSidIdentifierAuthority) return pointers into the structure, which trips up everyone the first time:

#include <windows.h>
#include <sddl.h>
#include <stdio.h>

void PrintSid(PSID pSid) {
    if (!IsValidSid(pSid)) return;

    PSID_IDENTIFIER_AUTHORITY pAuth = GetSidIdentifierAuthority(pSid);
    DWORD subCount = *GetSidSubAuthorityCount(pSid);

    printf("Authority: %lu\n", (DWORD)pAuth->Value[5]); // big-endian 6-byte value; NT authority (5) sits in the last byte
    for (DWORD i = 0; i < subCount; i++)
        printf("  SubAuthority[%lu] = %lu\n", i, *GetSidSubAuthority(pSid, i));

    LPSTR str = NULL;
    if (ConvertSidToStringSidA(pSid, &str)) {       // -> "S-1-5-..."
        printf("String SID: %s\n", str);
        LocalFree(str);
    }
}

To go the other direction — constructing a known SID — use AllocateAndInitializeSid, which takes an authority plus up to eight sub-authorities. Building the SYSTEM SID (S-1-5-18) and comparing it with EqualSid is the idiomatic way to check “am I running as LocalSystem?”:

SID_IDENTIFIER_AUTHORITY ntAuth = SECURITY_NT_AUTHORITY; // {0,0,0,0,0,5}
PSID pSystem = NULL;

if (AllocateAndInitializeSid(&ntAuth, 1,
        SECURITY_LOCAL_SYSTEM_RID,   // 18
        0, 0, 0, 0, 0, 0, 0, &pSystem)) {
    // EqualSid(tokenSid, pSystem) -> TRUE means LocalSystem
    FreeSid(pSystem);                // never free this with LocalFree
}

3. Well-Known SIDs and Built-in Principals

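Some SIDs are identical on every Windows install. The account names behind them are not: they are localized (BUILTIN\Administrators becomes Administratoren on a German install), so hard-coding names is a bug waiting to happen. Match on the SID, using the documented constants (WinBuiltinAdministratorsSid and friends) where you can. Memorize the ones below anyway; you’ll read them in logs daily.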

SID	Principal
`S-1-0-0`	Null SID (a group with no members)
`S-1-1-0`	Everyone
`S-1-5-18`	Local System
`S-1-5-19`	Local Service
`S-1-5-20`	Network Service
`S-1-5-32-544`	Builtin\Administrators
`S-1-16-12288`	High mandatory integrity level

Built-in accounts also carry well-known RIDs appended to the domain or machine SID: 500 is Administrator, 501 is Guest, 512 is Domain Admins. An attacker enumerating a domain looks for RID 500 and 512 specifically — the display name can be renamed, the RID cannot. Capability SIDs the OS recognizes are cached under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\SecurityManager\CapabilityClasses\AllCachedCapabilities.

4. SIDs at Runtime: The Access Token

When a user signs in, LSA builds an access token for the session. That token is the runtime bag of identity: the user’s SID, the SIDs of every group the user belongs to, the privileges granted, and a mandatory integrity level SID (the S-1-16-* family). Every process started in that logon context inherits a copy. When code makes an access check, the kernel compares the SIDs in the token against the SIDs in the object’s DACL.

One detail that becomes an attack surface later: an account can carry extra SIDs in its Active Directory sIDHistory attribute. That attribute exists for legitimate domain migration: copy the old SID into sIDHistory so a migrated user keeps access to resources permissioned to the old account without re-ACLing everything. The catch is that all values in sIDHistory are injected into the access token at logon, exactly as if they were ordinary group memberships.

Flowchart showing how LSA mints an access token at logon, the token is inherited by processes, and the Security Reference Monitor compares token SIDs against an object DACL to produce a granted access mask — Every handle open flows through SeAccessCheck, which compares the caller’s token SIDs against the target object’s DACL top-to-bottom before returning a granted-access mask.

5. The Security Descriptor: Structure and Fields

Every object the Object Manager creates has a security descriptor. The structure is SECURITY_DESCRIPTOR, reproduced here verbatim from winnt.h:

typedef struct _SECURITY_DESCRIPTOR {
  BYTE                        Revision;
  BYTE                        Sbz1;
  SECURITY_DESCRIPTOR_CONTROL Control;
  PSID                        Owner;
  PSID                        Group;
  PACL                        Sacl;
  PACL                        Dacl;
} SECURITY_DESCRIPTOR, *PISECURITY_DESCRIPTOR;

Field by field: Revision is always 1; Sbz1 is reserved and must be zero; Control is a flag bitmask; Owner and Group point to SIDs; Dacl and Sacl point to access-control lists. The internal layout differs between absolute form (the struct holds pointers to separately allocated SIDs and ACLs) and self-relative form (everything packed into one contiguous blob with offsets, marked by SE_SELF_RELATIVE). Because that format varies, never poke fields directly — drive it through the API.

The Control field qualifies how the rest of the descriptor is interpreted:

Flag	Meaning
`SE_DACL_PRESENT`	The descriptor has a DACL (the pointer may still be NULL)
`SE_SACL_PRESENT`	The descriptor has a SACL
`SE_DACL_PROTECTED`	DACL is shielded from inherited ACEs
`SE_SACL_PROTECTED`	SACL is shielded from inherited ACEs
`SE_OWNER_DEFAULTED`	Owner was assigned by a default mechanism
`SE_SELF_RELATIVE`	Descriptor is in packed, self-relative form

Here is the single most important gotcha in this entire topic, and it has burned production systems repeatedly. There is a difference between no DACL, an empty DACL, and a NULL DACL:

SECURITY_DESCRIPTOR sd;
InitializeSecurityDescriptor(&sd, SECURITY_DESCRIPTOR_REVISION);

// NULL DACL: present == TRUE, pointer == NULL  -> GRANTS EVERYONE FULL ACCESS
SetSecurityDescriptorDacl(&sd, TRUE, NULL, FALSE);

// Empty DACL: present == TRUE, non-NULL ACL with zero ACEs -> DENIES EVERYONE
// (initialize an ACL with InitializeAcl and add no ACEs, then pass it here)

If SE_DACL_PRESENT is not set, or it is set with a NULL DACL pointer, the object allows full access to everyone. Developers reach for SetSecurityDescriptorDacl(&sd, TRUE, NULL, FALSE) thinking “no restrictions, default behavior” and ship a world-writable named pipe or service. An empty DACL — present, non-NULL, zero ACEs — does the opposite and denies everyone. One null pointer is the difference.

Hierarchy diagram of the SECURITY_DESCRIPTOR structure showing Owner SID, Group SID, DACL containing allow and deny ACEs, and SACL containing audit ACEs as child nodes — A security descriptor owns four pointers: two SIDs declaring ownership, a DACL controlling access, and a SACL controlling auditing — each ACE carries its own SID and access mask.

6. DACLs and ACEs: How Access Is Decided

A DACL is an ordered list of Access Control Entries. Each ACE has an ACE_HEADER (AceType, AceFlags, AceSize), an ACCESS_MASK of rights, and a trailing SID the entry applies to.

ACE Type	Used In	Effect
`ACCESS_ALLOWED_ACE`	DACL	Grants rights in its mask to the SID
`ACCESS_DENIED_ACE`	DACL	Denies rights in its mask to the SID
`SYSTEM_AUDIT_ACE`	SACL	Logs access matching its mask

Evaluation order matters: the kernel walks ACEs top to bottom and stops as soon as the requested access is fully granted or any of it is denied. Well-formed (canonical) DACLs place deny ACEs ahead of allow ACEs precisely so a deny is seen first. An ACL has no hard ACE-count limit, but the whole ACL must stay under 64 KB.
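The walk described above, together with the NULL-versus-empty DACL distinction from the previous section, fits in a short portable C model. Treat it as a teaching sketch rather than what SeAccessCheck really does: SIDs are plain strings, the types and the access_check name are invented here, and generic-right mapping, owner rights, privileges, and integrity checks are all left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { ACE_ALLOW, ACE_DENY } AceType;

typedef struct {
    AceType     type;
    uint32_t    mask;   /* rights this ACE allows or denies */
    const char *sid;    /* string SID the ACE applies to */
} Ace;

typedef struct {
    const Ace *aces;
    size_t     count;   /* 0 models an empty DACL */
} Dacl;

static int token_has_sid(const char *const *sids, size_t n, const char *sid) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(sids[i], sid) == 0) return 1;
    return 0;
}

/* Returns 1 if every bit in `desired` is granted, 0 otherwise.
 * Passing dacl == NULL models a NULL DACL: everyone gets everything. */
int access_check(const Dacl *dacl, const char *const *token_sids,
                 size_t n_sids, uint32_t desired) {
    if (dacl == NULL) return 1;

    uint32_t remaining = desired;            /* bits not yet granted */
    for (size_t i = 0; i < dacl->count && remaining != 0; i++) {
        const Ace *ace = &dacl->aces[i];
        if (!token_has_sid(token_sids, n_sids, ace->sid)) continue;
        if (ace->type == ACE_DENY) {
            if (ace->mask & remaining) return 0;   /* first matching deny wins */
        } else {
            remaining &= ~ace->mask;               /* allow clears granted bits */
        }
    }
    return remaining == 0;   /* an empty DACL never clears anything */
}
```

Because the loop stops at the first decisive ACE, swapping the order of a deny and an allow entry for the same rights changes the outcome, which is why canonical ordering matters.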

Reading a real object’s DACL means pulling the descriptor and iterating ACEs by index with GetAce:

PSECURITY_DESCRIPTOR pSD = NULL;
PSID  pOwner = NULL;
PACL  pDacl  = NULL;

DWORD rc = GetNamedSecurityInfoW(
    L"C:\\Windows\\System32\\config\\SAM", SE_FILE_OBJECT,
    OWNER_SECURITY_INFORMATION | DACL_SECURITY_INFORMATION,
    &pOwner, NULL, &pDacl, NULL, &pSD);

if (rc == ERROR_SUCCESS) {
    // pDacl stays NULL when the object has a NULL DACL; skip the walk in that case.
    for (WORD i = 0; pDacl && i < pDacl->AceCount; i++) {
        PACE_HEADER hdr = NULL;
        if (GetAce(pDacl, i, (LPVOID*)&hdr)) {
            // hdr->AceType  == ACCESS_ALLOWED_ACE_TYPE / ACCESS_DENIED_ACE_TYPE
            // hdr->AceFlags == CONTAINER_INHERIT_ACE | OBJECT_INHERIT_ACE | ...
        }
    }
    LocalFree(pSD);   // pOwner and pDacl point into pSD; free only the descriptor
}

7. SACLs: Auditing Through the System ACL

The SACL uses the same ACL container but holds SYSTEM_AUDIT_ACE entries instead. Its access mask doesn’t grant or deny anything — it defines which access attempts generate audit records in the Windows Security Event Log. Reading or writing any object’s SACL requires the SeSecurityPrivilege right, which only Administrators normally hold. That privilege boundary is exactly why SACL tampering is a high-value detection target: the act of stripping audit ACEs is itself privileged.

8. SDDL: Security Descriptors as Text

A binary descriptor is awful to log, diff, or paste into a config file, so Windows defines the Security Descriptor Definition Language — a string form. The grammar is O: owner, G: group, D: DACL, S: SACL, each followed by flags and parenthesized ACEs:

O:BAG:SYD:(A;;FA;;;SY)(A;;FA;;;BA)(A;;0x1200a9;;;BU)S:(AU;SAFA;FA;;;WD)

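Take the first ACE, (A;;FA;;;SY): Allow, no inheritance flags, FA (FILE_ALL_ACCESS), granted to SY (SYSTEM). The third ACE uses a raw hex mask, 0x1200a9 (read and execute), for BU (Builtin Users), and the SACL entry (AU;SAFA;FA;;;WD) audits both successful and failed full-access attempts by WD (Everyone). Round-trip it with ConvertSecurityDescriptorToStringSecurityDescriptor and ConvertStringSecurityDescriptorToSecurityDescriptor. In practice you’ll read SDDL far more often through PowerShell: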

$acl = Get-Acl C:\Windows\System32\config\SAM
$acl.Owner            # owner principal
$acl.Sddl             # full SDDL string
$acl.Access | Format-Table IdentityReference, FileSystemRights, AccessControlType

icacls <path> gives the same data in a terser shorthand; Get-Acl is friendlier when you want the SDDL string itself for a baseline diff.
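When diffing SDDL baselines by hand it helps to split each ACE into its six semicolon-separated fields (type, flags, rights, object GUID, inherited object GUID, account SID). The portable C sketch below does only that splitting. The type and function names are invented for this example, the input is a single ACE without its parentheses, and conditional or resource-attribute ACEs that carry a seventh field are rejected rather than parsed.

```c
#include <stddef.h>
#include <string.h>

#define SDDL_ACE_FIELDS 6
#define SDDL_FIELD_MAX  64

/* Field order as laid out in an SDDL ACE string. */
enum {
    F_ACE_TYPE, F_ACE_FLAGS, F_RIGHTS,
    F_OBJECT_GUID, F_INHERIT_OBJECT_GUID, F_ACCOUNT_SID
};

typedef struct {
    char field[SDDL_ACE_FIELDS][SDDL_FIELD_MAX];   /* NUL-terminated, may be empty */
} SddlAce;

/* Split e.g. "A;;FA;;;SY" into six fields, keeping empty ones.
 * Returns 1 on success, 0 if the field count or a field length is off. */
int split_sddl_ace(const char *ace, SddlAce *out) {
    int idx = 0;
    size_t len = 0;
    if (ace == NULL || out == NULL) return 0;
    memset(out, 0, sizeof *out);              /* guarantees NUL termination */

    for (const char *p = ace; *p != '\0'; p++) {
        if (*p == ';') {
            if (++idx >= SDDL_ACE_FIELDS) return 0;   /* too many fields */
            len = 0;
        } else {
            if (len + 1 >= SDDL_FIELD_MAX) return 0;  /* field too long */
            out->field[idx][len++] = *p;
        }
    }
    return idx == SDDL_ACE_FIELDS - 1;        /* exactly five separators */
}
```

Splitting A;;FA;;;SY leaves field[F_ACE_TYPE] as "A", field[F_RIGHTS] as "FA", field[F_ACCOUNT_SID] as "SY", and the other three empty. strtok is avoided on purpose because it would silently collapse the empty fields.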

9. Inheritance and the Kernel Check

Child objects don’t usually carry hand-written ACLs. They inherit them. An ACE’s flags decide propagation: OBJECT_INHERIT_ACE (OI) pushes it onto leaf objects like files, CONTAINER_INHERIT_ACE (CI) onto sub-containers like folders or registry subkeys, and INHERIT_ONLY_ACE (IO) makes an ACE apply only to children and not the object carrying it. SE_DACL_PROTECTED blocks inheritance entirely — that’s what “disable inheritance” does in Explorer.
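The flag semantics can be summarized in two small predicates. This is a simplified portable C model: the flag values match winnt.h, but the function names are invented for this example, and the model ignores NO_PROPAGATE_INHERIT_ACE and a child whose own descriptor sets SE_DACL_PROTECTED.

```c
#include <stdint.h>

/* ACE inheritance flags (values as defined in winnt.h). */
#define OBJECT_INHERIT_ACE    0x01u   /* OI */
#define CONTAINER_INHERIT_ACE 0x02u   /* CI */
#define INHERIT_ONLY_ACE      0x08u   /* IO */

/* Does the ACE take part in access checks on the object that carries it? */
int ace_applies_to_self(uint8_t flags) {
    return (flags & INHERIT_ONLY_ACE) == 0;
}

/* Does a direct child receive this ACE as an effective entry?
 * child_is_container: nonzero for folders/subkeys, zero for leaf objects. */
int ace_effective_on_child(uint8_t flags, int child_is_container) {
    uint8_t needed = child_is_container ? CONTAINER_INHERIT_ACE
                                        : OBJECT_INHERIT_ACE;
    return (flags & needed) != 0;
}
```

An OI-only ACE therefore reaches files but is not effective on a subfolder; in real Windows the subfolder still carries it as an inherit-only entry so that files further down receive it, a detail this model leaves out.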

The decision itself happens in the kernel. Each OBJECT_HEADER carries a SecurityDescriptor field. At handle-creation time the Object Manager hands the token, the requested access, and the descriptor to the Security Reference Monitor (nt!SeAccessCheck), which walks the DACL and returns a granted-access mask. You can see the whole chain live in WinDbg:

kd> !process 0 0 lsass.exe
kd> !object <Object address>
kd> dt nt!_OBJECT_HEADER <header address> SecurityDescriptor
kd> !sd <SecurityDescriptor address & ~0xf>   ; mask low bits, they're flags
kd> !token                                     ; the token the check runs against

Files, registry keys, processes, threads, named pipes, services, jobs — anything named and securable runs through this same path.

10. Common Attacker Techniques

SIDs and SDs aren’t just plumbing — they’re a manipulation target for evasion and escalation. The primitives below all leave traces (covered next), which is the point of teaching them.

Technique	Description
NULL DACL planting	Set a present-but-NULL DACL on a service, registry key, or pipe to make it world-writable
DACL tampering for persistence	Add an explicit `ACCESS_ALLOWED_ACE` granting the attacker’s SID `FullControl` on a sensitive object
Owner abuse	Taking ownership of an object implicitly grants `WRITE_DAC`, letting an attacker rewrite the DACL afterward
SID-History injection	Write a privileged SID (e.g. a Domain Admins RID) into a controlled account’s `sIDHistory` so it lands in the token
SACL stripping	Remove audit ACEs from `lsass.exe`, `SAM`, or `ntds.dit` to suppress access logging before credential theft
Permission group discovery	Enumerate group SIDs and ACL members to plan lateral movement

A populated sIDHistory on a non-migrated account is the canonical hunting signal for the injection case:

Get-ADUser -Filter * -Properties sIDHistory |
    Where-Object { $_.sIDHistory } |
    Select-Object Name, @{ n='sIDHistory'; e={ $_.sIDHistory -join ', ' } }

In a domain with no active migration, any result here deserves investigation — especially a sIDHistory value ending in RID 512 or 519.
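If the query output is exported as plain SID strings, the RID triage can be automated with a check like the following portable C sketch. The function names are invented for this example, and the rule encoded here (a domain SID ending in 512 or 519) is only the one named in the text; a real hunt would likely widen the list.

```c
#include <stdlib.h>
#include <string.h>

/* Return the RID (last sub-authority) of a string SID, or -1 if malformed. */
long long sid_string_rid(const char *sid) {
    const char *dash = sid ? strrchr(sid, '-') : NULL;
    char *end;
    unsigned long long v;

    if (dash == NULL || dash[1] < '0' || dash[1] > '9') return -1;
    v = strtoull(dash + 1, &end, 10);
    if (*end != '\0' || v > 0xFFFFFFFFull) return -1;   /* RIDs are 32-bit */
    return (long long)v;
}

/* Flag domain SIDs (S-1-5-21-...) whose RID is 512 or 519.
 * Returns 1 if the sIDHistory value deserves a closer look, 0 otherwise. */
int is_privileged_history_sid(const char *sid) {
    long long rid;
    if (sid == NULL || strncmp(sid, "S-1-5-21-", 9) != 0) return 0;
    rid = sid_string_rid(sid);
    return rid == 512 || rid == 519;
}
```

A value such as S-1-5-21-1004336348-1177238915-682003330-512 (made-up domain portion) is flagged, while an ordinary user RID like 1104 in the same domain is not.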

Graph diagram mapping four attacker techniques — SID-History Injection, NULL DACL Planting, DACL Tampering, and SACL Stripping — to their respective impacts: privileged token, world-writable object, persistent access, and audit blindspot — Each abuse primitive targets a distinct part of the SID/security-descriptor model and produces a different attacker capability, from silent credential theft to persistent object access.

11. Detection, Hunting, and Hardening

DACL and SACL changes are logged by Windows itself, not Sysmon — you must enable the right Advanced Audit Policy subcategories first (Object Access → Audit File System / Audit Registry, and Policy Change → Audit Audit Policy Change).

Event ID	Trigger	Hunt On
`4670`	Object permissions changed (DACL/Owner)	`ObjectName`, `OldSd`, `NewSd`, `SubjectUserSid`
`4907`	Object auditing (SACL) settings changed	Blank `NewSd` = SACL stripped
`4715`	Audit policy on an object changed	`OriginalSecurityDescriptor`, `NewSecurityDescriptor`
`4719`	System audit policy changed	`SubjectUserSid`, `AuditPolicyChanges`
`4663`	Object access attempt	Sudden gaps after a `4907` on LSASS = stripping
`4728`/`4732`/`4756`	Member added to privileged group	Correlate with SID manipulation
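
The "blank NewSd" test on a 4907 can be automated once the event fields have been extracted. This is a rough sketch, assuming the field holds an SDDL string in the usual O:/G:/D:/S: order; the function name is hypothetical, and the check is a string heuristic, not an SDDL parser.

```c
#include <string.h>

/* Return 1 if the SDDL text has an S: (SACL) section with at least one ACE,
 * 0 otherwise (empty string, no S: section, or S: carrying only flags). */
int sddl_has_audit_aces(const char *sddl)
{
    const char *sacl;

    if (sddl == NULL)
        return 0;
    sacl = strstr(sddl, "S:");
    if (sacl == NULL)
        return 0;
    /* ACEs are parenthesised; flags such as AI or AR may precede them. */
    return strchr(sacl, '(') != NULL;
}
```

An alert condition would then be: OldSd has audit ACEs and NewSd does not.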

The highest-fidelity signal is a 4907 that blanks the SACL on lsass.exe, ntds.dit, or the SAM hive — that’s pre-credential-dump preparation. Pair it with Sysmon Event ID 10 (process access to LSASS) and Event ID 1 watching for icacls.exe, cacls.exe, sc.exe sdset, and Set-Acl command lines. A Sigma sketch for DACL tampering on sensitive objects:

title: Suspicious DACL Modification on Sensitive Object
logsource:
  product: windows
  service: security
detection:
  selection:
    EventID: 4670
    ObjectName|contains:
      - '\lsass.exe'
      - '\ntds.dit'
      - '\SAM'
  condition: selection
fields:
  - SubjectUserSid
  - ObjectName
  - OldSd
  - NewSd
level: high

Hardening, in rough priority order:

Hunt NULL DACLs. Use AccessChk to enumerate world-writable services, keys, and files; fix them.
Protect the LSASS SACL and alert on any 4907 that empties it.
Enable SID Filtering on every trust to neutralize cross-domain sIDHistory abuse, and audit sIDHistory on a schedule.
Restrict SeSecurityPrivilege to Administrators and watch for its use.
Prefer explicit DENY over absent ALLOW, and put privileged accounts in Protected Users.

MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Access Token Manipulation	`T1134`	Token/SID anomalies in logon events
SID-History Injection	`T1134.005`	Non-empty `sIDHistory` on non-migrated accounts
File/Directory Permissions Modification	`T1222.001`	`4670`; `icacls`/`SetNamedSecurityInfo` in `4688`
Impair Defenses: Disable/Modify Tools	`T1562.001`	`4907` blanking a SACL; `4663` gaps
Permission Groups Discovery	`T1069.001` / `.002`	Bulk SID/group enumeration

12. Tools

Tool	Description	Link
AccessChk	Dumps effective permissions and finds NULL/weak DACLs	learn.microsoft.com
`icacls`	Built-in ACL viewer/editor with SDDL shorthand	(built-in)
`Get-Acl` / `Set-Acl`	PowerShell SD read/write, exposes `.Sddl`	(built-in)
WinDbg	Kernel-side `!sd`, `!token`, `OBJECT_HEADER` inspection	learn.microsoft.com
Process Hacker	GUI view of token SIDs and object security	processhacker.sourceforge.io
WinObj	Browse Object Manager namespace and per-object security	learn.microsoft.com

Summary

A SID is the immutable, never-reused name Windows checks for every authorization decision — display names are cosmetic, SIDs are ground truth.
The access token carries the user SID plus all group SIDs (including any from sIDHistory), and the kernel compares those against an object’s DACL via nt!SeAccessCheck.
The SECURITY_DESCRIPTOR binds owner, group, DACL, and SACL; a present-but-NULL DACL silently grants everyone full access, while an empty DACL denies everyone.
SID-History injection (T1134.005) and SACL stripping (T1562.001) are the two abuse primitives worth hunting hardest — watch 4670, 4907, and non-empty sIDHistory.
Enable Object Access and Policy Change auditing, restrict SeSecurityPrivilege, enable SID Filtering on trusts, and baseline SDDL on sensitive objects so a tampered DACL stands out.

Fibers: User-Mode Cooperative Threads

Objective: Understand the internals of Windows fibers — how they relate to the TEB, the undocumented FIBER structure, Fiber Local Storage, and the cooperative context switch performed entirely in user mode — so defenders can recognize and detect adversarial use of fiber APIs for stealthy in-process execution.

1. Cooperative vs. Preemptive Scheduling

A thread is the Windows kernel’s unit of execution. The scheduler picks ready threads, slices CPU time, and preempts them at quantum boundaries — all driven from ntoskrnl.exe. A fiber is different: it is a unit of execution that the kernel does not know about. Fibers run inside threads, and the application — not the OS — chooses when one fiber yields and another runs.

Two consequences follow immediately:

A fiber switch never crosses the user/kernel boundary. No syscall is issued. SwitchToFiber lives in KernelBase.dll and returns without touching ntoskrnl.
From the kernel’s perspective, all activity performed by a fiber is attributed to the thread that runs it. Accessing TLS from a fiber accesses the thread’s TLS, not a per-fiber slot.

This is the root of both the elegance and the security relevance of fibers: they are coroutines built directly into the Win32 ABI, with stack pivots and register saves the kernel cannot see.

2. The Fiber Execution Model

A fiber consists of three things: a stack, a saved CPU context (registers, instruction pointer, SEH frame), and a start routine that receives an opaque parameter. A thread becomes “fiber-aware” by calling ConvertThreadToFiber, and it remains a fiber host until it calls ConvertFiberToThread.

Rule	Behavior
Must convert first	You cannot call `SwitchToFiber` from a thread until `ConvertThreadToFiber` runs.
Fiber function returning	If a fiber’s start routine returns, the host thread calls `ExitThread` and terminates.
Self-delete	If the currently running fiber calls `DeleteFiber` on itself, the host thread exits.
Cross-thread delete	Deleting a fiber that is the selected fiber of another thread will likely crash that thread — its stack just disappeared.
Cross-thread switch	`SwitchToFiber` accepts a fiber created by a different thread; the caller becomes the new host.

These rules are load-bearing — most fiber bugs (and several known abuse primitives) come from violating them.

3. TEB Layout and the FIBER Structure

The Thread Environment Block (TEB) tracks the per-thread fiber state. Three fields matter:

Field	Type	Role
`NtTib.FiberData`	`PVOID`	Pointer to the current fiber’s `FIBER` structure
`HasFiberData`	`USHORT : 1`	Bitfield set by `ConvertThreadToFiberEx`; indicates the thread hosts fibers
`FlsData`	`PVOID`	Pointer to the FLS slot array for the current fiber

ConvertThreadToFiberEx calls NtCurrentTeb(), checks Teb->HasFiberData, and if the thread is already a fiber returns with ERROR_ALREADY_FIBER. Otherwise it allocates a FIBER structure on the process heap via RtlAllocateHeap and stores its address in NtTib.FiberData.

The FIBER struct itself is not officially documented. The shape below is reconstructed from ReactOS sources and public symbols and is subject to change across Windows versions:

// Reconstructed from public symbols / ReactOS — illustrative only.
typedef struct _FIBER {
    PVOID    FiberData;          // lpParameter passed at creation
    PVOID    ExceptionList;      // Top of SEH chain (NT_TIB.ExceptionList)
    PVOID    StackBase;          // High end of the fiber stack
    PVOID    StackLimit;         // Low end (guard page)
    PVOID    DeallocationStack;  // Original VirtualAlloc base
    CONTEXT  FiberContext;       // Saved CPU state: RIP, RSP, RBP, RBX, ...
    ULONG    FiberFlags;         // FIBER_FLAG_FLOAT_SWITCH, etc.
    PVOID    ActivationContext;  // Per-fiber activation context stack
    PVOID    FlsSlots;           // Per-fiber FLS slot array
} FIBER, *PFIBER;

You must never read or write this structure directly. The Win32 fiber functions manage its contents; treating the returned LPVOID as opaque is part of the contract.

4. The Core Fiber API

The full surface is small. The fiber-related declarations in winbase.h and fibersapi.h boil down to these functions:

Function	Purpose
`ConvertThreadToFiber`	Promote the calling thread into a fiber; required first
`ConvertThreadToFiberEx`	As above; accepts `FIBER_FLAG_FLOAT_SWITCH`
`CreateFiber`	Allocate stack + `FIBER` struct; record entry point and parameter
`CreateFiberEx`	As above; accepts `dwStackCommitSize` and flags
`SwitchToFiber`	Cooperative context switch to the supplied fiber
`DeleteFiber`	Free the fiber’s stack, context, and `FIBER` data
`ConvertFiberToThread`	Demote back to a plain thread; required to avoid leaks
`GetCurrentFiber`	Returns the current `FIBER` address (intrinsic — no `CALL`)
`GetFiberData`	Returns the `lpParameter` value (intrinsic — no `CALL`)

The exact CreateFiber signature, per MSDN:

LPVOID CreateFiber(
    SIZE_T                dwStackSize,    // initial commit size; 0 = executable's default
    LPFIBER_START_ROUTINE lpStartAddress, // void StartRoutine(LPVOID lpParameter)
    LPVOID                lpParameter     // passed to the fiber function
);

GetCurrentFiber and GetFiberData are compiler intrinsics on MSVC — they inline directly to a gs:[0x20]/fs:[0x10] read of NtTib.FiberData. They produce no import thunk and no CALL instruction, which has direct consequences for IAT-based detection.
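
The two offsets follow directly from the position of FiberData among the leading pointer-sized fields of the thread information block. The sketch below models that layout with a hypothetical struct (MODEL_NT_TIB); the field names follow the public winnt.h definition, but this is not the real header, and the real FiberData field shares a union with a Version member.

```c
#include <stddef.h>

/* Model of the leading NT_TIB fields using only pointer-sized members.
 * Hypothetical struct for illustration; not the Windows definition. */
typedef struct MODEL_NT_TIB {
    void *ExceptionList;
    void *StackBase;
    void *StackLimit;
    void *SubSystemTib;
    void *FiberData;             /* the field GetCurrentFiber() reads */
    void *ArbitraryUserPointer;
    void *Self;
} MODEL_NT_TIB;

/* Offset of FiberData for the pointer size this file is compiled with:
 * 0x20 with 8-byte pointers (gs:[0x20]), 0x10 with 4-byte pointers
 * (fs:[0x10]). */
size_t fiber_data_offset(void)
{
    return offsetof(MODEL_NT_TIB, FiberData);
}
```

FiberData is the fifth pointer in the block, so its offset is always four pointer widths. That is the constant a disassembly signature for fiber-aware code should look for.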

5. Fiber Lifecycle: A Minimal Example

This walks the canonical create → switch → yield → delete sequence. Note how g_mainFiber is the fiber identity of the original thread, returned by ConvertThreadToFiber.

#include <windows.h>
#include <stdio.h>

LPVOID g_mainFiber  = NULL;
LPVOID g_workFiber  = NULL;

VOID CALLBACK WorkerFiberProc(LPVOID lpParam) {
    printf("[worker] running on fiber %p, param=%p\n",
           GetCurrentFiber(), lpParam);

    // Cooperative yield — control returns to the main fiber.
    SwitchToFiber(g_mainFiber);

    printf("[worker] resumed; returning will ExitThread()\n");
    SwitchToFiber(g_mainFiber);   // never let the routine return
}

int main(void) {
    // Promote thread; TEB->HasFiberData becomes 1.
    g_mainFiber = ConvertThreadToFiber(NULL);

    // 64 KiB stack; entry = WorkerFiberProc; param = 0xDEADBEEF.
    g_workFiber = CreateFiber(0x10000, WorkerFiberProc, (LPVOID)0xDEADBEEF);

    SwitchToFiber(g_workFiber);   // first run of worker
    printf("[main] back from worker\n");
    SwitchToFiber(g_workFiber);   // resume worker

    DeleteFiber(g_workFiber);     // safe: not the running fiber
    ConvertFiberToThread();       // demote; release fiber bookkeeping
    return 0;
}

Forgetting ConvertFiberToThread leaks the main fiber’s FIBER allocation on the process heap. Forgetting to yield back before the worker returns terminates the host thread via ExitThread.

6. Context Switching Internals

SwitchToFiber is the heart of the API. Conceptually, it performs:

Save the current CPU state (RBX, RBP, RDI, RSI, R12–R15, RSP, RIP on x64) into the current fiber’s FiberContext.
Save the SEH chain head (NtTib.ExceptionList) and stack bounds (StackBase, StackLimit) into the current FIBER.
If FIBER_FLAG_FLOAT_SWITCH is set, save the XMM/MMX/x87 state.
Update NtTib.FiberData to point at the target FIBER.
Restore the target fiber’s stack bounds, SEH chain, FLS pointer, and CPU registers.
Return to the saved instruction pointer of the target — execution resumes there on the target’s stack.

Critically, this is a pure user-mode operation. No syscall, no int 2e, no ETW event from Microsoft-Windows-Kernel-Process. The host thread’s kernel objects (KTHREAD, ETHREAD) are unchanged; from the kernel’s point of view it is still the same thread, now running with a different RIP and RSP.
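
The bookkeeping half of those steps can be modelled in portable C. The sketch below uses hypothetical, heavily simplified stand-ins (ModelTeb, ModelFiber) for the real structures and omits the register save and restore. It is meant only to show which fields travel with the fiber and which stay with the thread.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the real TEB or FIBER layouts. */
typedef struct ModelFiber {
    void *ExceptionList;    /* saved SEH chain head */
    void *StackBase;        /* saved stack bounds */
    void *StackLimit;
    void *FlsData;          /* this fiber's FLS slot array */
} ModelFiber;

typedef struct ModelTeb {
    void       *ExceptionList;
    void       *StackBase;
    void       *StackLimit;
    ModelFiber *FiberData;  /* currently selected fiber */
    void       *FlsData;    /* swapped on every switch */
    void       *TlsSlot0;   /* per-thread: never swapped */
} ModelTeb;

/* Bookkeeping half of a switch; register save/restore is omitted. */
void model_switch_to_fiber(ModelTeb *teb, ModelFiber *target)
{
    ModelFiber *cur = teb->FiberData;

    /* Save the outgoing fiber's thread-block state. */
    cur->ExceptionList = teb->ExceptionList;
    cur->StackBase     = teb->StackBase;
    cur->StackLimit    = teb->StackLimit;
    cur->FlsData       = teb->FlsData;

    /* Select the target, then load its state into the thread block. */
    teb->FiberData     = target;
    teb->ExceptionList = target->ExceptionList;
    teb->StackBase     = target->StackBase;
    teb->StackLimit    = target->StackLimit;
    teb->FlsData       = target->FlsData;
    /* teb->TlsSlot0 is deliberately left alone. */
}

/* Returns 1 if a switch swaps stack bounds and FLS but leaves TLS untouched. */
int model_switch_demo(void)
{
    int stack_a, stack_b, fls_a, fls_b, tls;
    ModelFiber a = { NULL, NULL, NULL, NULL };
    ModelFiber b = { NULL, &stack_b, NULL, &fls_b };
    ModelTeb teb = { NULL, &stack_a, NULL, &a, &fls_a, &tls };

    model_switch_to_fiber(&teb, &b);

    return teb.FiberData == &b
        && teb.StackBase == (void *)&stack_b
        && teb.FlsData   == (void *)&fls_b
        && teb.TlsSlot0  == (void *)&tls        /* TLS unchanged */
        && a.StackBase   == (void *)&stack_a    /* outgoing state saved */
        && a.FlsData     == (void *)&fls_a;
}
```

Everything the model touches is ordinary user-mode memory, which is why none of it is visible to kernel telemetry.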

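; Conceptual sketch of the x64 SwitchToFiber flow (rcx = target FIBER); not actual disassembly
mov     rax, gs:[0x20]                          ; rax = current FIBER (NtTib.FiberData)
mov     [rax + FiberContextOff + Rsp], rsp      ; save outgoing stack pointer
mov     [rax + FiberContextOff + Rip], <return addr>
mov     gs:[0x20], rcx                          ; NtTib.FiberData = target
; ... restore target's stack bounds, SEH chain, and non-volatile registers ...
mov     rsp, [rcx + FiberContextOff + Rsp]
jmp     qword [rcx + FiberContextOff + Rip]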

Flow diagram showing the six steps of SwitchToFiber: saving registers, saving SEH and stack bounds, updating NtTib.FiberData, restoring target registers, and jumping to the target fiber's saved RIP — all in user mode with no syscall — SwitchToFiber completes an entire stack-and-register swap inside KernelBase.dll without issuing a single syscall or generating a kernel ETW event.

7. Fiber Local Storage (FLS)

TLS is per-thread. During a fiber switch the TEB’s TLS array is not swapped, so two fibers sharing a thread share TLS — a classic source of corruption when porting thread-based libraries to fibers. FLS solves this: it is per-fiber, and SwitchToFiber updates TEB->FlsData to the incoming fiber’s slot array.

Function	Purpose
`FlsAlloc(PFLS_CALLBACK_FUNCTION)`	Allocate an FLS index; optional destructor callback
`FlsSetValue(DWORD, PVOID)`	Store a per-fiber value at the given index
`FlsGetValue(DWORD)`	Read the current fiber’s value at the given index
`FlsFree(DWORD)`	Release the index; callbacks fire for live fibers

The destructor callback pointers are kept process-wide in PEB->FlsCallback. They fire on fiber deletion and thread exit, and — as covered below — they are a known abuse target.

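#include <windows.h>
#include <stdio.h>

LPVOID g_mainFiber;
DWORD  g_flsIndex;

// Runs when a fiber holding a non-NULL value in this slot is deleted,
// when its thread exits, or when the index is freed.
VOID WINAPI OnFlsDestroy(PVOID p) {
    HeapFree(GetProcessHeap(), 0, p);
}

VOID CALLBACK FiberA(LPVOID unused) {
    char *buf = (char*)HeapAlloc(GetProcessHeap(), 0, 32);
    if (buf) {
        lstrcpyA(buf, "fiber-A-private");
        FlsSetValue(g_flsIndex, buf);
    }
    SwitchToFiber(g_mainFiber);
    const char *mine = (const char*)FlsGetValue(g_flsIndex);
    printf("[A] still mine: %s\n", mine ? mine : "(unset)");
    SwitchToFiber(g_mainFiber);   // never return from a fiber routine
}

int main(void) {
    g_mainFiber = ConvertThreadToFiber(NULL);
    g_flsIndex  = FlsAlloc(OnFlsDestroy);
    if (!g_mainFiber || g_flsIndex == FLS_OUT_OF_INDEXES) return 1;

    LPVOID a = CreateFiber(0, FiberA, NULL);
    if (!a) return 1;
    SwitchToFiber(a);             // A stores its private pointer, then yields
    // The main fiber never set this slot, so it sees its own (empty) value.
    // A second fiber (FiberB) would likewise see only its own value.
    printf("[main] my slot: %p\n", FlsGetValue(g_flsIndex));
    SwitchToFiber(a);             // A reads its own value back

    DeleteFiber(a);               // fires OnFlsDestroy for A's buffer
    FlsFree(g_flsIndex);
    ConvertFiberToThread();
    return 0;
}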

Hierarchy diagram showing how PEB holds FlsCallback destructor pointers, TEB holds NtTib.FiberData pointing to the FIBER structure and FlsData pointing to the per-fiber FLS slot array, with the destructor relationship between PEB FlsCallback and the slot array — FLS slot arrays are swapped per-fiber on every SwitchToFiber call, while PEB→FlsCallback holds process-wide destructor pointers that fire on fiber deletion — a known adversarial overwrite target.

8. Building a Round-Robin Cooperative Scheduler

Fibers shine when modeling cooperative pipelines: parsers, generators, state machines. A trivial scheduler is a dispatcher fiber that round-robins through worker fibers, each of which yields back via SwitchToFiber(g_mainFiber).

#include <windows.h>
#include <stdio.h>

#define N 3
LPVOID g_workers[N];
LPVOID g_mainFiber;

VOID CALLBACK Worker(LPVOID id) {
    for (int i = 0; i < 4; ++i) {
        printf("[worker %llu] step %d\n", (unsigned long long)(ULONG_PTR)id, i);
        SwitchToFiber(g_mainFiber);   // yield
    }
    // Final yield — never return from a fiber routine.
    SwitchToFiber(g_mainFiber);
}

int main(void) {
    g_mainFiber = ConvertThreadToFiber(NULL);
    for (ULONG_PTR i = 0; i < N; ++i)
        g_workers[i] = CreateFiber(0, Worker, (LPVOID)i);

    for (int round = 0; round < 4; ++round)
        for (int i = 0; i < N; ++i)
            SwitchToFiber(g_workers[i]);

    for (int i = 0; i < N; ++i) DeleteFiber(g_workers[i]);
    ConvertFiberToThread();
    return 0;
}

This is the same pattern Microsoft SQL Server used for its historical “lightweight pooling” / fiber mode — one OS thread, many SQL user contexts.

9. Legitimate Use Cases and Pitfalls

Use Case	Reason
Coroutines / generators	Native stack switching with no `setjmp` tricks
Porting cooperative legacy code	UNIX `swapcontext`-style schedulers map cleanly
Database engines	SQL Server fiber mode for high-concurrency workloads
Game engines / scripting hosts	Per-script execution context with explicit yield

Pitfalls are sharp:

COM apartments are bound to threads, not fibers. A fiber that initializes COM on one thread and is later resumed on a different thread carries its interface pointers into the wrong apartment and corrupts COM bookkeeping.
CRT and many MS libraries stash state in TLS. Switching fibers leaves that state behind, producing subtle corruption.
Critical sections record the thread as the owner — a different fiber on the same thread re-enters without blocking.
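__try/__except depends on per-fiber exception state: the SEH chain head on x86 and the stack bounds on both architectures. SwitchToFiber swaps these for you, but code that pivots stacks by hand must also update the StackBase/StackLimit values the thread reports, or exception dispatch will treat the stack as invalid.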

10. Common Attacker Techniques

Fibers are attractive to adversaries because the entire execution primitive lives in user mode — no NtCreateThread, no CreateRemoteThread, no kernel ETW event for the act of switching execution. The patterns below are documented in public threat-research literature; described conceptually here for detection engineers.

Technique	Description
In-process shellcode via `SwitchToFiber`	Allocate `PAGE_EXECUTE_READWRITE` memory, copy a payload, call `ConvertThreadToFiber` then `CreateFiber` with the payload as `lpStartAddress`, then `SwitchToFiber` — execution begins with no new thread
Fiber-based ROP staging	A fiber’s saved `CONTEXT` includes `RIP` and `RSP`; manipulating a `FIBER` struct’s context fields lets an attacker pivot the stack on `SwitchToFiber`
`PEB->FlsCallback` overwrite	Overwrite an entry in the process-wide FLS callback array; on the next `FlsFree` or fiber/thread teardown the attacker-controlled pointer is invoked with attacker-controlled data
TLS evasion via FLS	Hide per-task state in FLS slots that defensive tooling enumerating TLS will miss
API hiding via intrinsics	`GetCurrentFiber`/`GetFiberData` produce no IAT entry; static analysis missing `gs:[0x20]` reads will not see fiber-aware code

The base ATT&CK parent for fiber-based in-process execution is T1055 Process Injection; MITRE has not assigned a fiber-specific sub-technique, so the closest analogue is T1055.004 (APC) which shares the “queue execution to a thread’s user-mode context” model.

11. Defensive Strategies & Detection

There is no kernel event for SwitchToFiber. Detection must focus on the setup that precedes fiber-based execution (RWX allocation, suspicious entry points) and on memory forensics of fiber state at rest.

Sysmon coverage for the surrounding behavior:

Event ID	Signal
`1`	Process Create — establish baseline lineage
`8`	`CreateRemoteThread` — co-occurs with cross-process fiber staging
`10`	`ProcessAccess` — reflective loaders reading remote memory before fiber dispatch
`17`/`18`	Named-pipe create/connect — common multi-stage loader IPC
`25`	`ProcessTampering` — image-region tampering in a fiber host

ETW providers worth subscribing:

Microsoft-Windows-Threat-Intelligence — flags VirtualAlloc/VirtualProtect with PAGE_EXECUTE_*, the precursor to fiber shellcode staging.
Microsoft-Windows-Kernel-Process — does not see fiber switches but covers process/thread lifecycle.
A user-mode consumer hooking NtAllocateVirtualMemory + NtProtectVirtualMemory gives the strongest pre-execution signal.

Memory forensics indicators:

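Walk TEB.NtTib.FiberData on every thread, but key on HasFiberData and not on the pointer alone: FiberData shares a union with a Version field, so threads that never hosted a fiber can still show a non-zero value there. Threads with HasFiberData == 1 in processes that have no business using fibers are immediately interesting.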
Use Volatility malfind to surface private, executable, non-image-backed pages — the target of a fiber-staged payload.
Dump PEB->FlsCallback and verify every entry resolves to an expected module’s .text section.
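
Once the raw values have been pulled from a memory image, the two checks above reduce to a bit test and a range test. The sketch below assumes HasFiberData is bit 2 of TEB.SameTebFlags (as seen in public symbols for recent builds; confirm with dt ntdll!_TEB on the target build) and that the caller supplies the .text ranges of the expected modules. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: HasFiberData is bit 2 of TEB.SameTebFlags; verify per build. */
#define MODEL_HAS_FIBER_DATA 0x0004u

typedef struct TextRange {
    uint64_t start;   /* inclusive */
    uint64_t end;     /* exclusive */
} TextRange;

/* 1 if the thread's flags say it is hosting fibers. */
int has_fiber_data(uint16_t same_teb_flags)
{
    return (same_teb_flags & MODEL_HAS_FIBER_DATA) != 0;
}

/* 1 if the callback pointer falls inside any expected .text range. */
int callback_in_text(uint64_t callback, const TextRange *ranges, size_t count)
{
    size_t i;

    for (i = 0; i < count; ++i)
        if (callback >= ranges[i].start && callback < ranges[i].end)
            return 1;
    return 0;
}

/* Count non-NULL callbacks that resolve outside every expected range. */
size_t count_suspicious_callbacks(const uint64_t *callbacks, size_t n,
                                  const TextRange *ranges, size_t count)
{
    size_t i, hits = 0;

    for (i = 0; i < n; ++i)
        if (callbacks[i] != 0 && !callback_in_text(callbacks[i], ranges, count))
            ++hits;
    return hits;
}
```

A non-zero count from count_suspicious_callbacks is the signal worth escalating; a set HasFiberData bit on its own only narrows the search.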

Sigma sketch for the cross-process precursor to fiber-based payload staging:

title: Suspicious ProcessAccess Preceding User-Mode Fiber Execution
id: 8f5c1d6e-3c7b-4b1f-9e1e-7e3e6e2b0a1f
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10
    GrantedAccess:
      - '0x1fffff'   # PROCESS_ALL_ACCESS
      - '0x1f0fff'
    TargetImage|endswith:
      - '\explorer.exe'
      - '\svchost.exe'
  filter_legit:
    SourceImage|endswith:
      - '\MsMpEng.exe'
      - '\SenseIR.exe'
  condition: selection and not filter_legit
level: high
tags:
  - attack.t1055
  - attack.t1106

Hardening:

SetProcessMitigationPolicy with ProcessDynamicCodePolicy (Arbitrary Code Guard) blocks creation of new executable pages, defeating fiber shellcode staging.
Control Flow Guard restricts indirect-call targets, narrowing SwitchToFiber and FLS-callback abuse to valid entry points.
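HVCI / memory integrity, together with the vulnerable-driver blocklist, limits the kernel-mode footholds an attacker could use to tamper with a process’s memory, FIBER structures included.
WDAC / AppLocker policies restrict which binaries and scripts may run at all, which raises the cost of getting a fiber-based loader onto the host. They do not police PAGE_EXECUTE_* allocations; that is the job of ACG above.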

Graph diagram mapping fiber abuse detection signals: RWX allocation feeding ETW Threat-Intelligence provider and Sysmon events, memory forensics walking PEB FlsCallback for non-text-section pointers, and ACG/CFG/HVCI as hardening mitigations — Because SwitchToFiber produces no kernel telemetry, defenders must pivot to pre-execution signals like RWX allocations, memory forensics on FiberData and FlsCallback, and ACG to deny executable page creation entirely.

12. Tools for Fiber Analysis

Tool	Description	Link
WinDbg	Dump `TEB`, walk `NtTib.FiberData`, inspect `FIBER.FiberContext`	`microsoft.com`
Process Hacker	Enumerate threads, inspect TEB, examine private RWX regions	`processhacker.sf.io`
Process Monitor	Capture `VirtualAlloc`/`VirtualProtect` sequences preceding fiber dispatch	`sysinternals.com`
Volatility 3	`windows.malfind`, TEB plugins, FLS callback inspection	`volatilityfoundation.org`
pykd / WinDbg JS	Scripted walks of `FIBER` chains across all threads	`githomelab.ru/pykd`
x64dbg	User-mode debugging of fiber-aware binaries; trace `gs:[0x20]` reads	`x64dbg.com`
Ghidra	Static analysis; recognize `GetCurrentFiber` intrinsic pattern	`ghidra-sre.org`
Sysmon	Surrounding telemetry (Events `1`, `8`, `10`, `25`)	`sysinternals.com`

A minimal WinDbg recipe to surface fiber-hosting threads in a captured process:

0:000> !teb
TEB at 000000abcd123000
    ...
    FiberData:            0000020fabcde000
    ...
0:000> dt ntdll!_TEB @$teb HasFiberData
0:000> dq 0000020fabcde000 L40   ; raw FIBER bytes — layout version-dependent

13. MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Process Injection	`T1055`	Memory scan for private RWX regions; ETW TI on `NtAllocateVirtualMemory`
Process Injection: Asynchronous Procedure Call	`T1055.004`	Closest published sub-technique to fiber-based in-process execution
Native API	`T1106`	API-call auditing of `CreateFiber`/`SwitchToFiber`/`FlsAlloc`
Reflective Code Loading	`T1620`	Image-load anomalies; fiber entry point in non-image-backed memory
Impair Defenses: Disable or Modify Tools	`T1562.001`	ETW/AMSI hook integrity checks; user-mode hook auditing

MITRE ATT&CK does not currently list a “Fiber Injection” sub-technique (current as of v16.1). Vendor research treats fiber-based execution as a variant of T1055; map accordingly.

Summary

A fiber is a user-mode cooperative thread invisible to the kernel scheduler — SwitchToFiber performs a stack and register swap entirely in KernelBase.dll with no syscall.
The TEB exposes the fiber state via NtTib.FiberData, HasFiberData, and FlsData; the FIBER structure itself is undocumented and version-dependent.
TLS is per-thread and is not swapped on a fiber switch; FLS is per-fiber and is swapped, with destructor callbacks tracked in PEB->FlsCallback.
Adversaries abuse fibers for in-process shellcode execution, ROP staging via the saved CONTEXT, and code execution via PEB->FlsCallback overwrites — none of which trigger thread-creation telemetry.
Detect via pre-execution signals (ETW TI on RWX allocations, Sysmon Event IDs 8/10/25), memory forensics on private executable regions and FlsCallback integrity, and hardening with ACG, CFG, and HVCI.

Jobs and Silos: Process Grouping and Resource Limits

Objective: Understand how the Windows kernel uses Job Objects and Silo Objects to group processes, enforce CPU/memory/network limits, and provide the namespace isolation that underpins Windows containers — and how defenders detect and harden against their abuse.

1. What Is a Job Object?

A job object lets a group of processes be managed as a single unit. It is a namable, securable, sharable kernel object that controls attributes of every process associated with it; operations on the job — limits, termination, accounting — apply to all member processes at once.

In the kernel the object is the undocumented executive type EJOB, allocated from kernel pool. Each EPROCESS carries a Job pointer linking the process to its owning job. User mode never touches EJOB directly; it operates through a handle returned by CreateJobObject.

Before Windows 8 / Windows Server 2012, a process could belong to one job and jobs could not be nested. Windows 8 introduced nested jobs, allowing a process to participate in a hierarchy where the effective limit is the most restrictive ancestor.

Object Type	Description
`EJOB`	Kernel job object; groups processes, holds limits and accounting
`EPROCESS.Job`	Per-process pointer to its owning job
Named job	Job published under `\Sessions\<N>\BaseNamedObjects\`, openable by name
Anonymous job	Handle-only job, no namespace entry, shared by duplication/inheritance

Hierarchy diagram showing a user-mode handle referencing the kernel EJOB object, which links to three EPROCESS member processes via Job pointers — A single EJOB kernel object anchors all member processes; user mode accesses it only through an opaque handle.

2. Core Job Object APIs

The job lifecycle is driven by a small, stable Win32 surface.

Function	Purpose
`CreateJobObject`	Create, or open if named, a job object
`OpenJobObject`	Open an existing named job
`AssignProcessToJobObject`	Add a process to a job
`SetInformationJobObject`	Apply limits and policy to the job
`QueryInformationJobObject`	Read limits, accounting, and peak usage
`TerminateJobObject`	Kill every process in the job
`IsProcessInJob`	Test whether a process already belongs to a job

HANDLE CreateJobObject(LPSECURITY_ATTRIBUTES lpJobAttributes, LPCWSTR lpName);
BOOL   AssignProcessToJobObject(HANDLE hJob, HANDLE hProcess);
BOOL   SetInformationJobObject(HANDLE hJob, JOBOBJECTINFOCLASS JobObjectInformationClass,
                               LPVOID lpJobObjectInformation, DWORD cbJobObjectInformationLength);
BOOL   QueryInformationJobObject(HANDLE hJob, JOBOBJECTINFOCLASS JobObjectInformationClass,
                                 LPVOID lpJobObjectInformation, DWORD cbJobObjectInformationLength,
                                 LPDWORD lpReturnLength);
BOOL   TerminateJobObject(HANDLE hJob, UINT uExitCode);

3. Basic Limits: CPU, Memory, and Process Count

JOBOBJECT_BASIC_LIMIT_INFORMATION carries the foundational controls.

typedef struct _JOBOBJECT_BASIC_LIMIT_INFORMATION {
  LARGE_INTEGER PerProcessUserTimeLimit;
  LARGE_INTEGER PerJobUserTimeLimit;
  DWORD         LimitFlags;
  SIZE_T        MinimumWorkingSetSize;
  SIZE_T        MaximumWorkingSetSize;
  DWORD         ActiveProcessLimit;
  ULONG_PTR     Affinity;
  DWORD         PriorityClass;
  DWORD         SchedulingClass;
} JOBOBJECT_BASIC_LIMIT_INFORMATION;

The LimitFlags bitmask selects which fields the kernel enforces.

Limit Flag	Description
`JOB_OBJECT_LIMIT_PROCESS_TIME`	Per-process user-mode CPU cap (100 ns ticks); process killed when exceeded
`JOB_OBJECT_LIMIT_JOB_TIME`	Job-wide CPU time cap
`JOB_OBJECT_LIMIT_WORKINGSET`	Min/max working set per process
`JOB_OBJECT_LIMIT_ACTIVE_PROCESS`	Caps active process count; over-limit assignment terminates the process
`JOB_OBJECT_LIMIT_AFFINITY`	Forces a processor affinity mask
`JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE`	Kills all processes when the last job handle closes
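
Two details from the table are easy to get wrong in practice: the time limits are counted in 100 ns ticks, and a field is ignored unless its flag is set. The sketch below illustrates both. The flag values match winnt.h but carry a MODEL_ prefix so the code compiles without the Windows headers; the helper names are hypothetical.

```c
#include <stdint.h>

/* Values as defined in winnt.h, renamed so this sketch is header-free. */
#define MODEL_JOB_OBJECT_LIMIT_PROCESS_TIME       0x00000002u
#define MODEL_JOB_OBJECT_LIMIT_JOB_TIME           0x00000004u
#define MODEL_JOB_OBJECT_LIMIT_ACTIVE_PROCESS     0x00000008u
#define MODEL_JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE  0x00002000u

/* Convert whole seconds to the 100 ns ticks used by the time-limit fields. */
int64_t user_time_limit_ticks(int64_t seconds)
{
    return seconds * 10000000LL;    /* 1 s = 10,000,000 ticks of 100 ns */
}

/* 1 if the given flag is set in LimitFlags, i.e. the kernel will enforce
 * the corresponding field; 0 if the field would be ignored. */
int limit_is_enforced(uint32_t limit_flags, uint32_t flag)
{
    return (limit_flags & flag) == flag;
}
```

For example, a 30-second per-process cap is PerProcessUserTimeLimit.QuadPart = user_time_limit_ticks(30), and it only takes effect when the PROCESS_TIME flag is also set.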

JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE is the cornerstone of any sandbox: if the controlling process dies, the entire tree is reaped, leaving no orphaned children.

#include <windows.h>

int main(void) {
    // Explicit W variants: the wide-string literals below require them.
    HANDLE hJob = CreateJobObjectW(NULL, L"Sandbox_Demo");   // named for observability
    if (!hJob) return (int)GetLastError();

    JOBOBJECT_EXTENDED_LIMIT_INFORMATION eli = { 0 };
    eli.BasicLimitInformation.LimitFlags =
        JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE |   // tear down tree on handle loss
        JOB_OBJECT_LIMIT_ACTIVE_PROCESS;       // bound process count
    eli.BasicLimitInformation.ActiveProcessLimit = 4;
    if (!SetInformationJobObject(hJob, JobObjectExtendedLimitInformation, &eli, sizeof(eli))) {
        DWORD err = GetLastError();
        CloseHandle(hJob);
        return (int)err;
    }

    STARTUPINFOW si = { sizeof(si) };
    PROCESS_INFORMATION pi = { 0 };
    // Create suspended so we can assign before any code runs
    if (!CreateProcessW(L"C:\\Windows\\System32\\notepad.exe", NULL, NULL, NULL,
                        FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi)) {
        DWORD err = GetLastError();
        CloseHandle(hJob);
        return (int)err;
    }

    if (AssignProcessToJobObject(hJob, pi.hProcess))
        ResumeThread(pi.hThread);
    else
        TerminateProcess(pi.hProcess, 1);      // never let it run outside the job

    CloseHandle(pi.hThread);
    CloseHandle(pi.hProcess);
    CloseHandle(hJob);   // KILL_ON_JOB_CLOSE terminates notepad here
    return 0;
}

4. Extended and Rate Limits

JOBOBJECT_EXTENDED_LIMIT_INFORMATION embeds the basic structure as BasicLimitInformation and adds memory governance: ProcessMemoryLimit (per-process commit, needs JOB_OBJECT_LIMIT_PROCESS_MEMORY), JobMemoryLimit (job-wide commit, needs JOB_OBJECT_LIMIT_JOB_MEMORY), and the continuously tracked PeakProcessMemoryUsed / PeakJobMemoryUsed. The two memory limits are independent — a 100 MB job-wide cap can coexist with a 10 MB per-process cap.

JOBOBJECT_EXTENDED_LIMIT_INFORMATION eli = { 0 };
eli.BasicLimitInformation.LimitFlags =
    JOB_OBJECT_LIMIT_PROCESS_MEMORY | JOB_OBJECT_LIMIT_JOB_MEMORY;
eli.ProcessMemoryLimit = 10  * 1024 * 1024;   // 10 MB per process
eli.JobMemoryLimit     = 100 * 1024 * 1024;   // 100 MB job-wide (independent)
SetInformationJobObject(hJob, JobObjectExtendedLimitInformation, &eli, sizeof(eli));

DWORD ret = 0;
QueryInformationJobObject(hJob, JobObjectExtendedLimitInformation, &eli, sizeof(eli), &ret);
printf("PeakJobMemoryUsed: %zu bytes\n", eli.PeakJobMemoryUsed);

CPU throttling uses JOBOBJECT_CPU_RATE_CONTROL_INFORMATION.

typedef struct _JOBOBJECT_CPU_RATE_CONTROL_INFORMATION {
  DWORD ControlFlags;
  union {
    DWORD CpuRate;
    DWORD Weight;
    struct { WORD MinRate; WORD MaxRate; } DUMMYSTRUCTNAME;
  } DUMMYUNIONNAME;
} JOBOBJECT_CPU_RATE_CONTROL_INFORMATION;

Control Flag	Value	Behaviour
`JOB_OBJECT_CPU_RATE_CONTROL_ENABLE`	`0x1`	Enables CPU rate control
`JOB_OBJECT_CPU_RATE_CONTROL_WEIGHT_BASED`	`0x2`	Rate derived from relative weight vs. other jobs
`JOB_OBJECT_CPU_RATE_CONTROL_HARD_CAP`	`0x4`	Hard cap; no job threads run after the budget is spent until next interval
`JOB_OBJECT_CPU_RATE_CONTROL_NOTIFY`	`0x8`	Notifies when the rate limit is exceeded

JOBOBJECT_CPU_RATE_CONTROL_INFORMATION cpu = { 0 };
cpu.ControlFlags = JOB_OBJECT_CPU_RATE_CONTROL_ENABLE |
                   JOB_OBJECT_CPU_RATE_CONTROL_HARD_CAP;
cpu.CpuRate = 2000;   // 20.00% of total CPU cycles (cycles per 10,000, i.e. units of 1/100 percent)

// Windows containers (non-Hyper-V) use weight-based control instead:
// cpu.ControlFlags = JOB_OBJECT_CPU_RATE_CONTROL_ENABLE |
//                    JOB_OBJECT_CPU_RATE_CONTROL_WEIGHT_BASED;
// cpu.Weight = 5;    // relative scheduling weight

SetInformationJobObject(hJob, JobObjectCpuRateControlInformation, &cpu, sizeof(cpu));
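
Because CpuRate is expressed in cycles per 10,000, converting from a percentage is a multiplication by 100. The helper below is a small sketch of that conversion with range checking; the function name is hypothetical, and the accepted range (1 to 10,000) follows the documented bounds for CpuRate.

```c
/* Convert a percentage of CPU cycles into a CpuRate value.
 * CpuRate counts cycles per 10,000, so 1% == 100 and 100% == 10,000.
 * Returns 0 for input outside 0.01..100, which is not a valid CpuRate. */
unsigned long cpu_rate_from_percent(double percent)
{
    if (!(percent >= 0.01 && percent <= 100.0))
        return 0;
    return (unsigned long)(percent * 100.0 + 0.5);   /* round to nearest */
}
```

Treating 0 as an error value keeps a bad input from silently turning into "no budget at all" when the result is assigned to cpu.CpuRate.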

Network bandwidth is bounded with JOBOBJECT_NET_RATE_CONTROL_INFORMATION, which sets MaxBandwidth (outgoing bytes), a DscpTag, and ControlFlags for scheduling policy.

5. Notification Limits and I/O Completion Ports

Not every limit should kill. JOBOBJECT_NOTIFICATION_LIMIT_INFORMATION defines soft limits that alert without termination, covering IoReadBytesLimit, IoWriteBytesLimit, per-job user time, and job memory. To receive these alerts, associate an I/O completion port via JOBOBJECT_ASSOCIATE_COMPLETION_PORT.

Completion Message	Meaning
`JOB_OBJECT_MSG_NEW_PROCESS`	A process was added to the job
`JOB_OBJECT_MSG_EXIT_PROCESS`	A member process exited
`JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO`	Job is now empty
`JOB_OBJECT_MSG_JOB_MEMORY_LIMIT`	Job-wide commit limit was hit

HANDLE hPort = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 1);

JOBOBJECT_ASSOCIATE_COMPLETION_PORT acp = { 0 };
acp.CompletionKey  = hJob;     // echoed back as the key
acp.CompletionPort = hPort;
SetInformationJobObject(hJob, JobObjectAssociateCompletionPortInformation, &acp, sizeof(acp));

DWORD msg; ULONG_PTR key; LPOVERLAPPED ov;
while (GetQueuedCompletionStatus(hPort, &msg, &key, &ov, INFINITE)) {
    switch (msg) {
        case JOB_OBJECT_MSG_NEW_PROCESS:         /* child started   */ break;
        case JOB_OBJECT_MSG_JOB_MEMORY_LIMIT:    /* commit cap hit   */ break;
        case JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO: return 0;  // job empty
    }
}
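
To make the message flow easier to reason about, the following Python sketch replays a sequence of completion messages the way the C loop above would see them. It is a model only: `track_job` is a hypothetical helper, the numeric message IDs are taken from winnt.h and should be confirmed against your SDK, and the second tuple element stands in for the process ID that the real API returns through the lpOverlapped parameter.

```python
# Message IDs as defined in winnt.h (verify against your SDK headers).
JOB_OBJECT_MSG = {
    1: "JOB_OBJECT_MSG_END_OF_JOB_TIME",
    2: "JOB_OBJECT_MSG_END_OF_PROCESS_TIME",
    3: "JOB_OBJECT_MSG_ACTIVE_PROCESS_LIMIT",
    4: "JOB_OBJECT_MSG_ACTIVE_PROCESS_ZERO",
    6: "JOB_OBJECT_MSG_NEW_PROCESS",
    7: "JOB_OBJECT_MSG_EXIT_PROCESS",
    8: "JOB_OBJECT_MSG_ABNORMAL_EXIT_PROCESS",
    9: "JOB_OBJECT_MSG_PROCESS_MEMORY_LIMIT",
    10: "JOB_OBJECT_MSG_JOB_MEMORY_LIMIT",
    11: "JOB_OBJECT_MSG_NOTIFICATION_LIMIT",
}


def track_job(messages):
    """Replay (message_id, pid) pairs; stop when the job reports it is empty.

    Returns the set of still-live PIDs and the decoded message log.
    """
    live, log = set(), []
    for msg, pid in messages:
        log.append(JOB_OBJECT_MSG.get(msg, "unknown(%d)" % msg))
        if msg == 6:                 # NEW_PROCESS
            live.add(pid)
        elif msg in (7, 8):          # EXIT_PROCESS / ABNORMAL_EXIT_PROCESS
            live.discard(pid)
        elif msg == 4:               # ACTIVE_PROCESS_ZERO: job is empty
            break
    return live, log


if __name__ == "__main__":
    sample = [(6, 100), (6, 200), (7, 100), (10, 200), (8, 200), (4, 0)]
    live, log = track_job(sample)
    print(live, log[-1])
```

Tracking membership from NEW_PROCESS and EXIT_PROCESS messages gives a monitor a running view of the job without polling for the process list.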

6. Nested Jobs

On Windows 8 and later, assigning an already-jobbed process to a second job nests it. The kernel computes the effective limit as the minimum of the chain — a child job can only tighten, never loosen, an ancestor’s constraint.

// Parent job: 200 MB job-wide commit
HANDLE hParent = CreateJobObject(NULL, NULL);
JOBOBJECT_EXTENDED_LIMIT_INFORMATION p = { 0 };
p.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_JOB_MEMORY;
p.JobMemoryLimit = 200 * 1024 * 1024;
SetInformationJobObject(hParent, JobObjectExtendedLimitInformation, &p, sizeof(p));
AssignProcessToJobObject(hParent, hProc);

// Child job nested under parent: 100 MB
HANDLE hChild = CreateJobObject(NULL, NULL);
JOBOBJECT_EXTENDED_LIMIT_INFORMATION c = { 0 };
c.BasicLimitInformation.LimitFlags = JOB_OBJECT_LIMIT_JOB_MEMORY;
c.JobMemoryLimit = 100 * 1024 * 1024;
SetInformationJobObject(hChild, JobObjectExtendedLimitInformation, &c, sizeof(c));
AssignProcessToJobObject(hChild, hProc);   // Win8+ nests automatically

// Effective limit on hProc = min(200 MB, 100 MB) = 100 MB
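
The min-of-chain rule can be sketched in a few lines of Python. This models the rule as stated in this section, not the kernel's internal implementation; the `Job` class and `effective_memory_limit` function are hypothetical names introduced for illustration.

```python
class Job:
    """Toy stand-in for a job object: an optional memory limit and a parent."""

    def __init__(self, name, memory_limit=None, parent=None):
        self.name = name
        self.memory_limit = memory_limit   # bytes; None = no limit at this level
        self.parent = parent


def effective_memory_limit(job):
    """Walk from the innermost job to the root and keep the tightest limit."""
    limits = []
    while job is not None:
        if job.memory_limit is not None:
            limits.append(job.memory_limit)
        job = job.parent
    return min(limits) if limits else None


MB = 1024 * 1024

if __name__ == "__main__":
    parent = Job("parent", 200 * MB)
    child = Job("child", 100 * MB, parent=parent)
    loose = Job("loose-child", 500 * MB, parent=parent)
    print(effective_memory_limit(child) // MB)   # 100
    print(effective_memory_limit(loose) // MB)   # 200: the child cannot loosen
```

The second case is the important one for sandbox design: a nested job that asks for more than its ancestor allows is still bound by the ancestor.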

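For pre-Windows 8 compatibility, test membership first: there, assigning an already-jobbed process does not nest, and AssignProcessToJobObject fails with ERROR_ACCESS_DENIED.

BOOL inJob = FALSE;
if (!IsProcessInJob(hProc, NULL, &inJob)) {   // NULL JobHandle = "any job"
    // Query failed; inspect GetLastError().
}
// Windows 7: an already-jobbed process cannot be reassigned (no nesting).
// Windows 8+: the assignment nests instead.
if (!AssignProcessToJobObject(hJob, hProc)) {
    // With inJob == TRUE on a pre-Windows 8 system, expect ERROR_ACCESS_DENIED.
}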

Hierarchy diagram illustrating how the kernel computes the effective limit as the minimum across a nested job chain before applying it to a member process — Nested jobs only tighten constraints — the kernel enforces the most restrictive ancestor limit at every level.

7. Inspecting Jobs at Runtime

Process Explorer and Process Hacker display a process’s job membership and its limits on a dedicated Job tab. WinObj reveals named job objects in the Object Manager namespace. In kernel debugging, walk and dump jobs directly.

0: kd> !process 0 0 notepad.exe          ; find the EPROCESS
0: kd> dt nt!_EPROCESS Job <EPROCESS>    ; read the Job pointer
0: kd> !job <EJOB-address>               ; dump limits and member list
0: kd> dt nt!_EJOB JobFlags              ; locate the silo/flags field

These are observation tools, not attack tooling — they let an analyst confirm exactly which processes share a job and what limits are in force.

8. Silos: From Jobs to Containers

Jobs alone do not isolate the namespace — they constrain resources but not what a process can name or see. Microsoft solved this with silos, effectively “super jobs.” A silo is a job object with the Silo flag set in the EJOB.JobFlags field.

There are two silo types:

Silo Type	Use	Privilege
Application silo	Desktop Bridge / MSIX app isolation	Standard
Server silo	Windows (Docker) container support	Administrator

When a silo is created, the kernel builds it its own root directory object, distinct from the host root — giving the silo a private object namespace. A server silo further owns an _ESERVERSILO_GLOBALS structure holding container-specific state, and is backed by a virtual disk, a registry hive, and a virtual network adapter.

Kernel Function	Purpose
`PsCreateSilo` / `PsCreateServerSilo`	Create silo / server silo objects
`PsAttachSiloToCurrentThread` / `PsDetachSiloFromCurrentThread`	Bind/unbind a thread to a silo context
`PsGetThreadServerSilo`	Return the server silo a thread runs in
`PsIsCurrentThreadInServerSilo`	Boolean gate used to restrict syscalls inside a container

; For understanding only — JobFlags layout is build-specific and undocumented.
0: kd> dt nt!_EJOB JobFlags
   +0x0?? JobFlags : Uint4B    ; a bit in this field marks the job as a silo

The _EJOB, _ESERVERSILO_GLOBALS, and JobFlags offsets are undocumented and shift between OS builds. Validate them against your target build with WinDbg dt before treating any offset as authoritative.

Hierarchy diagram showing the progression from a plain Job Object to a Silo with a private namespace, and further to a Server Silo owning container-specific state including registry hive and virtual network adapter — Silos extend job objects with namespace isolation; server silos layer on full container state to back Windows Server containers.

9. Windows Containers and the Host Compute Service

Windows Server containers are built on server silos. The Host Compute Service (HCS) orchestrates their lifecycle, wiring up the silo’s job-object resource controls, registry hive virtualization, and filesystem isolation. The filesystem layer is enforced by wcifs.sys, the Windows Container Isolation Filter Driver, which projects the container’s view over the host volume.

Mode	Boundary	Notes
`--isolation=process`	Server silo, shared host kernel	Lighter, but escapes reach the host kernel
`--isolation=hyperv`	Utility VM + inner job object	VM enforces limits even if the inner job is escaped

Process isolation shares the host kernel, which makes server-silo escape research directly relevant to defenders. Hyper-V isolation applies controls at both the VM and the inner container job object — a job escape still cannot exceed VM-level limits.

Flow diagram showing the Host Compute Service orchestrating a Server Silo, which interacts with the wcifs.sys isolation filter driver, with an optional Hyper-V VM layer applying additional limits — The HCS wires together the server silo, wcifs.sys filesystem filter, and optional Hyper-V VM boundary to form a complete Windows container stack.

10. Common Attacker Techniques

Technique	Description
Sandbox-aware keying	Payload detects a constrained job (low `ActiveProcessLimit`, tight memory cap) and alters behaviour to evade analysis
Debugger / UI blocking	Setting `JOB_OBJECT_UILIMIT_HANDLES` or `JOB_OBJECT_UILIMIT_EXITWINDOWS` to deny security-tool UI/handle access within the job
Breakaway abuse	Exploiting a job configured with `JOB_OBJECT_LIMIT_BREAKAWAY_OK`: children created with `CREATE_BREAKAWAY_FROM_JOB` start outside the job and escape its limits and accounting
Child-tree concealment	Wrapping persistent processes in a job to manage and hide their descendant trees
Container / silo escape	Breaking out of a server silo’s namespace root to reach the host OS

Adversaries also call the job API directly (CreateJobObject, AssignProcessToJobObject, SetInformationJobObject, or their native NtCreateJobObject / NtAssignProcessToJobObject / NtSetInformationJobObject counterparts) to construct their own sandboxes around tooling, or to apply quotas that frustrate dynamic analysis.

11. Defensive Strategies & Detection

There is no dedicated Sysmon event for CreateJobObject or AssignProcessToJobObject as of Sysmon v15 — job manipulation is caught indirectly via process access, process creation, and ETW.

Sysmon Event ID	Relevance
`1` (Process Create)	Children spawned under sandboxed jobs; correlate unusual `ParentImage` / `IntegrityLevel`
`10` (Process Access)	`OpenProcess` with `PROCESS_SET_QUOTA` (`0x100`) plus `PROCESS_TERMINATE` (`0x1`), or `PROCESS_ALL_ACCESS` (`0x1fffff`), preceding job assignment
`17` / `18` (Pipe Created/Connected)	Named pipes visible across a silo namespace boundary during lateral movement

ETW Provider	What It Logs
`Microsoft-Windows-Kernel-Process`	Process/thread lifecycle; job assignments surface as `ProcessSetJobObjectInformation` events
`Microsoft-Windows-Security-Auditing`	Process creation (Event `4688` with command-line auditing)
`Microsoft-Windows-Containers-CCG`	Container credential guard events in server silos
`Microsoft-Windows-Hyper-V-Compute`	HCS / silo creation and teardown

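Enable Audit Process Creation (auditpol /set /subcategory:"Process Creation" /success:enable) to produce Event 4688; the full command line appears only when the "Include command line in process creation events" policy is also enabled. Enable Audit Kernel Object (under Object Access) and set a SACL on the job object to capture named job-object handle requests and use as Events 4656 / 4663.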

title: Suspicious Process Access Preceding Job Quota Assignment
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10                 # Sysmon ProcessAccess
    GrantedAccess:              # exact string match, not a bitmask test
      - '0x1fffff'              # PROCESS_ALL_ACCESS
      - '0x101'                 # PROCESS_SET_QUOTA | PROCESS_TERMINATE (job assignment)
      - '0x100'                 # PROCESS_SET_QUOTA alone
    TargetImage|endswith: '\lsass.exe'
  condition: selection
level: high
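
Sigma matches GrantedAccess as a string, so a rule can only list exact masks. When triaging raw Sysmon Event 10 records it is often more useful to test the bits themselves. The sketch below is a hypothetical post-processing helper (the function names are assumptions, not part of Sysmon or Sigma) that decodes a GrantedAccess value and checks for the two rights AssignProcessToJobObject requires on the process handle, PROCESS_SET_QUOTA and PROCESS_TERMINATE.

```python
# Process-specific access rights from the Windows SDK.
PROCESS_RIGHTS = {
    0x0001: "PROCESS_TERMINATE",
    0x0002: "PROCESS_CREATE_THREAD",
    0x0008: "PROCESS_VM_OPERATION",
    0x0010: "PROCESS_VM_READ",
    0x0020: "PROCESS_VM_WRITE",
    0x0040: "PROCESS_DUP_HANDLE",
    0x0080: "PROCESS_CREATE_PROCESS",
    0x0100: "PROCESS_SET_QUOTA",
    0x0200: "PROCESS_SET_INFORMATION",
    0x0400: "PROCESS_QUERY_INFORMATION",
    0x0800: "PROCESS_SUSPEND_RESUME",
    0x1000: "PROCESS_QUERY_LIMITED_INFORMATION",
}

# Rights AssignProcessToJobObject needs on the target process handle.
JOB_ASSIGN_MASK = 0x0100 | 0x0001


def decode_granted_access(value):
    """Decode a Sysmon GrantedAccess string such as '0x1010' into right names."""
    mask = int(value, 16)
    return [name for bit, name in PROCESS_RIGHTS.items() if mask & bit]


def allows_job_assignment(value):
    """True when the mask contains both rights needed to assign the process to a job."""
    return (int(value, 16) & JOB_ASSIGN_MASK) == JOB_ASSIGN_MASK


if __name__ == "__main__":
    for mask in ("0x1fffff", "0x1010", "0x101"):
        print(mask, allows_job_assignment(mask), decode_granted_access(mask))
```

A bit test catches masks such as 0x1101 that an exact-string rule would miss, and it avoids the false positives a substring match produces on common values like 0x1000.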

Hardening guidance:

Apply JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE in every sandbox so process trees are reaped on handle loss.
Deny JOB_OBJECT_LIMIT_BREAKAWAY_OK unless explicitly required — it is a direct escape vector.
Combine job limits with Integrity Levels and AppContainer; jobs do not restrict file or registry access.
For hostile workloads prefer Hyper-V isolation — controls apply to both the VM and the inner job object.
Monitor wcifs.sys activity in server-silo environments; it enforces filesystem isolation and is a known escape surface.
Audit named job creation under \Sessions\<N>\BaseNamedObjects\ with WinObj and Sysmon object/pipe events as a proxy.

12. MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Native API	`T1106`	ETW `Kernel-Process` job-assignment events; underpins all job/silo API use
Process Injection	`T1055`	Sysmon `Event ID 10`; handle access to constrained process groups
Impair Defenses: Disable/Modify Tools	`T1562.001`	UI-limit flags blocking security tooling; behavioural EDR telemetry
Escape to Host	`T1611`	`wcifs.sys` and `Hyper-V-Compute` ETW; primary silo/container-escape mapping
Create or Modify System Process	`T1543`	Sysmon `Event ID 1`; persistent processes wrapped in jobs
Execution Guardrails	`T1480`	Behavioural analysis of sandbox-aware payloads keyed to job limits

Verify current technique versions and sub-techniques at https://attack.mitre.org before publication.

13. Tools for Job and Silo Analysis

Tool	Description	Link
Process Explorer	View per-process job membership and limits	sysinternals
Process Hacker	Inspect job tab, member processes, and quotas	processhacker.sourceforge.io
WinObj	Browse named job objects and silo namespace roots	sysinternals
WinDbg	`!job`, `dt nt!_EJOB`, `_ESERVERSILO_GLOBALS` inspection	microsoft.com
Process Monitor	Observe `wcifs.sys` and registry-hive container activity	sysinternals
ETW (logman / wevtutil)	Capture `Kernel-Process` and `Hyper-V-Compute` events	microsoft.com

Summary

Job objects group processes into a single managed unit with enforceable CPU, memory, network, and process-count limits, all anchored on the kernel EJOB object.
Limits are applied through SetInformationJobObject using JOBOBJECT_BASIC, EXTENDED, CPU_RATE, NET_RATE, and NOTIFICATION structures; nesting (Windows 8+) tightens to the most restrictive ancestor.
Silos extend jobs via the JobFlags silo bit, adding a private object-namespace root; server silos (_ESERVERSILO_GLOBALS) back Windows containers and share the host kernel.
Abuse spans sandbox-aware keying, BREAKAWAY_OK escapes, UI-limit tool blocking, and server-silo container escape (T1611).
Detect via Sysmon Event ID 1/10, Kernel-Process and Hyper-V-Compute ETW, Event 4688 auditing, and prefer Hyper-V isolation plus KILL_ON_JOB_CLOSE for containment.

Windows Scheduler Internals: Priority Levels, Quantum, and Thread Selection

Objective: Understand how the Windows kernel selects, preempts, and rotates threads — the 32-level priority model, dispatcher ready queues, quantum accounting, boost/decay logic, and the multiprocessor dispatch path — so defenders can baseline normal scheduling behavior and detect attacker manipulation of priority and affinity.

1. The Scheduling Contract: Threads, Not Processes

Windows schedules threads, not processes. Every executable unit of work is represented by a KTHREAD (the Thread Control Block embedded in ETHREAD.Tcb), and the scheduler operates exclusively against that structure. A process supplies the address space, the base priority class, the quantum reset value, and the affinity mask — but it never itself runs on a CPU.

Scheduling is preemptive and priority-based, with round-robin among the threads that share the highest Ready priority. Two rules dominate:

The thread with the highest priority in the Ready state always wins.
If a newly Ready thread has a higher priority than the running thread, the running thread is preempted at the next dispatch point, without waiting for its quantum to expire.

Quantum only matters as a tiebreaker between threads of the same highest priority — it does not arbitrate across priority levels.

2. The 32-Level Priority Model and Priority Classes

Priorities range from 0 (zero-page thread only) to 31 (highest real-time). The space splits into two bands with very different semantics.

Range	Type	Description
`0`	Zero-page thread	Reserved for the memory zero-page thread
`1–15`	Dynamic (variable)	Normal user-mode threads; subject to boost/decay
`16–31`	Real-time	Fixed priorities; no boost, no decay; drivers and RT tasks

Win32 exposes two functions to set scheduling parameters: SetPriorityClass on the process and SetThreadPriority on the thread. The two combine to produce the thread’s base priority in the kernel.

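`SetPriorityClass` constant	Class	Class base	Range (`THREAD_PRIORITY_LOWEST` to `THREAD_PRIORITY_HIGHEST`)
`IDLE_PRIORITY_CLASS`	Idle	4	2–6
`BELOW_NORMAL_PRIORITY_CLASS`	Below Normal	6	4–8
`NORMAL_PRIORITY_CLASS`	Normal	8	6–10
`ABOVE_NORMAL_PRIORITY_CLASS`	Above Normal	10	8–12
`HIGH_PRIORITY_CLASS`	High	13	11–15
`REALTIME_PRIORITY_CLASS`	Real-time	24	22–26

In the dynamic classes, THREAD_PRIORITY_IDLE places the thread at 1 and THREAD_PRIORITY_TIME_CRITICAL at 15. In the real-time class they map to 16 and 31, and the extended relative values (−7 to +6) cover the rest of the 16–31 band.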

Crossing into the real-time band (>=16) requires SeIncreaseBasePriorityPrivilege. NT-native equivalents are NtSetInformationThread (information class ThreadBasePriority = 3) and NtSetInformationProcess (information class ProcessPriorityClass).

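// Raise this process and one of its threads to real-time scheduling.
// Requires SeIncreaseBasePriorityPrivilege; without it the class request is
// downgraded to HIGH_PRIORITY_CLASS.
SetPriorityClass(GetCurrentProcess(), REALTIME_PRIORITY_CLASS);
SetThreadPriority(GetCurrentThread(), THREAD_PRIORITY_TIME_CRITICAL);
// With the privilege held, base priority is now 31 and the thread preempts essentially everything in user mode.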
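
How the process class and the relative thread priority combine is easiest to see as arithmetic. The Python sketch below models the documented Win32 mapping; `base_priority` is a hypothetical helper name, and the class base values (4, 6, 8, 10, 13, 24) are the ones published in Microsoft's scheduling-priorities documentation rather than anything read from a live kernel.

```python
# Base priority contributed by each process priority class.
CLASS_BASE = {
    "IDLE_PRIORITY_CLASS": 4,
    "BELOW_NORMAL_PRIORITY_CLASS": 6,
    "NORMAL_PRIORITY_CLASS": 8,
    "ABOVE_NORMAL_PRIORITY_CLASS": 10,
    "HIGH_PRIORITY_CLASS": 13,
    "REALTIME_PRIORITY_CLASS": 24,
}

THREAD_PRIORITY_IDLE = -15
THREAD_PRIORITY_TIME_CRITICAL = 15


def base_priority(priority_class, relative=0):
    """Combine a priority class with a SetThreadPriority relative value."""
    base = CLASS_BASE[priority_class]
    realtime = priority_class == "REALTIME_PRIORITY_CLASS"
    # The two saturation values ignore the class base entirely.
    if relative == THREAD_PRIORITY_IDLE:
        return 16 if realtime else 1
    if relative == THREAD_PRIORITY_TIME_CRITICAL:
        return 31 if realtime else 15
    # Real-time threads accept a wider relative range than dynamic ones.
    low, high = (-7, 6) if realtime else (-2, 2)
    if not low <= relative <= high:
        raise ValueError("relative priority out of range for this class")
    return base + relative


if __name__ == "__main__":
    print(base_priority("REALTIME_PRIORITY_CLASS", THREAD_PRIORITY_TIME_CRITICAL))  # 31
    print(base_priority("NORMAL_PRIORITY_CLASS"))                                   # 8
```

The first call reproduces the C snippet: real-time class plus THREAD_PRIORITY_TIME_CRITICAL lands on 31, the top of the band.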

Hierarchy diagram of the Windows 32-level thread priority model split into real-time band (16–31) and dynamic band (1–15) with priority 0 reserved for the zero-page thread — Windows priorities split at level 16 — crossing into the real-time band requires SeIncreaseBasePriorityPrivilege, and those threads are never boosted or decayed.

3. Key Kernel Structures

Three structures carry the scheduler’s state: _KTHREAD per thread, _KPROCESS per process, and _KPRCB per logical processor.

_KTHREAD (Thread Control Block)

typedef struct _KTHREAD {
    DISPATCHER_HEADER  Header;          // +0x000 dispatcher object header
    // ...
    ULONGLONG          QuantumTarget;   // +0x020 quantum expiration target
    PVOID              InitialStack;    // +0x028 top of kernel stack
    // ...
    volatile UCHAR     State;           // Ready/Running/Waiting/...
    BOOLEAN            Preempted;
    UCHAR              DeferredProcessor;
    SCHAR              Priority;        // current (dynamic) priority
    ULONG              WaitTime;
    LIST_ENTRY         WaitListEntry;
    SINGLE_LIST_ENTRY  SwapListEntry;
    KSPIN_LOCK         ThreadLock;
} KTHREAD, *PKTHREAD;

The embedded DISPATCHER_HEADER is the same header found at the top of every waitable kernel object and is what ties the thread into wait queues.

_KPRCB (Kernel Processor Control Block)

Each logical processor has a KPCR; inside it sits a KPRCB carrying that CPU’s scheduling state.

typedef struct _KPRCB {
    // ...
    PKTHREAD    CurrentThread;       // executing thread on this CPU
    PKTHREAD    NextThread;          // pending preemption candidate
    PKTHREAD    IdleThread;          // per-CPU idle thread
    LIST_ENTRY  ReadyListHead[32];   // dispatcher ready queues (per priority)
    ULONG       ReadySummary;        // bitmask of non-empty ready queues
    // ...
} KPRCB, *PKPRCB;

_KPROCESS (Process Control Block)

Embedded as EPROCESS.Pcb, it provides the per-process scheduling defaults:

Field	Purpose
`BasePriority`	Process base priority; seeds new threads
`QuantumReset`	Quantum value assigned to new threads
`ThreadListHead`	Doubly-linked list of all `_KTHREAD`s in the process
`ReadyListHead`	Ready-but-swapped-out threads

4. Dispatcher Ready Queues and ReadySummary

The Dispatcher Ready Queue is the per-CPU array KPRCB.ReadyListHead[32] (exposed as DispatcherReadyListHead in public symbols), with one LIST_ENTRY per priority level. Each non-empty entry is a FIFO of KTHREAD structures in the Ready state.

To avoid scanning all 32 queues, the kernel maintains a 32-bit ReadySummary bitmask: bit n is set when ReadyListHead[n] is non-empty. The dispatcher then selects the next thread in O(1):

// Conceptual scheduler inner loop (pseudo-code; not a real symbol).
ULONG mask = Prcb->ReadySummary;
if (mask) {
    ULONG idx;
    _BitScanReverse(&idx, mask);              // highest set bit = top priority
    PKTHREAD next = CONTAINING_RECORD(
        RemoveHeadList(&Prcb->ReadyListHead[idx]),
        KTHREAD, WaitListEntry);
    if (IsListEmpty(&Prcb->ReadyListHead[idx]))
        Prcb->ReadySummary &= ~(1u << idx);
    return next;
}
return Prcb->IdleThread;
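
The same selection logic can be exercised end to end in Python. This is a toy model of the bitmask-plus-queues idea, not kernel code: `ReadyQueues` is a hypothetical class, threads are plain strings, and `None` stands in for the idle thread. The highest set bit is found with `int.bit_length()`, which plays the role of `_BitScanReverse`.

```python
from collections import deque


class ReadyQueues:
    """Toy per-CPU ready queues: 32 FIFOs plus a summary bitmask."""

    def __init__(self):
        self.queues = [deque() for _ in range(32)]
        self.summary = 0          # bit n set <=> queues[n] is non-empty

    def enqueue(self, thread, priority):
        self.queues[priority].append(thread)      # FIFO tail
        self.summary |= 1 << priority

    def select_next(self):
        """Pop the head of the highest non-empty queue; None means 'idle thread'."""
        if not self.summary:
            return None
        idx = self.summary.bit_length() - 1       # index of the highest set bit
        thread = self.queues[idx].popleft()
        if not self.queues[idx]:
            self.summary &= ~(1 << idx)           # queue drained: clear its bit
        return thread


if __name__ == "__main__":
    rq = ReadyQueues()
    rq.enqueue("worker-a", 8)
    rq.enqueue("worker-b", 8)
    rq.enqueue("input-thread", 15)
    print([rq.select_next() for _ in range(4)])
```

The priority-15 thread is returned first, the two priority-8 threads follow in arrival order, and the final call finds an empty summary and falls through to idle, all without scanning the 32 queues.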

5. Quantum Mechanics

A quantum is the slice of CPU time a thread is allowed to consume before the scheduler considers rotating it. WMI exposes two relevant properties: QuantumLength (clock ticks per quantum) and QuantumType (fixed vs. variable). Windows client SKUs default to variable quantum, giving the foreground process longer slices; server SKUs default to fixed long quantum to favor batch throughput.

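Internally, quantum is tracked in units of 3 per clock tick. The default client quantum is 2 ticks (6 units), stretched to as much as 18 units for foreground threads; the default server quantum is 12 ticks (36 units). On current builds those units are converted into a CPU-cycle budget: KTHREAD.QuantumTarget holds the cycle target, the kernel charges the running thread for the cycles it consumes, and once the target is reached control passes to KiQuantumEnd().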

The foreground boost is governed by the registry value:

HKLM\SYSTEM\CurrentControlSet\Control\PriorityControl\Win32PrioritySeparation

The lowest six bits encode quantum behavior: bits 4–5 select short vs. long quanta, bits 2–3 select variable vs. fixed, and bits 0–1 choose the foreground quantum boost (0 none, 1 medium, 2 high). The kernel mirrors the boost field into the global PsPrioritySeparation.
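
A small decoder makes the bit layout concrete. The Python sketch below rests on assumptions taken from Windows Internals rather than from this text: bits 4–5 select length (1 = long, 2 = short), bits 2–3 select variability (1 = variable, 2 = fixed), a value of 0 or 3 in either field means "SKU default", and the quantum-unit table is the commonly cited one. `decode_priority_separation` is a hypothetical helper name; validate the results against your target build.

```python
# Quantum units indexed by foreground boost (0, 1, 2); 3 units = 1 clock tick.
QUANTUM_UNITS = {
    ("short", "variable"): (6, 12, 18),
    ("long", "variable"): (12, 24, 36),
    ("short", "fixed"): (18, 18, 18),
    ("long", "fixed"): (36, 36, 36),
}


def decode_priority_separation(value, server=False):
    """Split Win32PrioritySeparation into its three 2-bit fields."""
    boost = value & 0x3
    variability = (value >> 2) & 0x3
    length = (value >> 4) & 0x3
    # 0 or 3 in a field falls back to the SKU default (assumption, see lead-in).
    length_name = {1: "long", 2: "short"}.get(length, "long" if server else "short")
    var_name = {1: "variable", 2: "fixed"}.get(variability, "fixed" if server else "variable")
    boost = min(boost, 2)   # clamped so the table lookup stays in range
    table = QUANTUM_UNITS[(length_name, var_name)]
    return {
        "length": length_name,
        "variability": var_name,
        "foreground_boost": boost,
        "background_units": table[0],
        "foreground_units": table[boost],
    }


if __name__ == "__main__":
    print(decode_priority_separation(0x26))   # short, variable, boost 2
    print(decode_priority_separation(0x18))   # long, fixed, boost 0
```

0x26 and 0x18 are the values usually associated with the "Programs" and "Background services" options in the Performance Options dialog, which makes them convenient baselines when auditing this registry value for tampering.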

Internal scheduler routines you will see in symbols:

Function	Purpose
`KiQuantumEnd`	Invoked at clock interrupt when quantum expires
`KiSelectNextThread`	Selects next Ready thread for the current CPU
`KiDeferredReadyThread`	Places a thread in DeferredReady before final dispatch
`KxQueueReadyThread`	Inserts a thread into the per-CPU ready queue
`KiReadyThread`	Transitions a thread to the Ready state
`KiSwapThread` / `KiSwapContext`	Performs the actual context switch

6. Thread Selection: The Dispatch Path

A typical preemption follows this path:

Clock interrupt fires on the local CPU.
KiQuantumEnd() checks whether the thread has exhausted its quantum (tracked against KTHREAD.QuantumTarget on current builds); if so, the thread is moved out of Running.
KiSelectNextThread() consults KPRCB.ReadySummary to find the highest non-empty queue.
The chosen thread is removed from ReadyListHead[idx] and routed through KiDeferredReadyThread().
KxQueueReadyThread() places the thread whose quantum expired at the tail of ReadyListHead[oldPrio] so round-robin holds within its level.
KiSwapThread() → KiSwapContext() saves outgoing register state, loads the incoming thread’s stack and registers, and returns into the new thread.

If a wake event makes a higher-priority thread Ready while another thread is Running, the dispatcher instead writes the candidate into KPRCB.NextThread, raises an IPI on the target CPU, and the preemption fires on return-from-interrupt — without waiting for quantum expiry.

Flow diagram showing the Windows thread dispatch sequence from clock interrupt through KiQuantumEnd, KiSelectNextThread scanning ReadySummary, dequeueing from ReadyListHead, and KiSwapContext to the new running thread — The O(1) dispatch path uses a highest-set-bit scan on KPRCB.ReadySummary to find the next ready thread without iterating all 32 queues.

7. Priority Boosts and Decay

Dynamic-band threads (1–15) do not stay at their base priority. The kernel temporarily boosts them in response to events and decays the boost as they consume CPU.

Event	Boost
I/O completion (keyboard / mouse)	+6
I/O completion (disk)	+1
I/O completion (network)	+2
Foreground window activation	controlled by `PsPrioritySeparation`
Wait satisfied on executive event	+1
Starvation avoidance (Balance Set Manager)	up to 15 for one quantum
Decay (CPU-bound thread at quantum end)	−1 toward base

The Balance Set Manager (KeBalanceSetManager) periodically scans ready queues and elevates threads that have been Ready but never running for ~4 seconds to priority 15 for a single quantum — preventing indefinite starvation by higher-priority work. Real-time threads (16–31) are never boosted or decayed; their priority is exactly what was set.
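
The boost-then-decay behaviour can be traced with a short model. The Python sketch below assumes the simplified rules described in this section: a boost is added to the thread's base priority, the result is capped at 15 so a dynamic thread never enters the real-time band, real-time threads are left untouched, and each expired quantum removes one level until the base is reached. The function names are hypothetical.

```python
def apply_boost(base, boost):
    """Priority immediately after a boost event."""
    if base >= 16:                 # real-time band: never boosted
        return base
    return min(15, base + boost)   # dynamic band is capped at 15


def decay_trace(base, boost):
    """Priority at each successive quantum end, from the boost back to base."""
    prio = apply_boost(base, boost)
    trace = [prio]
    while prio > base:
        prio -= 1                  # one level of decay per expired quantum
        trace.append(prio)
    return trace


if __name__ == "__main__":
    # Normal-class thread (base 8) woken by keyboard input (+6).
    print(decay_trace(8, 6))
    # High-class thread (base 13): the same boost saturates at 15.
    print(decay_trace(13, 6))
```

For baselining, the useful takeaway is that a dynamic thread sitting above its base for many consecutive quanta is being re-boosted repeatedly, and one that appears at 16 or above got there by an explicit priority change, not by a boost.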

8. Multiprocessor Scheduling, Affinity, and NUMA

Each CPU has its own ready queues, so dispatch decisions are mostly local. To preserve cache and NUMA locality, the scheduler picks an ideal processor per thread and prefers to dispatch on that CPU, falling back to other CPUs in the thread’s affinity mask when the ideal is busy.

// Pin a worker thread to CPU 2, with CPU 2 as its ideal processor.
DWORD_PTR mask = (DWORD_PTR)1 << 2;
SetThreadAffinityMask(hThread, mask);
SetThreadIdealProcessor(hThread, 2);

For >64 logical processors, threads belong to processor groups, set via SetThreadGroupAffinity. Kernel-mode equivalents are KeSetSystemAffinityThread and KeSetIdealProcessorThread. Misconfigured affinity is a real performance and detection hazard — a thread pinned off-node walks remote memory and pollutes another CPU’s cache.

9. Thread States: The Full State Machine

The KTHREAD_STATE enum tracks every transition. The values you will see in KTHREAD.State:

State	Meaning
`Initialized`	Thread structure created, not yet schedulable
`Ready`	Schedulable; sitting on `ReadyListHead[priority]`
`Standby`	Selected as `KPRCB.NextThread`, about to run
`Running`	Currently executing on a CPU
`Waiting`	Blocked on a dispatcher object
`Transition`	Wait satisfied, but kernel stack is paged out
`DeferredReady`	Will be made Ready on a specific CPU
`Terminated`	Final state before structure teardown

A normal cycle looks like Initialized → Ready → Standby → Running → Waiting → Ready …. KPRCB.NextThread is non-NULL exactly while a target CPU has a Standby thread queued.
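
When KTHREAD.State shows up as a raw byte in a `dt` dump or a memory-forensics report, a lookup table saves a trip to the symbols. The Python sketch below uses the numeric values from the public `_KTHREAD_STATE` enum as commonly published; treat them as an assumption and confirm with `dt nt!_KTHREAD_STATE` on your build. `describe_state` and `is_on_or_next_on_cpu` are hypothetical helper names.

```python
from enum import IntEnum


class KThreadState(IntEnum):
    """Numeric values of _KTHREAD_STATE (verify against your target build)."""
    Initialized = 0
    Ready = 1
    Running = 2
    Standby = 3
    Terminated = 4
    Waiting = 5
    Transition = 6
    DeferredReady = 7


def describe_state(raw):
    """Translate a raw State byte into its symbolic name."""
    try:
        return KThreadState(raw).name
    except ValueError:
        return "Unknown(%d)" % raw


def is_on_or_next_on_cpu(raw):
    """True for threads that are executing or queued as a CPU's NextThread."""
    return raw in (KThreadState.Running, KThreadState.Standby)


if __name__ == "__main__":
    for raw in (1, 3, 2, 5):
        print(raw, describe_state(raw), is_on_or_next_on_cpu(raw))
```

Note that the numeric order does not follow the lifecycle order: Running (2) comes before Standby (3), which is an easy source of off-by-one mistakes when reading dumps by eye.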

Graph diagram of the Windows KTHREAD state machine showing transitions between Initialized, Ready, Standby, Running, Waiting, and Terminated states — A thread passes through Standby — held in KPRCB.NextThread — immediately before swapping onto the CPU, making Standby a precise indicator of imminent context switch.

10. Observing the Scheduler with WinDbg and ETW

Live kernel inspection in WinDbg:

0: kd> !pcr                          ; current processor control region
0: kd> !prcb                         ; current processor control block
0: kd> dt nt!_KPRCB CurrentThread NextThread ReadySummary @$prcb
0: kd> dt nt!_KTHREAD Priority QuantumTarget State Preempted @$thread
0: kd> !ready                        ; all ready threads, sorted by priority
0: kd> !thread <addr> 1f             ; full thread state including stack

The per-priority ready-list walk (the field is named DispatcherReadyListHead in public symbols):

0: kd> dx -r1 ((nt!_KPRCB*)@$prcb)->DispatcherReadyListHead
0: kd> dx -r1 Debugger.Utility.Collections.FromListEntry(((nt!_KPRCB*)@$prcb)->DispatcherReadyListHead[15], "nt!_KTHREAD", "WaitListEntry").Select(t => t.Priority)

For live system-wide capture, use ETW:

xperf -on PROC_THREAD+LOADER+CSWITCH+DISPATCHER -stackwalk CSwitch
xperf -d sched.etl

The primary providers carrying scheduler telemetry:

Provider	GUID	Key events
`Microsoft-Windows-Kernel-Process`	`{22FB2CD6-0E7B-422B-A0C7-2FAD1FD0E716}`	Manifest-based process, thread, and image start/stop
`Thread` event class (NT Kernel Logger)	`{3D6FA8D1-FE05-11D0-9DDA-00C04FD7BA7C}`	Thread start/end, `CSwitch` (36), `ReadyThread` (50)
`NT Kernel Logger` session	`{9E814AAD-3204-11D2-9A82-006008A86939}`	Control GUID that enables the `CSWITCH` and `DISPATCHER` groups

A user-mode helper to enumerate per-thread priority without OpenProcess:

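import ctypes, struct
from ctypes import wintypes

ntdll = ctypes.WinDLL("ntdll")
size = wintypes.ULONG(1 << 20)
while True:  # SystemProcessInformation = 5; retry until the snapshot fits
    buf = ctypes.create_string_buffer(size.value)
    status = ntdll.NtQuerySystemInformation(5, buf, size, ctypes.byref(size)) & 0xFFFFFFFF
    if status != 0xC0000004:  # STATUS_INFO_LENGTH_MISMATCH
        break
# Assumed x64 layout: threads start at +0x100; each thread entry is 0x50 bytes (TID +0x30, Priority +0x38, BasePriority +0x3C).
off, next_off = 0, 1
while status == 0 and next_off:
    next_off, n_threads = struct.unpack_from("<II", buf, off)   # NextEntryOffset, NumberOfThreads
    pid = struct.unpack_from("<Q", buf, off + 0x50)[0]          # UniqueProcessId
    for i in range(n_threads):
        tid, prio, base = struct.unpack_from("<Qii", buf, off + 0x100 + i * 0x50 + 0x30)
        print(pid, tid, prio, base)
    off += next_off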

11. Common Attacker Techniques

Scheduler manipulation is rarely a standalone objective — it is a force multiplier for injection, evasion, and defense impairment.

Technique	Description
Thread execution hijacking	`OpenThread` → `SuspendThread` → `VirtualAllocEx` + `WriteProcessMemory` → `SetThreadContext` → `ResumeThread`. Post-resume, attacker controls priority and CPU affinity of the hijacked thread.
Real-time priority abuse	Set malicious thread to `THREAD_PRIORITY_TIME_CRITICAL` under `REALTIME_PRIORITY_CLASS` (priority 31) to dominate the CPU and starve EDR scanners. Requires `SeIncreaseBasePriorityPrivilege`.
EDR/AV starvation	Open handles to defender process threads with `THREAD_SET_INFORMATION` and downgrade them via `SetThreadPriority(THREAD_PRIORITY_IDLE)` to delay real-time detection.
Affinity pinning for evasion	Pin malicious threads to a CPU not covered by an EDR’s per-CPU sampling profiler, or off-NUMA-node, to skew profilers and ETW stack walks.
`Win32PrioritySeparation` tampering	Modify the registry value to alter foreground boost behavior, hurting interactive defensive tooling.
Quantum throttling via Job Objects	Apply `JOBOBJECT_CPU_RATE_CONTROL_INFORMATION` with a hard cap to constrain a defender process’s CPU budget.

Conceptual illustration of attacker thread priority manipulation showing a high-priority red thread overwhelming lower-priority blue threads on a CPU grid — Elevating a malicious thread to real-time priority can starve EDR sensor threads, delay telemetry, and create execution windows for in-memory payloads.

12. Defensive Strategies & Detection

Scheduler-level abuse is observable through ETW context-switch streams, sensitive-privilege auditing, registry auditing, and process-access telemetry. Sysmon alone is insufficient — pair it with kernel ETW.

Sysmon Event ID	Name	Relevance
`1`	Process Create	Captures image, command line, integrity level, and parent lineage; the priority class itself is not recorded
`8`	CreateRemoteThread	Cross-process thread creation; often precedes priority manipulation
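`10`	ProcessAccess	Records `OpenProcess` calls with their `GrantedAccess` mask; thread-handle opens (`OpenThread` with `THREAD_SET_INFORMATION`) are not logged here, so pair with Event `4656` or EDR object callbacks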
`13`	RegistryValueSet	Modification of `Win32PrioritySeparation` and other PriorityControl values

Critical Windows audit events:

4673 — Sensitive Privilege Use. Catches SeIncreaseBasePriorityPrivilege invocation, required for real-time priority.
4656 / 4663 — Handle/Object Access. Catches handles opened to thread objects with THREAD_SET_INFORMATION.
4657 — Registry value modified. Catches Win32PrioritySeparation changes.
4688 — Process creation (with command-line auditing enabled).

Conceptual Sigma rule for unexpected real-time priority use:

title: Sensitive Privilege Use - SeIncreaseBasePriorityPrivilege from Non-System Process
logsource:
  product: windows
  service: security
detection:
  selection:
    EventID: 4673
    PrivilegeList|contains: 'SeIncreaseBasePriorityPrivilege'
  filter_system:
    SubjectUserSid:
      - 'S-1-5-18'   # LocalSystem
      - 'S-1-5-19'   # LocalService
      - 'S-1-5-20'   # NetworkService
  condition: selection and not filter_system
level: high
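
To sanity-check the rule's logic before deploying it, the selection and filter can be replayed over sample events in Python. This is a sketch of the matching semantics only, not a Sigma engine: `matches_rule` is a hypothetical function, events are plain dictionaries using the field names from the rule, and the user SID in the example is made up.

```python
SYSTEM_SIDS = {"S-1-5-18", "S-1-5-19", "S-1-5-20"}


def matches_rule(event):
    """Mirror 'selection and not filter_system' from the rule above."""
    selection = (
        event.get("EventID") == 4673
        and "SeIncreaseBasePriorityPrivilege" in event.get("PrivilegeList", "")
    )
    filter_system = event.get("SubjectUserSid") in SYSTEM_SIDS
    return selection and not filter_system


if __name__ == "__main__":
    events = [
        # Interactive user invoking the privilege: should alert.
        {"EventID": 4673, "PrivilegeList": "SeIncreaseBasePriorityPrivilege",
         "SubjectUserSid": "S-1-5-21-1111111111-2222222222-3333333333-1001"},
        # LocalSystem doing the same: filtered out.
        {"EventID": 4673, "PrivilegeList": "SeIncreaseBasePriorityPrivilege",
         "SubjectUserSid": "S-1-5-18"},
        # Different privilege: not selected.
        {"EventID": 4673, "PrivilegeList": "SeTcbPrivilege",
         "SubjectUserSid": "S-1-5-21-1111111111-2222222222-3333333333-1001"},
    ]
    print([matches_rule(e) for e in events])   # [True, False, False]
```

Replaying a day of exported 4673 events through a function like this is a quick way to estimate alert volume and decide which additional accounts belong in the filter.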

Hardening checklist:

Restrict SeIncreaseBasePriorityPrivilege via Group Policy → User Rights Assignment to only the accounts that require it.
Audit Win32PrioritySeparation with Sysmon Event ID 13 or registry SACL → Event ID 4657.
Baseline CSwitch priority distributions via ETW; alert on sustained user-mode threads scheduled at priority ≥ 16 outside an allowlist.
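Deploy EDR that registers PsSetCreateThreadNotifyRoutine to observe thread creation and ObRegisterCallbacks to strip dangerous rights (such as THREAD_SET_INFORMATION) from handles opened to protected threads. Neither callback reports priority changes directly, so pair them with ETW.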
Enclose untrusted code in Job Objects with JobObjectCpuRateControlInformation and basic UI restrictions to prevent it from starving other processes.
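
The priority-baseline item in the checklist above can be prototyped offline once context-switch records have been exported from a trace. The Python sketch below assumes each record has already been reduced to a dictionary with hypothetical `process` and `new_priority` keys (real CSwitch events need a parsing step that is not shown), and the process names in the example are invented.

```python
from collections import Counter


def realtime_outliers(cswitch_records, allowlist, min_hits=3):
    """Processes repeatedly switched in at priority >= 16 and not allowlisted.

    min_hits is an arbitrary threshold standing in for "sustained".
    """
    allowed = {name.lower() for name in allowlist}
    hits = Counter(
        rec["process"].lower()
        for rec in cswitch_records
        if rec["new_priority"] >= 16 and rec["process"].lower() not in allowed
    )
    return sorted(name for name, count in hits.items() if count >= min_hits)


if __name__ == "__main__":
    records = (
        [{"process": "trusted_rt.exe", "new_priority": 24}] * 5
        + [{"process": "updater.exe", "new_priority": 31}] * 4
        + [{"process": "notepad.exe", "new_priority": 10}] * 9
        + [{"process": "burst.exe", "new_priority": 16}]      # single hit: ignored
    )
    print(realtime_outliers(records, allowlist={"trusted_rt.exe"}))
```

Requiring several hits filters out one-off switches, and keeping the allowlist explicit forces a decision about every binary that legitimately runs in the real-time band.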

13. Tools for Scheduler Analysis

Tool	Description	Link
WinDbg (kernel)	`!ready`, `!thread`, `!pcr`, `!prcb`, `dt nt!_KTHREAD/_KPRCB` for live scheduler inspection	learn.microsoft.com
Windows Performance Recorder / xperf	Captures `CSwitch`, `ReadyThread`, `DISPATCHER` ETW events with stack walks	learn.microsoft.com
Windows Performance Analyzer	Visualizes CPU usage, context switches, and per-thread priority timelines	learn.microsoft.com
Process Hacker / System Informer	Live per-thread state, base priority, dynamic priority, ideal CPU, affinity mask	systeminformer.sourceforge.io
Process Explorer	Per-thread CPU, priority class, kernel/user stacks	sysinternals.com
Process Monitor	Captures `Process Create`, registry writes (`Win32PrioritySeparation`)	sysinternals.com
Sysmon	Events `1`, `8`, `10`, `13` for thread creation, cross-process access, registry edits	sysinternals.com
Volatility 3	Offline thread enumeration (`windows.threads`) and priority analysis from memory dumps	volatilityfoundation.org

14. MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Process Injection	`T1055`	Sysmon `10` (ProcessAccess), ETW thread create with foreign-process parentage
Thread Execution Hijacking	`T1055.003`	Sysmon `10` with process rights that permit memory writes (`PROCESS_VM_OPERATION`, `PROCESS_VM_WRITE`); thread-handle opens with `THREAD_SET_CONTEXT` / `THREAD_SET_INFORMATION` via Event `4656` or EDR object callbacks; `SuspendThread`/`ResumeThread` pairs in EDR telemetry
Scheduled Task / Job	`T1053`	Audit `4698` for task creation. T1053 covers Task Scheduler abuse, not kernel thread scheduling or Job Objects
Impair Defenses: Disable or Modify Tools	`T1562.001`	Sysmon `10` against AV/EDR processes such as `MsMpEng.exe`; priority drops via kernel thread ETW events; Job Object CPU-rate limits applied to defensive processes

Note: ATT&CK does not currently track “Thread Priority Manipulation” as a standalone technique. Treat priority abuse as a sub-mechanism of T1055.003 and T1562.001.

15. Summary

Windows uses a preemptive, priority-based thread scheduler with 32 levels and per-CPU ready queues: priority always wins, and quantum only rotates equal-priority threads.
The dispatcher uses KPRCB.ReadySummary plus ReadyListHead[32] to pick the next thread in O(1) via highest-set-bit scan.
Quantum is counted in units of 3 per clock tick and enforced as a cycle target in KTHREAD.QuantumTarget, with the foreground quantum boost governed by Win32PrioritySeparation / PsPrioritySeparation.
Dynamic threads (1–15) are subject to I/O, foreground, and starvation boosts plus decay; real-time threads (16–31) are not.
Attackers abuse the scheduler via thread hijacking, real-time priority escalation, EDR starvation, and affinity pinning — detect via ETW CSwitch, Sysmon 8/10/13, and Event ID 4673 for SeIncreaseBasePriorityPrivilege.

APCs: Asynchronous Procedure Calls and Thread Hijacking Surface

Objective: Understand the Windows Asynchronous Procedure Call mechanism from the kernel up — the KAPC / KAPC_STATE structures, the dispatch path through KiInsertQueueApc and KiDeliverApc, the alertable-wait requirement, and the three abuse variants (classic, early-bird, special user APC) used for thread hijacking and process injection — and detect them with Sysmon, ETW-TI, and audit policy.

1. APC Fundamentals — What the OS Actually Uses APCs For

An Asynchronous Procedure Call is a function that executes asynchronously in the context of a specific thread. When the kernel queues an APC, it raises a software interrupt and arranges for the routine to run the next time that thread is dispatched. Every thread has its own APC queue — APCs are inherently thread-targeted, which is exactly why offensive tooling loves them.

The OS itself relies on APCs for normal work:

I/O completion: ReadFileEx, WriteFileEx, and SetWaitableTimer deliver their completion callback via a user-mode APC queued back to the issuing thread.
File-system filter callbacks: normal kernel APCs are widely used by file systems and minifilters.
Wait abortion: queuing a user APC against a thread in an alertable wait satisfies the wait with STATUS_USER_APC.

Understanding APCs means understanding three things in sequence: who can queue them, when they fire, and what the thread looks like at the moment they fire.

2. The Three Flavours of APCs

APCs differ by IRQL and by who is allowed to queue them. The kernel maintains distinct semantics for each.

Type	IRQL	Notes
Special Kernel APC	`APC_LEVEL`	Runs in kernel mode at IRQL `APC_LEVEL`; preempts user-mode code and kernel-mode code executing at `PASSIVE_LEVEL`. Used by the OS for operations such as I/O request completion.
Normal Kernel APC	`PASSIVE_LEVEL`	Runs in kernel mode at `PASSIVE_LEVEL`; preempts all user-mode code, including user APCs. Generally used by file systems and file-system filter drivers.
User-mode APC	`PASSIVE_LEVEL`	Generated by an application. The target thread must be in an alertable state for a user-mode APC to run.

Unlike deferred procedure calls (DPCs), which run in arbitrary thread context, an APC always executes inside a specific thread’s context — that property is what makes APCs both useful for I/O completion and dangerous as an injection primitive.

Hierarchy diagram showing the three APC types: Kernel-Mode, User-Mode, and Special User APC, with their respective queuing APIs and alertable-wait requirements — The three APC flavours differ by privilege level, delivery trigger, and the Win32/native APIs used to queue them.

3. Kernel Structures: `KAPC`, `KAPC_STATE`, `KTHREAD`

A queued APC is represented in the kernel by a KAPC object. The thread tracks its pending APCs via a KAPC_STATE embedded in KTHREAD.

// Conceptual layout — field names are illustrative; confirm against the
// target Windows build with `dt nt!_KAPC` / `dt nt!_KAPC_STATE` in WinDbg.

typedef struct _KAPC {
    UCHAR              Type;
    UCHAR              SpareByte0;
    UCHAR              Size;
    UCHAR              SpareByte1;
    ULONG              SpareLong0;
    struct _KTHREAD   *Thread;
    LIST_ENTRY         ApcListEntry;
    PKKERNEL_ROUTINE   KernelRoutine;
    PKRUNDOWN_ROUTINE  RundownRoutine;
    PKNORMAL_ROUTINE   NormalRoutine;
    PVOID              NormalContext;
    PVOID              SystemArgument1;
    PVOID              SystemArgument2;
    CCHAR              ApcStateIndex;
    KPROCESSOR_MODE    ApcMode;
    BOOLEAN            Inserted;
} KAPC, *PKAPC;

typedef struct _KAPC_STATE {
    LIST_ENTRY         ApcListHead[2];   // [0] = kernel APCs, [1] = user APCs
    struct _KPROCESS  *Process;
    BOOLEAN            KernelApcInProgress;
    BOOLEAN            KernelApcPending;
    BOOLEAN            UserApcPending;
    // SpecialUserApcPending was added later for RS5+ Special User APCs.
} KAPC_STATE, *PKAPC_STATE;

Key fields the dispatcher and attackers both care about:

KAPC.NormalRoutine — the function the thread will eventually execute.
KAPC.NormalContext, SystemArgument1, SystemArgument2 — arguments passed to NormalRoutine.
KAPC.ApcMode — KernelMode vs UserMode, controls which queue and which delivery path.
KAPC_STATE.ApcListHead[2] — two doubly-linked lists; index 0 holds kernel-mode APCs, index 1 holds user-mode APCs.
KAPC_STATE.UserApcPending — set to TRUE when a user APC is queued and the thread is in an alertable wait; this is the signal that breaks the wait with STATUS_USER_APC.

4. The Alertable Wait Requirement

A user-mode APC does not fire whenever the kernel wants — it fires only when the target thread is willing to be interrupted. A thread enters an alertable state by calling one of:

SleepEx()
SignalObjectAndWait()
MsgWaitForMultipleObjectsEx()
WaitForMultipleObjectsEx()
WaitForSingleObjectEx()

with the bAlertable parameter set to TRUE (MsgWaitForMultipleObjectsEx has no bAlertable parameter and instead takes the MWMO_ALERTABLE flag in dwFlags). Additionally, ReadFileEx, WriteFileEx, and SetWaitableTimer use APCs as their completion-notification mechanism, so threads driving overlapped I/O routinely sit in alertable waits.

This alertable-state requirement is the single most important property to understand offensively and defensively:

Offensively, it dictates target selection. Long-lived service threads in svchost.exe or explorer.exe that pump I/O are reliable targets; threads that never enter an alertable wait will never run a queued user APC.
Defensively, it explains why the classic injection works against some processes and not others — and why attackers eventually moved to Special User APCs to remove the dependency entirely (§9).

5. Win32 → Native → Kernel Call Chain

Queuing a user APC traverses three layers.

API / Symbol	Layer	Description
`QueueUserAPC`	Win32 (`kernel32.dll`)	Queues a user-mode APC to a target thread.
`NtQueueApcThread`	NT native (`ntdll.dll`)	Syscall used internally by `QueueUserAPC` to deliver the APC.
`NtQueueApcThreadEx`	NT native	Extended form; RS5 introduced Special User APCs queued by passing `1` as the reserve handle.
`NtQueueApcThreadEx2`	NT native	Newer variant exposing both `UserApcFlags` and `MemoryReserveHandle`.
`QueueUserAPC2`	`kernelbase.dll`	Wrapper that exposes Special User APCs to user code.
`KeInsertQueueApc`	Kernel	Attaches the initialized `KAPC` to the target thread’s queue.
`KiDeliverApc`	Kernel	Dispatches pending APCs at the kernel→user transition.
`ntdll!RtlDispatchAPC`	ntdll	Trampoline in user mode that calls the caller-supplied `APCProc`.

An important internal detail: when you call QueueUserAPC(pfn, hThread, dwData), the routine actually passed down to NtQueueApcThread is not your pfn. It is ntdll!RtlDispatchAPC, and your pfn travels as one of its arguments. This is why call-stack-aware EDRs frequently see RtlDispatchAPC as the immediate caller of the suspicious user-mode routine.

The dispatch sequence for a user-mode APC:

Caller obtains a thread handle with THREAD_SET_CONTEXT access.
QueueUserAPC → NtQueueApcThread → kernel enters KiInsertQueueApc.
KiInsertQueueApc checks whether the target is in an alertable wait with WaitMode == UserMode. If yes, it sets UserApcPending = TRUE and completes the wait with STATUS_USER_APC.
On the kernel→user transition, KiDeliverApc redirects execution to ntdll!RtlDispatchAPC, which invokes the original APCProc.
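The queue-then-deliver logic in the steps above can be sketched as a toy model in portable C. This is not kernel code: `ToyThread`, `toy_queue_user_apc`, `toy_enter_alertable_wait`, `toy_return_to_user`, and `toy_count_apc` are hypothetical names, and the model tracks only the pending queue, the alertable flag, and a stand-in for `UserApcPending`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical types, not the real KTHREAD / KAPC_STATE. */
typedef void (*ToyApcRoutine)(void *ctx);

#define TOY_MAX_APCS 8

typedef struct ToyThread {
    ToyApcRoutine queue[TOY_MAX_APCS]; /* stands in for ApcListHead[1] */
    void         *ctx[TOY_MAX_APCS];
    size_t        count;
    bool          alertable_user_wait; /* alertable wait with WaitMode == UserMode */
    bool          user_apc_pending;    /* stands in for KAPC_STATE.UserApcPending */
} ToyThread;

/* Step 3: the APC is always queued, but the wait is only broken (and the
 * pending flag set) if the target sits in an alertable user-mode wait. */
bool toy_queue_user_apc(ToyThread *t, ToyApcRoutine fn, void *ctx) {
    if (t->count == TOY_MAX_APCS) return false;
    t->queue[t->count] = fn;
    t->ctx[t->count] = ctx;
    t->count++;
    if (t->alertable_user_wait) {
        t->user_apc_pending = true;
        t->alertable_user_wait = false; /* wait ends with STATUS_USER_APC */
    }
    return true;
}

/* The thread later calls an alertable wait: already-queued APCs become
 * deliverable immediately, otherwise the thread starts waiting. */
void toy_enter_alertable_wait(ToyThread *t) {
    if (t->count > 0) t->user_apc_pending = true;
    else t->alertable_user_wait = true;
}

/* Step 4: on return to user mode, run queued routines in FIFO order if the
 * pending flag is set. Returns how many routines ran. */
size_t toy_return_to_user(ToyThread *t) {
    size_t ran = 0;
    if (!t->user_apc_pending) return 0;
    for (size_t i = 0; i < t->count; i++) { t->queue[i](t->ctx[i]); ran++; }
    t->count = 0;
    t->user_apc_pending = false;
    return ran;
}

/* Sample routine for experiments: increments the int that ctx points to. */
void toy_count_apc(void *ctx) { ++*(int *)ctx; }
```

Queuing against a thread that never calls `toy_enter_alertable_wait` leaves the routine parked indefinitely, which is the behaviour that makes target-thread selection matter for classic injection.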

Flow diagram of the APC dispatch chain from QueueUserAPC through NtQueueApcThread, KiInsertQueueApc, KiDeliverApc, RtlDispatchAPC, to the final APCProc callback — Every layer of the APC dispatch chain is observable; EDRs see RtlDispatchAPC as the immediate caller of the injected routine.

6. Inspecting APC State in WinDbg

Read-only kernel introspection lets defenders and learners watch the structures the dispatcher mutates.

0: kd> !process 0 0 lsass.exe
0: kd> .process /r /p <EPROCESS>
0: kd> !thread <ETHREAD>

0: kd> dt nt!_KTHREAD <addr> ApcState
0: kd> dt nt!_KAPC_STATE <addr+offset>
   +0x000 ApcListHead       : [2] _LIST_ENTRY
   +0x020 Process           : Ptr64 _KPROCESS
   +0x028 KernelApcInProgress : UChar
   +0x029 KernelApcPending  : UChar
   +0x02a UserApcPending    : UChar

0: kd> !list "-t nt!_KAPC.ApcListEntry.Flink -e -x \"dt nt!_KAPC @$extret\" <ApcListHead[1]>"

Walking ApcListHead[1] for any thread reveals every pending user APC — its NormalRoutine, NormalContext, and ApcMode. On a healthy thread you typically see nothing; finding NormalRoutine pointing into a private RX region inside a system process is a classic incident-response artifact.
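What the list walk does can be sketched in portable C. The link field is embedded inside each `KAPC` rather than sitting at offset 0, so every list pointer has to be converted back to its containing record. The types below (`ToyListEntry`, `ToyKapc`) and the helper names are hypothetical cut-down stand-ins, not the real kernel definitions.

```c
#include <stddef.h>

/* Minimal stand-ins for LIST_ENTRY and a cut-down KAPC (hypothetical). */
typedef struct ToyListEntry {
    struct ToyListEntry *Flink;
    struct ToyListEntry *Blink;
} ToyListEntry;

typedef struct ToyKapc {
    void        *Thread;
    ToyListEntry ApcListEntry;   /* embedded link, not at offset 0 */
    void        *NormalRoutine;
} ToyKapc;

/* Same idea as the WDK's CONTAINING_RECORD macro. */
#define TOY_CONTAINING_RECORD(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

/* An empty circular list points at itself. */
void toy_init_list(ToyListEntry *head) { head->Flink = head->Blink = head; }

void toy_insert_tail(ToyListEntry *head, ToyListEntry *e) {
    e->Flink = head;
    e->Blink = head->Blink;
    head->Blink->Flink = e;
    head->Blink = e;
}

/* Walk the list and copy up to max NormalRoutine pointers into out.
 * Returns the total number of entries found. */
size_t toy_collect_routines(const ToyListEntry *head, void **out, size_t max) {
    size_t n = 0;
    for (const ToyListEntry *e = head->Flink; e != head; e = e->Flink) {
        const ToyKapc *apc = TOY_CONTAINING_RECORD(e, ToyKapc, ApcListEntry);
        if (n < max) out[n] = apc->NormalRoutine;
        n++;
    }
    return n;
}
```

A triage script would then compare each collected routine address against the loaded-module ranges of the process; that lookup is outside this sketch.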

7. Classic APC Injection

The textbook variant. Every API call below is observable; the technique relies entirely on existing, documented APIs.

// Educational illustration of the API call chain only.
// No payload is included; `payload` is a placeholder used by defenders to
// recognize the pattern. Authorized testing only.

#include <windows.h>
#include <tlhelp32.h>

BOOL InjectViaAPC(DWORD pid, DWORD tid, const BYTE *payload, SIZE_T cb) {
    HANDLE hProc = OpenProcess(
        PROCESS_VM_OPERATION | PROCESS_VM_WRITE | PROCESS_QUERY_INFORMATION,
        FALSE, pid);
    if (!hProc) return FALSE;

    HANDLE hThread = OpenThread(THREAD_SET_CONTEXT, FALSE, tid);
    if (!hThread) { CloseHandle(hProc); return FALSE; }

    LPVOID remote = VirtualAllocEx(hProc, NULL, cb,
                                   MEM_COMMIT | MEM_RESERVE,
                                   PAGE_EXECUTE_READWRITE);
    WriteProcessMemory(hProc, remote, payload, cb, NULL);

    // QueueUserAPC schedules execution; it fires only when the target
    // thread enters an alertable wait.
    QueueUserAPC((PAPCFUNC)remote, hThread, 0);

    CloseHandle(hThread);
    CloseHandle(hProc);
    return TRUE;
}

Trigger conditions:

The target thread (tid) must enter an alertable wait. In long-lived service hosts this happens routinely.
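The handle to the thread must carry THREAD_SET_CONTEXT (0x0010). Sysmon EID 10 records process-handle opens rather than thread-handle opens, so the companion signal there is a GrantedAccess mask covering PROCESS_VM_OPERATION and PROCESS_VM_WRITE against a high-value target image; the thread right itself surfaces in handle auditing and EDR telemetry (§12).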

Notably, no new thread is created in the victim process — CreateRemoteThread is not called. This is exactly why APC injection evades Sysmon EID 8.

8. Early-Bird APC Injection

Classic injection has one weakness: you cannot predict when the victim thread will next become alertable. Early-bird removes the guesswork by injecting into a process you create yourself in a suspended state, then queuing the APC against the main thread before it has executed a single instruction.

// Educational pseudocode — illustrates API sequence, not payload.

STARTUPINFOA si = { sizeof(si) };
PROCESS_INFORMATION pi = { 0 };

CreateProcessA(NULL, "C:\\Windows\\System32\\notepad.exe", NULL, NULL,
               FALSE, CREATE_SUSPENDED, NULL, NULL, &si, &pi);

LPVOID remote = VirtualAllocEx(pi.hProcess, NULL, cb,
                               MEM_COMMIT | MEM_RESERVE,
                               PAGE_EXECUTE_READWRITE);
WriteProcessMemory(pi.hProcess, remote, payload, cb, NULL);

QueueUserAPC((PAPCFUNC)remote, pi.hThread, 0);

// Thread services its APC queue as part of initialization, *before*
// running the original entry point.
ResumeThread(pi.hThread);

Why it works: a newly created thread begins user-mode execution in ntdll!LdrInitializeThunk, which completes loader initialization and then calls NtTestAlert, draining any pending user APCs. An APC queued before ResumeThread is therefore delivered in that early window, before the legitimate entry point runs.

This variant straddles two ATT&CK sub-techniques: it is APC injection (T1055.004) but it also resembles Thread Execution Hijacking (T1055.003) because the suspended-thread-then-redirect pattern is structurally the same primitive.

Flow diagram of the Early-Bird APC injection sequence showing CreateProcess in suspended state, memory staging, APC queuing, ResumeThread, and payload execution before the legitimate entry point. Caption: Early-Bird queues the APC before the main thread has executed a single instruction, relying on the APC drain that happens at the end of LdrInitializeThunk.

9. Special User APCs (RS5+): Bypassing the Alertable Requirement

Starting with Windows 10 RS5, the kernel introduced Special User APCs. The key behavioural change: these APCs are delivered with Mode == KernelMode to force a thread signal. The thread is interrupted mid-execution to run the special APC — the alertable-state requirement is gone.

They are queued via NtQueueApcThreadEx (passing 1 as the reserve handle) or through NtQueueApcThreadEx2, which exposes a flags field. kernelbase!QueueUserAPC2 is the documented Win32 wrapper.

// Conceptual signatures — confirm flag values and syscall semantics
// against the target SDK / Windows build before relying on them.

typedef NTSTATUS (NTAPI *pNtQueueApcThreadEx2)(
    HANDLE         ThreadHandle,
    HANDLE         UserApcReserveHandle,   // optional reserve object
    ULONG          ApcFlags,               // e.g. QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC
    PVOID          ApcRoutine,
    PVOID          SystemArgument1,
    PVOID          SystemArgument2,
    PVOID          SystemArgument3);

// Pseudocode dispatch — `Special User APC` interrupts a running thread
// without requiring it to be in SleepEx / WaitForSingleObjectEx.
pNtQueueApcThreadEx2 fn = (pNtQueueApcThreadEx2)
    GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "NtQueueApcThreadEx2");

fn(hThread,
   NULL,
   QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC,   // forces in-execution delivery
   remote_routine,
   NULL, NULL, NULL);

Internally the kernel sets SpecialUserApcPending (added to KAPC_STATE for this purpose) and arranges delivery at the next return-to-user-mode opportunity regardless of wait state. This is a meaningful escalation of the primitive — it converts APC injection from “wait until the thread cooperates” to “interrupt the thread now.”

10. Real-World Threat Actor Usage

APC injection is documented at the technique level rather than the family level here; defenders should treat it as a primitive that recurs across many tradecraft variants:

DOUBLEPULSAR used kernel-mode APC injection to redirect user-mode threads from a kernel implant.
Multiple commodity and APT families catalogued under MITRE T1055.004 employ classic user-APC injection against svchost.exe, explorer.exe, and other long-running hosts.
The AtomBombing family of injection variants combines GlobalAddAtom/NtQueueApcThread to stage code through atom tables, then dispatch via APC.
Recent research (Check Point’s Thread Name-Calling) chains thread-name primitives with APC dispatch to evade EDR userland hooks.

11. Common Attacker Techniques

Technique	Description
Classic APC Injection	`OpenProcess` → `OpenThread(THREAD_SET_CONTEXT)` → `VirtualAllocEx` → `WriteProcessMemory` → `QueueUserAPC`. Fires when the target thread next enters an alertable wait.
Early-Bird APC	`CreateProcess(CREATE_SUSPENDED)` → write payload → `QueueUserAPC` → `ResumeThread`. APC fires during loader init, before the entry point.
Special User APC	`NtQueueApcThreadEx` / `NtQueueApcThreadEx2` with `QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC` — interrupts the thread mid-execution; no alertable wait required.
Kernel APC injection from a driver	Malicious driver calls `KeInsertQueueApc` directly against a user thread (DOUBLEPULSAR class). Mitigated by HVCI / driver signing.
Atom-table staged APC (AtomBombing)	Payload bytes shuttled into target via atom tables, then dispatched with `NtQueueApcThread`. Evades naive memory-write detections.
Self-APC for unhooking / staging	Queue an APC to the current thread + `SleepEx(0, TRUE)` to execute code outside hooked call paths.

12. Defensive Strategies & Detection

APC injection is deliberately quiet — it does not create a remote thread and so does not emit Sysmon EID 8. Detection therefore pivots on the handle-acquisition and memory-staging stages, plus dedicated ETW.

12.1 Sysmon

Event ID	Name	Why It Matters Here
EID 10	`ProcessAccess`	Captures the `OpenProcess` step. `GrantedAccess` is a process access mask, so look for `PROCESS_VM_OPERATION` (`0x0008`) together with `PROCESS_VM_WRITE` (`0x0020`) against high-value images. The thread-side right, `THREAD_SET_CONTEXT` (`0x0010`, or `0x0018` when combined with `THREAD_GET_CONTEXT`), is not reported by this event.
EID 8	`CreateRemoteThread`	Will not fire for pure APC injection, but does fire for hybrid variants and is useful as a negative signal.
EID 1	`ProcessCreate`	Does not record creation flags, so `CREATE_SUSPENDED` cannot be read directly. Use it for the parent/child pairs and short process lifetimes typical of Early-Bird.
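Reading these masks by eye is error-prone, because process and thread rights reuse the same bit positions with different meanings. A minimal decoding sketch in portable C follows. The constant values are the ones defined in winnt.h; the helper names (`parse_access_mask`, `process_mask_allows_staging`, `thread_mask_allows_apc`) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Access-right values as defined in winnt.h. */
#define PROCESS_VM_OPERATION 0x0008u
#define PROCESS_VM_READ      0x0010u
#define PROCESS_VM_WRITE     0x0020u
#define THREAD_GET_CONTEXT   0x0008u
#define THREAD_SET_CONTEXT   0x0010u

/* Parse a hex mask string such as "0x1fffff" or "0x28". */
uint32_t parse_access_mask(const char *text) {
    return (uint32_t)strtoul(text, NULL, 16);
}

/* True when a process handle can both allocate/protect and write remote
 * memory: the staging half of the injection chain. */
bool process_mask_allows_staging(uint32_t mask) {
    const uint32_t need = PROCESS_VM_OPERATION | PROCESS_VM_WRITE;
    return (mask & need) == need;
}

/* True when a thread handle carries the right QueueUserAPC requires. */
bool thread_mask_allows_apc(uint32_t mask) {
    return (mask & THREAD_SET_CONTEXT) != 0;
}
```

Note that bit `0x0010` means `PROCESS_VM_READ` on a process handle and `THREAD_SET_CONTEXT` on a thread handle, so a mask only makes sense once you know which object type the event describes.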

12.2 ETW — `Microsoft-Windows-Threat-Intelligence`

The Threat Intelligence ETW provider exposes a dedicated APC-injection sensor:

THREATINT_QUEUEUSERAPC_REMOTE_KERNEL_CALLER: logged by EtwTiLogInsertQueueUserApc / EtwTiLogQueueApcThread, invoked from inside KeInsertQueueApc. Introduced in Windows 10 version 1809 (RS5).

Consumption requires a signed ELAM driver; the provider is reserved for AntiMalware-protected processes. In practice you receive this telemetry through your EDR vendor’s sensor.

12.3 Audit Policy

Enable Object Access → Audit Kernel Object → Security log EIDs 4656 / 4663 on handle requests to objects that carry a SACL. Filter for Object Type = Thread with access masks including THREAD_SET_CONTEXT.
Enable Audit Process Creation → EID 4688 with full command-line logging. Pair with CREATE_SUSPENDED heuristics where parent process behaviour permits inference.

12.4 Sigma Detection (Conceptual)

title: Suspicious Cross-Process Handle Acquisition Consistent With APC Injection
id: 00000000-0000-0000-0000-000000000000
status: experimental
logsource:
  product: windows
  service: sysmon
detection:
  selection_target:
    EventID: 10
    TargetImage|endswith:
      - '\lsass.exe'
      - '\svchost.exe'
      - '\explorer.exe'
      - '\winlogon.exe'
  selection_access:
    # GrantedAccess in EID 10 is a process access mask logged as a single hex string,
    # so one event cannot match two different mask values at once.
    # The exact-match values below are illustrative; extend them for your environment.
    GrantedAccess:
      - '0x28'      # PROCESS_VM_OPERATION | PROCESS_VM_WRITE
      - '0x428'     # mask requested by the §7 example (adds PROCESS_QUERY_INFORMATION)
      - '0x1fffff'  # PROCESS_ALL_ACCESS
  condition: selection_target and selection_access
falsepositives:
  - Endpoint security products and legitimate debuggers
level: high

12.5 Behavioural Heuristics

The fingerprint that hunts well: VirtualAllocEx (RWX) → WriteProcessMemory → NtQueueApcThread issued by the same source process within a short window. Even when individual calls are noisy, the ordering is rare in benign software.
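The ordering check can be sketched as a small correlation routine in portable C. It assumes telemetry has already been normalized into a time-sorted array of records; the `ApiEvent` layout, the `EvKind` values, and `has_apc_injection_sequence` are hypothetical and do not correspond to any particular sensor's schema.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, already-normalized telemetry record; real sensors differ. */
typedef enum { EV_ALLOC_RWX, EV_WRITE_MEMORY, EV_QUEUE_APC, EV_OTHER } EvKind;

typedef struct {
    uint32_t source_pid;
    uint32_t target_pid;
    EvKind   kind;
    uint64_t time_ms;
} ApiEvent;

/* Returns true if the events for one source/target pair contain
 * alloc -> write -> queue-APC in that order, with the whole chain inside
 * window_ms of the allocation. Events are assumed sorted by time_ms. */
bool has_apc_injection_sequence(const ApiEvent *ev, size_t n,
                                uint32_t source_pid, uint32_t target_pid,
                                uint64_t window_ms) {
    for (size_t start = 0; start < n; start++) {
        if (ev[start].source_pid != source_pid ||
            ev[start].target_pid != target_pid ||
            ev[start].kind != EV_ALLOC_RWX) continue;
        EvKind want = EV_WRITE_MEMORY;
        for (size_t i = start + 1; i < n; i++) {
            if (ev[i].time_ms - ev[start].time_ms > window_ms) break;
            if (ev[i].source_pid != source_pid ||
                ev[i].target_pid != target_pid) continue;
            if (ev[i].kind != want) continue;
            if (want == EV_QUEUE_APC) return true;
            want = EV_QUEUE_APC;   /* write seen; now wait for the APC */
        }
    }
    return false;
}
```

Keying on the source/target pair keeps unrelated processes from completing each other's chains, and the window bounds how long a partial sequence stays interesting.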

12.6 PowerShell — Hunt for Suspicious `ProcessAccess` Masks

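# Resolve fields by name: positional Properties[] indices differ between Sysmon schema versions.
Get-WinEvent -LogName 'Microsoft-Windows-Sysmon/Operational' -FilterXPath '*[System[EventID=10]]' |
  ForEach-Object {
      $evt = $_
      $d = @{}
      ([xml]$evt.ToXml()).Event.EventData.Data | ForEach-Object { $d[$_.Name] = $_.'#text' }
      [pscustomobject]@{
          TimeCreated = $evt.TimeCreated
          Source      = $d['SourceImage']
          Target      = $d['TargetImage']
          Access      = $d['GrantedAccess']
      }
  } |
  Where-Object {
      # 0x28 = PROCESS_VM_OPERATION | PROCESS_VM_WRITE
      (([Convert]::ToInt32($_.Access, 16)) -band 0x28) -eq 0x28 -and
      $_.Target -match 'lsass\.exe|svchost\.exe|winlogon\.exe'
  }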

12.7 Hardening

Mitigation	Description
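Protected Process Light (PPL)	LSASS running as PPL (RunAsPPL) denies `THREAD_SET_CONTEXT` thread handles and write-capable process handles to callers that are not protected at an equal or higher level.
Credential Guard	Moves LSASS secrets into a VSM-isolated process (LsaIso.exe), so code injected into lsass.exe by APC no longer reaches them.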
HVCI / Code Integrity	Prevents unsigned kernel drivers from calling `KeInsertQueueApc` against arbitrary threads.
ASR rule `9e6c4e1f-7d60-472f-ba1a-a39ef669e4b2`	Blocks credential theft from LSASS; complements but does not directly block APC injection.
Minimize alertable waits in sensitive code	Avoid `SleepEx(n, TRUE)` and other alertable waits in privileged service threads unless required.
ETW-TI via EDR	Deploy AV/EDR with an ELAM driver to consume `Microsoft-Windows-Threat-Intelligence` events in real time.

Graph diagram mapping four detection controls — Sysmon EID 10, ETW-TI, Audit EID 4656, and behavioural sequencing — plus hardening measures against the APC injection threat — Because APC injection skips CreateRemoteThread, detection pivots to handle-acquisition telemetry and dedicated ETW-TI sensors rather than Sysmon EID 8.

13. Tools for APC Analysis

Tool	Description	Link
WinDbg	Walk `KTHREAD.ApcState`, dump `KAPC` entries via `!list`, inspect `UserApcPending`.	microsoft.com
Process Hacker	Per-thread inspection, including private RX allocations and thread call stacks indicative of injected code.	processhacker.sourceforge.io
Sysmon	EID 10 / 8 / 1 telemetry for the handle-open and process-creation halves of the chain.	sysinternals.com
Sysinternals `handle.exe`	Enumerate handles a suspect process holds (look for foreign `Thread` / `Process` handles).	sysinternals.com
Volatility 3	Memory forensics: walk thread APC queues post-incident; identify injected RX regions.	volatilityfoundation.org
ETW Explorer / SilkETW	Inspect or subscribe to ETW providers (ETW-TI requires signed ELAM).	github.com
x64dbg	User-mode dynamic analysis of `QueueUserAPC` / `RtlDispatchAPC` call chains.	x64dbg.com

14. MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Process Injection	T1055	Behavioural sequence: cross-process handle with VM-write rights followed by APC queuing.
Process Injection: Asynchronous Procedure Call	T1055.004	Sysmon EID 10 with `PROCESS_VM_WRITE` on the target process; handle auditing or EDR telemetry for `THREAD_SET_CONTEXT`; ETW-TI `THREATINT_QUEUEUSERAPC_REMOTE_KERNEL_CALLER`.
Thread Execution Hijacking	T1055.003	Early-Bird variant: `CREATE_SUSPENDED` process + `THREAD_SET_CONTEXT` handle + early-window APC.

T1055.004 is the primary mapping for this tutorial. The Early-Bird variant (§8) overlaps with T1055.003 because the suspended-thread + redirection structure is the same primitive — defenders should detect both.

Summary

APCs are a legitimate kernel facility for thread-targeted asynchronous work, and that property is exactly what makes them a first-class injection primitive.
The dispatch chain is QueueUserAPC → NtQueueApcThread → KiInsertQueueApc → KiDeliverApc → ntdll!RtlDispatchAPC → caller routine; every layer is observable.
User APCs require an alertable wait; Early-Bird sidesteps this via CREATE_SUSPENDED, and Special User APCs (NtQueueApcThreadEx2 + QUEUE_USER_APC_FLAGS_SPECIAL_USER_APC) eliminate the requirement entirely.
APC injection deliberately evades Sysmon EID 8, so detection pivots on EID 10 process handles carrying PROCESS_VM_OPERATION (0x0008) and PROCESS_VM_WRITE (0x0020) against high-value targets, thread handles carrying THREAD_SET_CONTEXT (0x0010), and Microsoft-Windows-Threat-Intelligence ETW (EtwTiLogInsertQueueUserApc).
Map to T1055.004 for classic / special-user APC, and additionally to T1055.003 for the Early-Bird suspended-thread variant; harden with PPL, Credential Guard, HVCI, and ETW-TI-consuming EDR.

DPCs: Deferred Procedure Calls and Interrupt Deferral

Objective: Understand how the Windows kernel uses Deferred Procedure Calls (DPCs) to move work out of high-IRQL interrupt service routines down to DISPATCH_LEVEL, covering the KDPC structure, IRQL mechanics, the full queue-to-callback lifecycle, threaded and timer DPCs, the DPC watchdog, and how defenders detect kernel-mode abuse of the DPC mechanism.

1. The Interrupt Deferral Problem

When a hardware device raises an interrupt, the kernel dispatches to an Interrupt Service Routine (ISR) running at DIRQL — a device IRQL higher than the scheduler itself. At that level the processor cannot wait, cannot touch pageable memory, and blocks all lower-priority interrupts on that CPU. An ISR that lingers degrades the entire system; the guidance is that ISRs should not run longer than 25 microseconds.

Windows therefore uses a two-phase interrupt model. The ISR does the minimum work needed to quiesce the device (acknowledge the interrupt, snapshot status), then schedules a Deferred Procedure Call to perform the heavier processing later, at a gentler IRQL. The DPC executes at DISPATCH_LEVEL, which is still too high for anything that touches pageable memory — but it is low enough to run the bulk of device servicing without starving other interrupts.

The essence of the DPC is deferring execution to gentler circumstances. It is the kernel’s primary tool for keeping ISRs short.

2. IRQL Levels: A Precise Map

The Interrupt Request Level (IRQL) is a per-processor priority that determines what code may run and what it may do. Any routine running at DISPATCH_LEVEL or above is not preemptable, runs to completion, and must reside in non-paged memory.

IRQL Name	Value	Notes
`PASSIVE_LEVEL`	0	Normal user/kernel thread execution; paging and waiting allowed
`APC_LEVEL`	1	Asynchronous Procedure Calls
`DISPATCH_LEVEL`	2	DPC execution, scheduler, spin locks; no paging, no waiting
`DIRQL`	3–11 (device-dependent)	Hardware ISRs run here

An ISR at DIRQL cannot call functions that require PASSIVE_LEVEL. It instead schedules a DPC, which the kernel later runs at DISPATCH_LEVEL. Because DISPATCH_LEVEL still forbids page faults and blocking waits, a DPC routine and all data it touches must be non-paged.

Hierarchy diagram showing Windows IRQL levels from DIRQL at the top down through DISPATCH_LEVEL where DPCs run, APC_LEVEL, and PASSIVE_LEVEL at the bottom, with arrows showing how ISRs queue DPCs that drain at DISPATCH_LEVEL — The IRQL ladder: ISRs fire at DIRQL and defer heavy work via DPCs, which the kernel drains at DISPATCH_LEVEL before returning to lower IRQLs.

3. The KDPC Structure Dissected

The KDPC is the structure in which the kernel keeps the state of a Deferred Procedure Call. Microsoft documents it as an opaque structure and warns drivers not to set members directly, although its definition is visible in the WDK headers. A representative layout (field order and the list-entry type vary by Windows version) is:

typedef struct _KDPC {
    UCHAR                 Type;            // DpcObject or ThreadedDpcObject
    UCHAR                 Importance;      // Low / Medium / High
    USHORT                Number;          // target processor (directed DPCs)
    LIST_ENTRY            DpcListEntry;    // links into per-processor DPC queue
    PKDEFERRED_ROUTINE    DeferredRoutine; // pointer to the callback function
    PVOID                 DeferredContext; // driver-supplied context value
    PVOID                 SystemArgument1; // extra arg passed to callback
    PVOID                 SystemArgument2; // extra arg passed to callback
    __volatile PVOID      DpcData;         // internal; pointer to KDPC_DATA
} KDPC, *PKDPC, *PRKDPC;

Field	Purpose
`Type`	Distinguishes a normal `DpcObject` from a `ThreadedDpcObject`
`Importance`	Controls queue insertion: `MediumImportance` = tail, `HighImportance` = head
`Number`	Target logical processor, set via `KeSetTargetProcessorDpc`
`DeferredRoutine`	Pointer to the `KDEFERRED_ROUTINE` callback
`DeferredContext`	Opaque context the driver receives back in the callback
`SystemArgument1/2`	Caller-supplied arguments passed through to the callback
`DpcData`	Volatile internal pointer to the per-processor `KDPC_DATA`; non-NULL while queued

The DpcData field is the kernel’s bookkeeping anchor: before Windows 8.1 it pointed directly at a KDPC_DATA structure, and its non-NULL state indicates the DPC is currently queued. Because DeferredRoutine is a raw function pointer inside a writable structure, it is also a corruption target — covered in §10.

4. The DPC Lifecycle: From ISR to Callback

A DPC moves through four stages: allocate → initialize → queue → drain.

API Function	Purpose
`KeInitializeDpc`	Initializes a `KDPC`, binding a `DeferredRoutine` and `DeferredContext`
`KeInsertQueueDpc`	Inserts the `KDPC` into the per-processor queue; returns `FALSE` if already queued
`IoRequestDpc`	Convenience wrapper called from ISR context for the `DpcForIsr` pattern
`KeRemoveQueueDpc`	Removes a pending (not-yet-fired) DPC from the queue

Kernel code first allocates a KDPC in non-paged pool (or the device extension) so the object is resident when referenced from the ISR.

// C1 — allocate and initialize a DPC object
PKDPC pDpc = ExAllocatePool2(POOL_FLAG_NON_PAGED, sizeof(KDPC), 'cpDD');
if (pDpc) {
    KeInitializeDpc(pDpc, MyCustomDpc, DeviceContext);  // routine + context
}

The callback must match the KDEFERRED_ROUTINE signature and runs at DISPATCH_LEVEL:

// C2 — DPC callback stub
VOID MyCustomDpc(
    _In_     PKDPC Dpc,
    _In_opt_ PVOID DeferredContext,
    _In_opt_ PVOID SystemArgument1,
    _In_opt_ PVOID SystemArgument2)
{
    UNREFERENCED_PARAMETER(Dpc);
    ASSERT(KeGetCurrentIrql() == DISPATCH_LEVEL);   // invariant
    // Non-paged, bounded work only — no waits, no page faults.
}

The ISR queues the DPC. The return value of KeInsertQueueDpc enforces the single-instantiation guarantee: only one instance of a given KDPC can be queued at a time, so queuing it twice before it fires runs the routine once.

// C3 — queue from a mock ISR
BOOLEAN queued = KeInsertQueueDpc(pDpc, Arg1, Arg2);
if (!queued) {
    // Already pending on a queue — the earlier request still stands.
}

Device drivers commonly use the wrapper from inside their InterruptService routine:

// C4 — DpcForIsr pattern
BOOLEAN MyIsr(_In_ PKINTERRUPT Interrupt, _In_ PVOID Context) {
    PDEVICE_OBJECT devObj = (PDEVICE_OBJECT)Context;
    // ...acknowledge hardware quickly...
    IoRequestDpc(devObj, devObj->CurrentIrp, NULL);  // schedules DpcForIsr
    return TRUE;
}

When the ISR returns and IRQL is about to drop below DISPATCH_LEVEL, the processor services the pending DPC software interrupt: the kernel drains the queue at DISPATCH_LEVEL by invoking each DeferredRoutine, then lets IRQL fall back to its previous level.
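The queue and drain semantics described in this section can be modelled in a few lines of portable C. This is a toy sketch, not kernel code: it has no IRQL and no locking, the names (`ToyDpc`, `ToyDpcQueue`, `toy_insert_queue_dpc`, `toy_drain`, `toy_stamp`) are hypothetical, and the `queued` flag stands in for a non-NULL `DpcData`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, no IRQL, no locking. */
typedef struct ToyDpc ToyDpc;
typedef void (*ToyDeferredRoutine)(ToyDpc *dpc, void *ctx);

struct ToyDpc {
    ToyDeferredRoutine routine;
    void   *context;
    bool    high_importance;
    bool    queued;     /* plays the role of a non-NULL DpcData */
    ToyDpc *next;
};

typedef struct { ToyDpc *head, *tail; size_t depth; } ToyDpcQueue;

/* Like KeInsertQueueDpc: refuses a DPC that is already queued.
 * High-importance entries go to the head, everything else to the tail. */
bool toy_insert_queue_dpc(ToyDpcQueue *q, ToyDpc *d) {
    if (d->queued) return false;
    d->queued = true;
    d->next = NULL;
    if (q->head == NULL) { q->head = q->tail = d; }
    else if (d->high_importance) { d->next = q->head; q->head = d; }
    else { q->tail->next = d; q->tail = d; }
    q->depth++;
    return true;
}

/* Drain: dequeue each entry, clear its queued state, then call it.
 * Returns the number of routines run. */
size_t toy_drain(ToyDpcQueue *q) {
    size_t ran = 0;
    while (q->head != NULL) {
        ToyDpc *d = q->head;
        q->head = d->next;
        if (q->head == NULL) q->tail = NULL;
        q->depth--;
        d->queued = false;
        d->routine(d, d->context);
        ran++;
    }
    return ran;
}

/* Sample routine: stamps the int that ctx points to with a run-order number. */
static int toy_seq;
void toy_stamp(ToyDpc *dpc, void *ctx) { (void)dpc; *(int *)ctx = ++toy_seq; }
```

In this model, clearing `queued` before the call means the same object can be queued again from inside its own routine, and a second insert while it is still pending is rejected rather than producing a second callback.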

Flow diagram showing the four-stage DPC lifecycle: allocate KDPC in non-paged pool, initialize with KeInitializeDpc, ISR fires and calls KeInsertQueueDpc, then CPU drains the per-processor queue and executes the DeferredRoutine at DISPATCH_LEVEL — A DPC travels through four stages — allocate, initialize, queue, drain — with the single-instantiation guarantee ensuring each KDPC object fires at most once per queue cycle.

5. Per-Processor DPC Queues and KPRCB

Each logical processor owns a separate DPC queue, stored as a KDPC_DATA structure inside the processor’s KPRCB (Kernel Processor Control Block). This avoids cross-CPU locking on the common path.

KDPC_DATA carries the queue head, depth, count, and a spin lock:

typedef struct _KDPC_DATA {
    LIST_ENTRY DpcListHead;   // queued KDPC objects
    ULONG      DpcLock;       // spin lock protecting the list
    volatile ULONG DpcQueueDepth;  // pending DPCs
    ULONG      DpcCount;      // running total
} KDPC_DATA, *PKDPC_DATA;

Exact KDPC_DATA field names vary by kernel build — confirm against a live PDB with dt nt!_KDPC_DATA before relying on offsets.

Because each queue is per-processor, the target processor of a DPC determines which CPU drains it. By default a DPC runs on the CPU that queued it, but it can be pinned elsewhere (§6) — a property attackers exploit to manipulate specific cores.

Hierarchy diagram showing two CPU KPRCB blocks each owning an independent KDPC_DATA queue structure, with individual KDPC objects enqueued within each per-processor queue to avoid cross-CPU locking — Each logical processor maintains its own KDPC_DATA queue inside its KPRCB, eliminating cross-CPU lock contention on the common interrupt-deferral path.

6. Controlling DPC Behaviour

API Function	Purpose
`KeSetImportanceDpc`	Sets `Importance`; `HighImportance` inserts at the queue head
`KeSetTargetProcessorDpc`	Pins the DPC to a specific logical processor (directed DPC)
`KeRemoveQueueDpc`	Dequeues a pending DPC; fails once the routine is already running

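DPC importance is set through the KDPC_IMPORTANCE enumeration: LowImportance, MediumImportance (the default), and HighImportance, with MediumHighImportance added in later Windows versions. Importance influences KeInsertQueueDpc: high-importance DPCs go to the head of the queue and are serviced first.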

A directed DPC is created by binding it to a CPU before queuing. The pattern below — iterating over KeNumberProcessors and targeting each core — is the same primitive a rootkit weaponizes for CPU lockdown, so treat it as an educational illustration only:

// C5 — directed DPC setup (educational pattern)
for (CCHAR cpu = 0; cpu < KeNumberProcessors; cpu++) {
    KeInitializeDpc(&pDpcArray[cpu], MyCustomDpc, NULL);
    KeSetTargetProcessorDpc(&pDpcArray[cpu], cpu);  // pin to logical CPU
    KeSetImportanceDpc(&pDpcArray[cpu], HighImportance);
}

Once a DPC begins executing it cannot be removed; KeRemoveQueueDpc only rescinds a still-pending entry.

7. Threaded DPCs

Since Windows Server 2003, a KDPC can represent either a normal DPC or a threaded DPC. In the threaded variant, the kernel — if it can arrange it — calls the routine back at PASSIVE_LEVEL from a highest-priority thread, allowing more flexible work. Support can be disabled, in which case the threaded DPC falls back to running at DISPATCH_LEVEL exactly like a normal DPC.

You initialize one with KeInitializeThreadedDpc and a CustomThreadedDpc routine. Because that routine can run at either PASSIVE_LEVEL or DISPATCH_LEVEL, it must synchronize correctly at both IRQLs:

// C7 — threaded DPC with dual-IRQL guard
KeInitializeThreadedDpc(&g_ThreadedDpc, MyThreadedDpc, NULL);

VOID MyThreadedDpc(_In_ PKDPC Dpc, _In_opt_ PVOID Ctx,
                   _In_opt_ PVOID A1, _In_opt_ PVOID A2) {
    ASSERT(KeGetCurrentIrql() <= DISPATCH_LEVEL);   // may be PASSIVE or DISPATCH
    // Use locks valid at both levels.
}

Threaded DPCs should be preferred over ordinary DPCs unless a particular DPC must never be preempted — not even by another DPC.

8. Timer DPCs and KTIMER

A DPC is also the callback mechanism for kernel timers. You associate a KDPC with a KTIMER and arm it; on expiry the kernel queues the DPC. KeSetTimerEx supports both one-shot and periodic timers.

// C6 — periodic timer DPC
KeInitializeTimerEx(&g_Timer, NotificationTimer);
KeInitializeDpc(&g_TimerDpc, MyCustomDpc, NULL);

LARGE_INTEGER due;
due.QuadPart = -10LL * 1000 * 1000;     // 1 second, relative
KeSetTimerEx(&g_Timer, due, 1000 /* ms period */, &g_TimerDpc);
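The two units in that call are easy to mix up: the due time is in 100-nanosecond ticks (negative meaning relative), while the period is in milliseconds. A small sketch of the arithmetic, using hypothetical helper names:

```python
TICKS_PER_SECOND = 10_000_000  # one tick = 100 ns

def relative_due_time(seconds):
    """Relative due time in 100-ns ticks; the negative sign marks it as relative."""
    return -int(round(seconds * TICKS_PER_SECOND))

def period_ms(seconds):
    """Timer period in milliseconds, the unit the Period argument expects."""
    return int(round(seconds * 1000))

print(relative_due_time(1), period_ms(1))
```

For a one-second timer this yields the same two values used in the snippet above.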

Windows uses special timer DPCs internally for timer expiration and context switching. The same primitive — a recurring timer pointed at a non-paged callback — is the cleanest way a driver schedules background work, and the cleanest way a malicious driver re-enters its payload (§10).

9. The DPC Watchdog and Debugging

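The kernel runs a DPC watchdog. Bug Check 0x00000133 (DPC_WATCHDOG_VIOLATION) fires when the watchdog detects either a single long-running DPC or a prolonged cumulative time spent at DISPATCH_LEVEL or above. Microsoft's driver guidance is that a DPC should run for no more than about 100 microseconds and an ISR for no more than about 25 microseconds; these are design budgets, not the watchdog thresholds, which are far longer and vary by build and configuration. A malicious DPC spin-loop can therefore inadvertently trip the watchdog and crash the host.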

Inspect live DPC state in the kernel debugger:

kd> !dpcs                 ; list pending DPCs per processor
kd> dt nt!_KDPC           ; KDPC layout for this build
kd> dt nt!_KDPC_DATA      ; per-processor queue structure
kd> !prcb                 ; processor control block (contains DpcData)
kd> !pcr                  ; processor control region

!dpcs reveals each queued DPC’s DeferredRoutine address — the single most useful artifact, since an unknown or non-image-backed routine address is a strong anomaly.
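The "non-image-backed" test is a range lookup against the loaded-module list (for example, from `lm` output). A minimal sketch, assuming the module list has already been parsed into `(name, base, size)` tuples; the function name and all addresses below are made up for illustration:

```python
def owning_module(address, modules):
    """Return the name of the module whose [base, base + size) contains address."""
    for name, base, size in modules:
        if base <= address < base + size:
            return name
    return None  # not backed by any loaded image: worth investigating

# Hypothetical module list and routine addresses.
modules = [
    ("nt", 0xFFFFF80010000000, 0x01000000),
    ("ndis", 0xFFFFF80020000000, 0x00200000),
]
for routine in (0xFFFFF80010234560, 0xFFFFA80003210000):
    print(hex(routine), owning_module(routine, modules) or "<no image>")
```

A `None` result is a lead rather than a verdict; confirm in the debugger before acting on it.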

10. Common Attacker Techniques

DPCs give kernel-mode malware a high-IRQL execution surface. Because code at DISPATCH_LEVEL cannot be preempted by the scheduler (only by higher-IRQL hardware interrupts), it is ideal cover for Direct Kernel Object Manipulation (DKOM).

Technique	Description
CPU lockdown / freeze-other-CPUs	Queue a directed `KDPC` to every non-current CPU via `KeSetTargetProcessorDpc` and spin, raising all secondary cores to `DISPATCH_LEVEL` to block interruption during a DKOM patch
Timer DPC payload	Arm a `KTIMER` whose `DeferredRoutine` points at attacker-controlled non-paged code, for recurring stealth execution
KDPC hijacking	Overwrite `DeferredRoutine` in a legitimate queued `KDPC` to redirect execution to a payload
Driver-based persistence	Load a malicious signed/BYOVD driver that registers a recurring timer DPC at load time

The CPU-lockdown pattern is especially relevant to defenders: by parking every other core at DISPATCH_LEVEL, the rootkit can unlink processes, patch EDR callbacks, or hide drivers while no scheduler or AV thread can run.

Graph diagram mapping three rootkit DPC abuse techniques — directed DPC CPU lockdown, timer DPC stealth re-entry, and DeferredRoutine pointer corruption — to their downstream impacts of DKOM manipulation and EDR callback patching — Kernel rootkits weaponize DPCs three ways: CPU lockdown via directed DPCs, persistent re-entry via timer DPCs, and code hijacking via DeferredRoutine pointer corruption.

11. Defensive Strategies & Detection

DPC objects live entirely in kernel memory and are not directly observable from user mode, so detection focuses on the driver that installs them and on kernel ETW timing telemetry.

Sysmon and Windows event telemetry:

Event ID	Source	Relevance
`6`	Sysmon — Driver Loaded	Fires on every driver load; primary signal for kernel modules that register DPC routines
`7`	Sysmon — Image Loaded	Logs modules loaded into user-mode processes (kernel drivers appear under Event ID 6); useful for spotting the user-mode loader that stages a driver
`7045`	Service Control Manager	New kernel-mode driver, especially from a non-standard path
`7040`	Service Control Manager	Service start-type change — driver persistence

ETW providers: The NT Kernel Logger session with EVENT_TRACE_FLAG_DPC and EVENT_TRACE_FLAG_INTERRUPT records per-DPC timing and the routine address, exposing abnormally long-running or unknown-address DPC routines. Microsoft-Windows-Kernel-Processor-Power is sometimes listed alongside it, but that provider mainly reports processor power-management activity, so confirm what it emits on your build before relying on it for IRQL or watchdog signals. Verify the exact flag constants against the current WDK evntrace.h.

Sigma anchor — unsigned/expired driver load:

title: Suspicious Kernel Driver Load (Unsigned or Expired)
logsource:
  product: windows
  service: sysmon
detection:
  selection_unsigned:
    EventID: 6
    Signed: 'false'
  selection_expired:
    EventID: 6
    SignatureStatus: 'Expired'
  selection_path:
    EventID: 6
    ImageLoaded|contains: '\Temp\'
  condition: selection_unsigned or selection_expired or selection_path
level: high

Hunt additionally for EventID 6 where ImageLoaded resolves outside \SystemRoot\System32\drivers\.
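That hunt can be sketched as a path-prefix check over exported Sysmon events. The function name is hypothetical, and the two accepted prefixes assume a default `C:\Windows` install; adjust them for your environment.

```python
# Assumed default locations; Sysmon may report either form.
EXPECTED_DRIVER_DIRS = (
    "c:\\windows\\system32\\drivers\\",
    "\\systemroot\\system32\\drivers\\",
)

def is_outside_driver_dir(image_loaded):
    """True when a driver path does not start with an expected drivers directory."""
    path = image_loaded.strip().lower()
    return not path.startswith(EXPECTED_DRIVER_DIRS)

samples = [
    r"C:\Windows\System32\drivers\tcpip.sys",
    r"C:\Users\bob\AppData\Local\Temp\helper.sys",
]
for path in samples:
    print(path, "->", "HUNT" if is_outside_driver_dir(path) else "ok")
```

Expect benign hits from locations such as the DriverStore, so treat the result as a triage filter rather than an alert on its own.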

Hardening:

Mitigation	Description
Driver Signature Enforcement (DSE)	Default on 64-bit Windows; blocks unsigned drivers that would install DPC routines
HVCI	Protects kernel code pages, raising the bar for DPC shellcode and `DeferredRoutine` overwrite
Kernel CET	Hardware shadow stack mitigates ROP-based DPC hijacking
DPC Watchdog	Built-in; Bug Check `0x133` catches long-running DPC loops, including malicious spin loops (a crash-on-detect safety net rather than a preventive control)
Vulnerable Driver Blocklist	`HKLM\SYSTEM\CurrentControlSet\Control\CI\Config\VulnerableDriverBlocklistEnable` blocks known BYOVD primitives
WDAC / Memory Integrity	Restrict which drivers may load, shrinking the DPC-abuse attack surface

12. Tools for DPC Analysis

Tool	Description	Link
WinDbg	`!dpcs`, `dt nt!_KDPC`, `!prcb`, `!pcr` live queue inspection	microsoft.com
Process Hacker	Driver/service enumeration and kernel module listing	processhacker.sourceforge.io
Windows Performance Recorder / xperf	Captures DPC/ISR ETW timing and routine addresses	microsoft.com
Sysmon	Driver-load (EID 6) and image-load (EID 7) telemetry	sysinternals.com
Volatility	Memory-forensic enumeration of drivers and kernel callbacks	volatilityfoundation.org
Ghidra	Static analysis of suspect drivers for `KeInsertQueueDpc` usage	ghidra-sre.org

13. MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Rootkit	`T1014`	ETW DPC routine-address anomalies; `!dpcs` unknown routines
Boot or Logon Autostart Execution: Kernel Modules and Extensions	`T1547.006`	Sysmon EID 6 / Event 7045 driver loads
Exploitation for Privilege Escalation	`T1068`	HVCI/CET violations; `KDPC.DeferredRoutine` corruption
Impair Defenses: Disable/Modify Tools	`T1562.001`	CPU-freeze DPC pattern halting EDR threads; watchdog `0x133`
Native API	`T1106`	Driver use of `KeInitializeDpc` / `KeInsertQueueDpc`

No dedicated ATT&CK sub-technique exists for DPC abuse as of ATT&CK v15; the techniques above are the parents. Verify current IDs at attack.mitre.org before publishing.

Summary

DPCs are the kernel’s mechanism for deferring interrupt work from high-IRQL ISRs down to DISPATCH_LEVEL, keeping ISRs under their 25 µs budget.
The opaque KDPC structure carries the DeferredRoutine, context, arguments, and a DpcData pointer that marks whether it is queued on a per-processor KDPC_DATA list in the KPRCB.
The lifecycle runs allocate → KeInitializeDpc → KeInsertQueueDpc/IoRequestDpc → per-CPU drain at DISPATCH_LEVEL, with a single-instantiation guarantee per object.
Rootkits abuse directed DPCs for CPU lockdown, timer DPCs for stealth re-entry, and DeferredRoutine corruption for hijacking — mapping to T1014, T1547.006, and T1562.001.
Detect via Sysmon Event ID 6 driver loads, NT Kernel Logger DPC timing telemetry, and the DPC watchdog (0x133); harden with DSE, HVCI, Kernel CET, and the vulnerable driver blocklist.


IRQL Levels: Interrupt Request Priorities Explained

Objective: Understand the Windows kernel’s Interrupt Request Level (IRQL) priority system — what each level means numerically and symbolically, how the HAL arbitrates hardware and software interrupts, which APIs query and change the IRQL, what kernel operations are legal at each level, and how malicious kernel code abuses IRQL semantics to evade defenders.

1. What Is an IRQL?

An Interrupt Request Level (IRQL) is a per-processor priority value that determines which kernel-mode support routines the currently executing code may legally call. It is a small integer (0–31 on x86, 0–15 on x64) stored as type KIRQL (a typedef for UCHAR). The three lowest levels (PASSIVE_LEVEL, APC_LEVEL, and DISPATCH_LEVEL) are almost always referred to symbolically; device levels are usually named by value.

IRQL is per-processor, not per-thread. On x86 it lives in the Irql field of the _KPCR (Kernel Processor Control Region); on x64 it is mapped to the CR8 register (Task Priority Register). When the processor raises its IRQL, all interrupts at or below that level are masked. Higher-numbered interrupts preempt all lower-IRQL processing; once handled, the processor returns to the previous level. Raising and lowering must follow strict stack discipline — you only lower back to a level you previously raised from.

2. The IRQL Hierarchy

The Hardware Abstraction Layer (HAL) maps physical interrupt vectors to software IRQLs. The count of levels is architecture-dependent: x64 and Itanium expose 16 IRQLs; x86 exposes 32, owing to differences in interrupt-controller hardware. The canonical wdm.h symbolic definitions differ across architectures.

Symbolic Name	x64 Value	x86 Value	Description
`PASSIVE_LEVEL` / `LOW_LEVEL`	0	0	Normal thread execution; nothing masked
`APC_LEVEL`	1	1	APC delivery and page-fault handling
`DISPATCH_LEVEL`	2	2	Thread scheduler / DPC queue
Device IRQLs (DIRQL)	3–11	3–26	Hardware device interrupts
`CMCI_LEVEL`	5	5	Corrected Machine Check Interrupt (falls inside the device range)
`PROFILE_LEVEL`	15	27	Profiling timer
`CLOCK_LEVEL`	13	28	System clock timer
`IPI_LEVEL`	14	29	Inter-Processor Interrupt (`DRS_LEVEL` is also 14 on x64)
`POWER_LEVEL`	14	30	Power failure
`HIGH_LEVEL`	15	31	Highest level; all maskable interrupts blocked

Higher value = higher priority. A device interrupt at DIRQL 8 preempts a DPC at DISPATCH_LEVEL (2), which itself preempts ordinary thread code at PASSIVE_LEVEL (0).
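The masking rule reduces to a single comparison. A minimal sketch in Python (the function name is hypothetical; the constants are the three software IRQL values from the table):

```python
PASSIVE_LEVEL, APC_LEVEL, DISPATCH_LEVEL = 0, 1, 2

def is_masked(current_irql, interrupt_irql):
    """An interrupt stays pending while its IRQL is at or below the current IRQL."""
    return interrupt_irql <= current_irql

# A device interrupt at DIRQL 8 arriving while a DPC runs at DISPATCH_LEVEL:
print(is_masked(DISPATCH_LEVEL, 8))   # False, so the ISR preempts the DPC
# A DPC request arriving while an ISR runs at DIRQL 8:
print(is_masked(8, DISPATCH_LEVEL))   # True, so the DPC waits until IRQL drops
```

Note that equal levels are masked as well, which is why a DPC cannot interrupt another DPC on the same processor.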

Hierarchical diagram showing Windows IRQL levels from HIGH_LEVEL at the top down to PASSIVE_LEVEL at the bottom, colour-coded by hardware versus software IRQLs — Windows x64 IRQL hierarchy: higher-numbered levels preempt all lower ones, with software IRQLs at the base and hardware interrupt levels at the top.

3. Software IRQLs: PASSIVE, APC, and DISPATCH

The lowest three levels are software IRQLs — the kernel raises and lowers them without involving the interrupt controller.

PASSIVE_LEVEL (0) masks nothing. This is where normal kernel-mode thread code runs: DriverEntry, AddDevice, Unload, most dispatch routines, and driver-created worker threads. All blocking, paging, and synchronization primitives are available.

APC_LEVEL (1) masks Asynchronous Procedure Call interrupts only. The sole functional difference from PASSIVE_LEVEL is that APCs cannot interrupt the running code. Both levels imply a valid thread context and both permit access to pageable memory. Page-fault handling itself runs at APC_LEVEL.

DISPATCH_LEVEL (2) masks DISPATCH_LEVEL and APC_LEVEL. Critically, the thread scheduler is disabled — code here owns the processor until it lowers IRQL. Routines such as StartIo, DpcForIsr, IoTimer, Cancel (holding the cancel spin lock), and all DPC callbacks run here. Two hard rules apply: no access to paged memory, and no blocking waits.

Feature	PASSIVE_LEVEL	APC_LEVEL	DISPATCH_LEVEL
Thread context	Yes	Yes	Not guaranteed
Scheduler active	Yes	Yes	No
Paged pool access	Yes	Yes	No
Blocking waits allowed	Yes	Yes	No

4. Hardware IRQLs: DIRQL and Above

Levels at or above the device range are hardware IRQLs driven by the interrupt controller. A driver’s Device IRQL (DIRQL) is the SynchronizeIrql stored in its _KINTERRUPT object. When a device fires, the processor raises to that DIRQL and invokes the Interrupt Service Routine (ISR), a KSERVICE_ROUTINE.

At DIRQL, all interrupts at or below the driver’s level are masked, but higher-DIRQL devices, the clock, and power-failure interrupts may still preempt. Because the scheduler and lower-priority interrupts are blocked, ISRs must be minimal — they acknowledge the hardware, capture volatile state, and queue a DPC for the heavy lifting at DISPATCH_LEVEL.

Above DIRQL sit CLOCK_LEVEL, IPI_LEVEL (used by one processor to interrupt another), POWER_LEVEL, and HIGH_LEVEL. The general principle: the higher the IRQL, the shorter the code must run. Sustained work at high IRQL starves the entire processor.

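// KSERVICE_ROUTINE - runs at DIRQL; must be minimal
BOOLEAN MyInterruptServiceRoutine(
    PKINTERRUPT Interrupt, PVOID ServiceContext) {
    // Assumes the driver passed its device object as ServiceContext
    // when it connected the interrupt.
    PDEVICE_OBJECT deviceObject = (PDEVICE_OBJECT)ServiceContext;
    UNREFERENCED_PARAMETER(Interrupt);
    // Acknowledge hardware here (return FALSE if the interrupt is not ours),
    // then defer heavy work to a DPC. Do NOT touch paged memory here.
    IoRequestDpc(deviceObject, deviceObject->CurrentIrp, NULL);
    return TRUE;
}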

5. Kernel APIs for IRQL Management

Drivers query and adjust IRQL through a small, exported API surface in wdm.h.

API Function	Purpose
`KeGetCurrentIrql()`	Returns the current processor IRQL; callable at any IRQL
`KeRaiseIrql(NewIrql, &OldIrql)`	Raises to `NewIrql`; saves prior level. `NewIrql` must be ≥ current
`KeLowerIrql(OldIrql)`	Restores a previously saved IRQL — only after a matching raise
`KeRaiseIrqlToDpcLevel()`	Raises to `DISPATCH_LEVEL`, returns old IRQL
`KeAcquireSpinLock(&Lock, &OldIrql)`	Acquires spin lock, raising to `DISPATCH_LEVEL`
`KeReleaseSpinLock(&Lock, OldIrql)`	Releases lock, restoring saved IRQL
`KeAcquireSpinLockAtDpcLevel(&Lock)`	Acquires lock without raising (caller already at `DISPATCH_LEVEL`)

The exact signatures:

KIRQL KeGetCurrentIrql(void);

void KeRaiseIrql(
  _In_  KIRQL  NewIrql,
  _Out_ PKIRQL OldIrql
);

void KeLowerIrql(_In_ KIRQL NewIrql);   // restore saved old IRQL

KIRQL KeRaiseIrqlToDpcLevel(void);

The raise/lower discipline is enforced: calling KeRaiseIrql with a value lower than the current IRQL is a fatal error, and KeLowerIrql may only restore the level a prior KeRaiseIrql saved.

// Demonstrates the raise/lower stack discipline
VOID MyFunctionNeedingDispatchLevel(VOID) {
    KIRQL oldIrql;
    KeRaiseIrql(DISPATCH_LEVEL, &oldIrql);
    // --- Critical section: no paged pool access here ---
    KeLowerIrql(oldIrql);
}
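The same discipline can be modeled in user mode to make the two failure cases concrete. This is a sketch only: the `Processor` class and `IrqlError` are hypothetical stand-ins, and a real kernel bug-checks rather than raising an exception.

```python
class IrqlError(RuntimeError):
    """Stands in for the bug check a real kernel would raise."""

class Processor:
    def __init__(self):
        self.irql = 0  # PASSIVE_LEVEL

    def raise_irql(self, new_irql):
        # Raising to a level below the current one is illegal.
        if new_irql < self.irql:
            raise IrqlError(f"cannot raise from {self.irql} to {new_irql}")
        old_irql = self.irql
        self.irql = new_irql
        return old_irql  # caller must keep this for the matching lower

    def lower_irql(self, saved_irql):
        # Lowering may only move down to a previously saved level.
        if saved_irql > self.irql:
            raise IrqlError(f"cannot lower from {self.irql} to {saved_irql}")
        self.irql = saved_irql

cpu = Processor()
old = cpu.raise_irql(2)   # to DISPATCH_LEVEL
cpu.lower_irql(old)       # back to PASSIVE_LEVEL
print(cpu.irql)
```

Returning the old level from `raise_irql` mirrors the `OldIrql` out-parameter: the caller, not the processor, carries the value needed to unwind.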

Spin locks couple mutual exclusion with IRQL: acquiring one raises to DISPATCH_LEVEL so the holder cannot be preempted by the scheduler on its processor.

KSPIN_LOCK MySpinLock;
KIRQL oldIrql;

KeInitializeSpinLock(&MySpinLock);
// KeAcquireSpinLock raises to DISPATCH_LEVEL internally
KeAcquireSpinLock(&MySpinLock, &oldIrql);
// ... protected shared-data access (non-paged only) ...
KeReleaseSpinLock(&MySpinLock, oldIrql); // restores oldIrql

A driver inspecting its own context queries the level directly:

// Demonstrates KeGetCurrentIrql() usage and KIRQL type
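NTSTATUS DriverDispatchCreate(PDEVICE_OBJECT DeviceObject, PIRP Irp) {
    UNREFERENCED_PARAMETER(DeviceObject);
    KIRQL currentIrql = KeGetCurrentIrql();
    // Expected: PASSIVE_LEVEL (0) in a dispatch routine
    DbgPrint("[MyDriver] Current IRQL: %u\n", (ULONG)currentIrql);
    // Complete the IRP so the I/O manager can return to the caller
    Irp->IoStatus.Status = STATUS_SUCCESS;
    Irp->IoStatus.Information = 0;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    return STATUS_SUCCESS;
}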

6. Memory Access Rules at Each IRQL

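The single most consequential IRQL rule concerns paged memory. Any routine running above APC_LEVEL that touches a pageable address whose page is not resident causes a fatal page fault. Resolving the fault means reading the page back from disk, which requires the faulting thread to wait, and waiting needs a context switch that is impossible once the scheduler is disabled at DISPATCH_LEVEL. Because a resident page does not fault, the bug is often intermittent and may surface only under memory pressure or Driver Verifier.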

Memory Pool	PASSIVE_LEVEL	APC_LEVEL	DISPATCH_LEVEL+
Paged pool	Accessible	Accessible	Fatal page fault
Non-paged pool	Accessible	Accessible	Accessible

Code at or above DISPATCH_LEVEL must therefore allocate from non-paged pool and operate only on locked or non-pageable memory (for example, buffers locked with MmProbeAndLockPages). Violating this rule produces the most common driver bug check: IRQL_NOT_LESS_OR_EQUAL (0x0000000A), or its driver-attributed variant DRIVER_IRQL_NOT_LESS_OR_EQUAL (0x000000D1).

7. DPCs: The DISPATCH_LEVEL Workhorses

A Deferred Procedure Call (DPC) moves work out of the time-critical ISR into DISPATCH_LEVEL. The ISR queues a _KDPC object (via IoRequestDpc or KeInsertQueueDpc); the kernel drains the DPC queue as IRQL drops below DISPATCH_LEVEL. DpcForIsr handles per-IRP completion; CustomDpc and CustomTimerDpc serve driver-specific needs.

// KDEFERRED_ROUTINE - runs at DISPATCH_LEVEL
VOID MyDpcRoutine(
    PKDPC Dpc, PVOID DeferredContext,
    PVOID SystemArgument1, PVOID SystemArgument2) {
    // Safe: non-paged pool only.
    // Do NOT call KeWaitForSingleObject with a nonzero timeout.
    DbgPrint("[MyDpc] Running at DISPATCH_LEVEL\n");
}

A DPC that runs too long throttles the whole system and triggers DPC_WATCHDOG_VIOLATION (0x00000133) once sustained execution exceeds the watchdog threshold.

Flow diagram illustrating the handoff from a hardware interrupt through the ISR at DIRQL to a queued DPC callback executing at DISPATCH_LEVEL 2 — ISRs acknowledge hardware and queue a DPC object; the kernel drains DPC queues at DISPATCH_LEVEL so heavy processing never blocks critical interrupt handling.

8. APCs: The APC_LEVEL Mechanism

An Asynchronous Procedure Call (APC) executes a function in the context of a specific thread. Special kernel APCs run at APC_LEVEL, normal kernel APCs run at PASSIVE_LEVEL, and user APCs are delivered in user mode when the thread performs an alertable wait. Kernel code initializes them with KeInitializeApc and queues them with KeInsertQueueApc, both of which are exported but not officially documented. Because APC_LEVEL still implies a valid thread context and permits paged access, certain dispatch routines raise to APC_LEVEL to serialize against APC delivery while remaining able to page in data.

9. Debugging IRQL With WinDbg

WinDbg exposes IRQL state on both live kernels and crash dumps.

; Check current IRQL on each processor
!irql

; Examine the KPCR for processor 0
!pcr 0

; List pending DPCs
!dpcs

; Analyze a 0x0000000A bugcheck
!analyze -v

On x64 the IRQL is the CR8 register; you can read it and the _KPCR directly:

; dt = display type; shows _KPCR struct at GS base
dt nt!_KPCR @$pcr
; On x64, IRQL maps to CR8 (Task Priority Register)
r cr8

The IRQL contract is also expressed statically through SAL annotations in wdm.h, which static-analysis tooling verifies at build time:

// Illustrates IRQL annotation macros from wdm.h
_IRQL_requires_max_(DISPATCH_LEVEL)
VOID MyRoutineSafeAtOrBelowDispatch(VOID);

_IRQL_requires_(PASSIVE_LEVEL)
VOID MyRoutineRequiresPassive(VOID);

_IRQL_raises_(DISPATCH_LEVEL)
_IRQL_saves_
KIRQL MyRaiseRoutine(VOID);

10. IRQL in a Security Context

IRQL semantics become a security concern the moment attacker code reaches ring 0. Code running at DISPATCH_LEVEL owns its processor and is invisible to user-mode EDR hooks — an ideal vantage point for unhooking the SSDT, overwriting kernel callbacks, or hiding objects before defensive software can react. Because paged access above APC_LEVEL is fatal, IRQL violations also serve as a crude denial-of-service primitive: a single bad page touch produces an IRQL_NOT_LESS_OR_EQUAL blue screen.

The dominant delivery vector is Bring Your Own Vulnerable Driver (BYOVD) — loading a legitimately signed but exploitable driver to obtain kernel-IRQL execution without writing a new signed driver. Missing or incorrect IRQL SAL annotations frequently mask the very bugs these attacks exploit.

Flow diagram showing a BYOVD attack path from loading a vulnerable signed driver through raising IRQL to DISPATCH_LEVEL to bypass EDR hooks or trigger a denial-of-service blue screen — Attackers exploit IRQL semantics via BYOVD: owning the processor at DISPATCH_LEVEL lets them silently unhook defenses or weaponize paged-memory violations as a kernel-mode DoS.

11. Common Attacker Techniques

Technique	Description
BYOVD kernel execution	Load a signed-but-vulnerable driver (e.g. `RTCore64.sys`, `dbutil_2_3.sys`) to run code at kernel IRQL
EDR unhooking at `DISPATCH_LEVEL`	Patch SSDT entries or kernel callbacks while the scheduler is disabled, beating re-hook races
Rootkit concealment	Hide processes, files, and connections from DIRQL/`DISPATCH_LEVEL`, below user-mode visibility
Spin-lock starvation	Hold a spin lock at `DISPATCH_LEVEL` to monopolize a processor — driver-stack DoS
Deliberate IRQL fault	Force paged access above `APC_LEVEL` to bug-check the host (`0x0000000A` DoS)
DSE downgrade	Flip test-signing or pre-release flags to load unsigned kernel code

12. Defensive Strategies & Detection

Driver loads are the chokepoint. Sysmon Event ID 6 (Driver Loaded) records ImageLoaded, Hashes, Signed, Signature, and SignatureStatus — the fields that expose unsigned or anomalously signed drivers and known-vulnerable BYOVD payloads. Event ID 7045 (and System log 7036/7040) surface drivers registered as services. PatchGuard violations of _KPCR/IDT/SSDT raise bug check 0x00000109 (CRITICAL_STRUCTURE_CORRUPTION); HVCI/Code-Integrity blocks land in Microsoft-Windows-CodeIntegrity/Operational (Event IDs 3001–3089) and Security Event ID 5038.

A starting Sigma rule for vulnerable-driver loads:

title: Suspicious Vulnerable Driver Load (Possible BYOVD)
logsource:
  product: windows
  service: sysmon
detection:
  selection_unsigned:
    EventID: 6
    Signed: 'false'
  selection_known_vuln:
    EventID: 6
    ImageLoaded|endswith:
      - '\RTCore64.sys'
      - '\dbutil_2_3.sys'
  condition: selection_unsigned or selection_known_vuln
level: high
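For triage outside a SIEM, the rule's logic can be replayed over exported events. A minimal sketch, assuming each event is a dict keyed by the Sysmon field names used above; the function name is hypothetical, and the comparison is case-insensitive to match Sigma's default string matching.

```python
KNOWN_VULNERABLE = ("\\rtcore64.sys", "\\dbutil_2_3.sys")

def matches_byovd_rule(event):
    """Replay the rule: EventID 6 and (unsigned or known-vulnerable file name)."""
    if event.get("EventID") != 6:
        return False
    unsigned = str(event.get("Signed", "")).lower() == "false"
    image = str(event.get("ImageLoaded", "")).lower()
    known_vuln = image.endswith(KNOWN_VULNERABLE)
    return unsigned or known_vuln

events = [
    {"EventID": 6, "Signed": "true", "ImageLoaded": r"C:\Temp\RTCore64.sys"},
    {"EventID": 6, "Signed": "true", "ImageLoaded": r"C:\Windows\System32\drivers\tcpip.sys"},
]
print([matches_byovd_rule(e) for e in events])
```

File-name matching is trivially evaded by renaming, so hash-based matching against a vulnerable-driver list is the stronger follow-up.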

ISR/DPC behavior can be traced through the NT Kernel Logger ETW provider with interrupt and DPC flags enabled:

xperf -on Base+Interrupt+DPC
xperf -d trace.etl

Hardening layers: enforce Driver Signature Enforcement and HVCI (M1048) so unsigned or tampered drivers cannot load even on a compromised kernel; enable the Microsoft Vulnerable Driver Blocklist (HKLM\SYSTEM\CurrentControlSet\Control\CI\Config\VulnerableDriverBlocklistEnable); restrict SeLoadDriverPrivilege to administrators (M1026); and run suspect drivers under Driver Verifier in a VM to force IRQL checks. Monitor bcdedit test-signing changes and the CI\Config registry path for downgrade attempts.

MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Rootkit	`T1014`	Sysmon EID 6 unsigned/anomalous drivers; HVCI logs
Create or Modify System Process: Windows Service	`T1543.003`	EID 7045 / System 7036 driver-service install
Impair Defenses: Disable or Modify Tools	`T1562.001`	EDR callback integrity, PatchGuard `0x109`
Impair Defenses: Downgrade	`T1562.010`	`CI\Config` registry + `bcdedit` test-signing audit
Exploitation for Priv-Esc	`T1068`	BYOVD load (EID 6) preceding kernel-write activity
Escape to Host	`T1611`	Kernel-IRQL execution from container context

13. Tools for IRQL Analysis

Tool	Description	Link
WinDbg	`!irql`, `!pcr`, `!dpcs`, `!analyze -v` on bug checks	microsoft.com
Driver Verifier	Forces IRQL/pool/deadlock checks on a target driver	microsoft.com
Sysmon	Driver-load (EID 6) telemetry; pair with Service Control Manager 7045 from the System log	microsoft.com
xperf / WPA	ETW interrupt and DPC tracing	microsoft.com
Process Hacker	Live driver and kernel-module enumeration	processhacker.sourceforge.io
Volatility	Memory-forensic driver and callback inspection	volatilityfoundation.org
Ghidra	Static analysis of suspect driver binaries	ghidra-sre.org

Summary

IRQL is a per-processor priority register that gates which kernel routines code may legally call and which interrupts are masked.
The HAL maps hardware vectors onto 16 IRQLs on x64 and 32 on x86; higher value preempts lower, and raising/lowering must follow strict stack discipline.
Above APC_LEVEL the scheduler is disabled and paged memory is off-limits — touching it triggers IRQL_NOT_LESS_OR_EQUAL (0x0000000A).
Attackers reach kernel IRQL through BYOVD to unhook EDR, conceal rootkits, or bug-check the host as a DoS — mapped to T1014, T1543.003, T1562.001, and T1068.
Detect via Sysmon Event ID 6, the vulnerable-driver blocklist, HVCI/DSE enforcement, and SeLoadDriverPrivilege restriction.


System Calls and SSDT: How User Mode Reaches the Kernel

Objective: Understand how Windows user-mode code transitions to ring 0 via the SYSCALL instruction, how the System Service Descriptor Table (SSDT) dispatches those calls, and why SSDT hooking, direct syscalls, and modern kernel hardening (PatchGuard, HVCI, MWTI ETW) are central to both offensive tradecraft and defensive telemetry.

1. Why System Calls Exist

User-mode code runs at CPL 3 (ring 3). The kernel runs at CPL 0 (ring 0). Privileged operations — opening another process, mapping physical pages, accessing the file system, talking to drivers — require ring 0. The CPU enforces this with segment descriptors and page-table permissions; a direct CALL into kernel memory from user mode faults immediately.

The bridge is a controlled transition: the user-mode side specifies what it wants by number, the CPU switches to ring 0 at a fixed, kernel-controlled entry point, and the kernel validates and dispatches. That number is the System Service Number (SSN), and the dispatch table is the SSDT.

This design has two consequences that drive everything in this post:

The kernel entry point is fixed and well-known, so an attacker who can write to ring 0 memory (a kernel rootkit) can redirect every syscall by patching one table.
The user-mode side of the syscall (the stub in ntdll.dll) is not privileged, so an EDR can hook it — and a red teamer can bypass that hook by issuing the SYSCALL instruction themselves.

2. The Mechanics of `SYSCALL` on x64

SYSCALL is a dedicated x86-64 instruction designed for fast ring-3 → ring-0 transitions. It does not use the legacy interrupt gate (int 2Eh); it reads MSRs and jumps.

MSR	Address	Role
`IA32_LSTAR`	`0xC0000082`	Kernel `RIP` to jump to on `SYSCALL` from 64-bit user mode. Holds `KiSystemCall64` (or `KiSystemCall64Shadow` with KPTI).
`IA32_STAR`	`0xC0000081`	Encodes the kernel and user `CS`/`SS` selectors for `SYSCALL`/`SYSRET`.
`IA32_FMASK`	`0xC0000084`	`RFLAGS` mask — bits cleared on entry (notably `IF`, masking interrupts during the prologue).

The x64 Windows syscall ABI:

EAX holds the SSN (the index into KiServiceTable).
R10 holds the first argument. The user-mode stub copies RCX into R10 because SYSCALL itself clobbers RCX with the return RIP.
RDX, R8, R9, then stack — match the standard x64 calling convention for the remaining arguments.

A minimal user-mode stub in the shape ntdll uses (recent builds also insert a check between the mov and the syscall that can fall back to int 2Eh):

; NtFooBar — illustrative ntdll-style syscall stub (x64)
NtFooBar:
    mov   r10, rcx          ; SYSCALL clobbers RCX; preserve arg0 in R10
    mov   eax, 0x????       ; SSN — VERSION-SPECIFIC, resolve at runtime
    syscall                 ; ring-3 -> ring-0 via LSTAR
    ret                     ; SYSRET returns here

The 32-bit counterpart is SYSENTER (entry point stored in IA32_SYSENTER_EIP), which 32-bit Windows used from XP onward. On 64-bit Windows it is not part of the normal path: even Wow64 processes switch to 64-bit mode and issue SYSCALL (§7).

Flow diagram showing the sequence from user-mode code through the ntdll SYSCALL stub, CPU MSR-driven transition, KiSystemCall64 kernel entry point, SSDT dispatch, and final Nt* function execution — A single SYSCALL instruction bridges ring 3 and ring 0, with EAX carrying the SSN that indexes KiServiceTable for dispatch.

3. `KiSystemCall64`: The Kernel Entry Point

When the CPU executes SYSCALL from user mode:

It loads RIP from IA32_LSTAR (→ KiSystemCall64).
It loads CS/SS from IA32_STAR (kernel selectors).
It saves the old user RIP in RCX and old RFLAGS in R11.
It clears RFLAGS bits per IA32_FMASK.
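The register effects in that list can be modeled in a few lines. This is a sketch under stated assumptions: the function name, the register dict, and every value are hypothetical, `rip` stands for the address of the instruction after SYSCALL, the CS/SS loads are omitted, and the FMASK value is chosen only to show IF being cleared.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

def syscall_entry(regs, msrs):
    """Model the register effects of SYSCALL (CS/SS loads omitted)."""
    out = dict(regs)
    out["rcx"] = regs["rip"]                      # return RIP saved in RCX
    out["r11"] = regs["rflags"]                   # old RFLAGS saved in R11
    out["rflags"] = regs["rflags"] & ~msrs["IA32_FMASK"] & MASK64
    out["rip"] = msrs["IA32_LSTAR"]               # enter at the kernel entry point
    return out

# Hypothetical values for illustration only.
user = {"rip": 0x00007FFB12345678, "rflags": 0x246, "rcx": 0, "r11": 0}
msrs = {"IA32_LSTAR": 0xFFFFF80012340000, "IA32_FMASK": 0x200}  # 0x200 = IF
kernel = syscall_entry(user, msrs)
print({name: hex(value) for name, value in kernel.items()})
```

The model makes the reason for the `mov r10, rcx` in the stub visible: whatever was in RCX before the instruction is gone afterwards.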

KiSystemCall64 then:

Swaps GS via SWAPGS to access the per-CPU KPCR.
Switches from the user stack to the kernel stack stored in the KPCR.
Builds a KTRAP_FRAME capturing the user context.
Indexes KeServiceDescriptorTable (or the Shadow variant for Win32k GUI calls) using EAX.
Calls the resolved Nt* function.
On return, restores the frame and executes SYSRET to drop back to ring 3.

Selected KTRAP_FRAME fields (see WDK wdm.h for the full layout):

Field	Description
`Rip`	Saved user-mode instruction pointer (from `RCX` at entry).
`Rsp`	Saved user-mode stack pointer.
`EFlags`	Saved `RFLAGS` (from `R11`).
`ErrorCode`	Processor error code; `0` for syscalls.

With Kernel Page-Table Isolation (KPTI) active, IA32_LSTAR points instead at KiSystemCall64Shadow, a thin trampoline that swaps from the user CR3 (which maps only a minimal kernel trampoline) to the full kernel CR3 before falling through into the normal dispatcher. This is the Meltdown mitigation.

4. The SSDT and `KSERVICE_TABLE_DESCRIPTOR`

The “SSDT” in casual use refers to two related objects:

Symbol	Description
`KeServiceDescriptorTable`	Exported `KSERVICE_TABLE_DESCRIPTOR`. Covers the core `Nt*` services in `ntoskrnl.exe`.
`KeServiceDescriptorTableShadow`	Not exported. Adds a second entry for `win32k!W32pServiceTable` — the GUI/USER/GDI syscall surface. Rootkits historically located it by pattern scanning around `KeAddSystemServiceTable` or via debugger symbols.
`KiServiceTable`	The actual function-pointer table referenced by the descriptor.
`KiArgumentTable`	Parallel array of argument byte counts per service.

Approximate layout from public symbols:

typedef struct _KSERVICE_TABLE_DESCRIPTOR {
    PULONG_PTR ServiceTable;   // -> KiServiceTable (encoded offsets on x64)
    PULONG     CounterTable;   // call counters (typically NULL in retail)
    ULONG      TableSize;      // number of services
    PUCHAR     ArgumentTable;  // bytes of stack args per service
} KSERVICE_TABLE_DESCRIPTOR, *PKSERVICE_TABLE_DESCRIPTOR;

The SSN (EAX) is split: the low 12 bits index the table, and bit 12 selects which descriptor — 0 for KeServiceDescriptorTable, 1 for the Win32k shadow table. This is how GUI syscalls (NtUserCreateWindowEx, NtGdiBitBlt, …) coexist with kernel-proper syscalls in the same SSN space.
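The split is two bit operations. A minimal sketch (the function name is hypothetical and the sample values are arbitrary, not real SSNs for any build):

```python
def split_ssn(eax):
    """Return (table, index): bit 12 selects the descriptor, low 12 bits index it."""
    index = eax & 0xFFF
    table = (eax >> 12) & 0x1   # 0 = KeServiceDescriptorTable, 1 = Win32k shadow
    return table, index

for value in (0x0026, 0x1005):
    table, index = split_ssn(value)
    print(f"EAX=0x{value:04x} -> table {table}, index 0x{index:03x}")
```

A detector can use the same split to flag a process that suddenly issues table-1 (GUI) syscalls despite never loading user32 or gdi32.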

Hierarchy diagram showing KeServiceDescriptorTable splitting into the core NT KiServiceTable and the Win32k shadow table, with EAX bit 12 selecting the descriptor and low 12 bits indexing into it — EAX bit 12 routes GUI syscalls to the Win32k shadow table while bits 11–0 index the specific service within the selected descriptor.

5. The x64 Encoded-Offset Format

A critical detail anyone writing an SSDT scanner gets wrong the first time: on x64 Windows, KiServiceTable entries are not function pointers. Each entry is a 32-bit value encoding a signed offset from the base of KiServiceTable itself, with the low 4 bits used to communicate the argument-count category to the dispatcher.

The decode is:

// Recover the real Nt* function address from KiServiceTable[i]
ULONG_PTR DecodeSsdtEntry(PULONG ServiceTable, ULONG index)
{
    LONG  encoded = (LONG)ServiceTable[index];     // signed 32-bit
    LONG  offset  = encoded >> 4;                  // arithmetic shift
    return (ULONG_PTR)ServiceTable + offset;       // base + offset
}

The arithmetic right shift matters — it preserves the sign, allowing functions located before KiServiceTable in memory to be addressed. A naive unsigned >> 4 will silently miss those entries and produce a corrupt scanner.
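The difference is easy to demonstrate offline, for instance when decoding table entries pulled from a memory dump. A sketch in Python, with hypothetical function names and a made-up table base:

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF

def decode_entry(table_base, raw_entry):
    """Decode one raw 32-bit entry into (function address, low 4 bits)."""
    # Reinterpret the raw DWORD as a signed 32-bit integer.
    signed = struct.unpack("<i", struct.pack("<I", raw_entry & 0xFFFFFFFF))[0]
    offset = signed >> 4          # Python's >> on a negative int is arithmetic
    return (table_base + offset) & MASK64, raw_entry & 0xF

def decode_entry_naive(table_base, raw_entry):
    """The buggy variant: an unsigned shift loses the sign of the offset."""
    return (table_base + ((raw_entry & 0xFFFFFFFF) >> 4)) & MASK64

base = 0xFFFFF80000200000          # hypothetical KiServiceTable address
raw = 0xFFFF0002                   # encodes offset -0x1000 with low bits = 2
print(hex(decode_entry(base, raw)[0]))       # lands 0x1000 below the table
print(hex(decode_entry_naive(base, raw)))    # lands far above it
```

For entries with a positive offset both variants agree, which is exactly why the bug tends to survive casual testing.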

6. Tracing a Syscall End-to-End: `NtOpenProcess`

Following an OpenProcess call from a user-mode debugger target:

kernel32!OpenProcess
   └─> kernelbase!OpenProcess
        └─> ntdll!NtOpenProcess         ; the syscall stub
              mov  r10, rcx
              mov  eax, <SSN>           ; version-specific
              syscall
              ret
            ─────────── ring 3 / ring 0 boundary ───────────
            CPU: RIP <- LSTAR (KiSystemCall64[Shadow])
        nt!KiSystemCall64
          ├─ SWAPGS, switch to kernel stack
          ├─ build KTRAP_FRAME
          ├─ idx = EAX & 0xFFF
          ├─ desc = (EAX & 0x1000) ? Shadow : KeServiceDescriptorTable
          ├─ fn  = desc->ServiceTable + (desc->ServiceTable[idx] >> 4)
          └─ call nt!NtOpenProcess
                nt!NtOpenProcess
                  ├─ PsLookupProcessByProcessId (ClientId -> EPROCESS)
                  ├─ SeAccessCheck (DesiredAccess vs token)
                  └─ ObOpenObjectByPointer -> HANDLE
            SYSRET back to user-mode RIP saved in RCX

The SSN for NtOpenProcess changes between Windows builds; never hardcode it. Tooling either resolves it from the on-disk ntdll.dll, parses the in-memory stub, or consults a versioned table such as j00ru’s syscall reference.

A practical SSN extractor parses the Nt* export’s first instructions and reads the MOV EAX, imm32 (B8 xx xx xx xx) byte pattern:

# Parse SSNs from a clean on-disk ntdll.dll (illustrative)
import pefile, struct

pe = pefile.PE(r"C:\Windows\System32\ntdll.dll", fast_load=False)
pe.parse_data_directories()
image = pe.get_memory_mapped_image()

for exp in pe.DIRECTORY_ENTRY_EXPORT.symbols:
    name = exp.name.decode() if exp.name else ""
    if not name.startswith("Nt"):
        continue
    stub = image[exp.address: exp.address + 24]
    # Classic stub: 4C 8B D1  B8 ss ss 00 00  F6 04 25 ...  0F 05  C3
    if stub[0:3] == b"\x4c\x8b\xd1" and stub[3] == 0xB8:
        ssn = struct.unpack("<I", stub[4:8])[0]
        print(f"{name:40s} SSN=0x{ssn:04x}")

Red-team loaders use the same idea at runtime — sometimes against a fresh copy of ntdll read from disk to defeat in-memory EDR hooks (the “Perun’s Fart” / fresh-copy pattern).
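
The same byte-level view works in the other direction for defenders: comparing in-memory stub bytes against the on-disk copy shows which stubs have been altered. The sketch below works on synthetic byte strings; the helper names, the SSNs, and the hook bytes are assumptions made for illustration, and reading the real bytes (from disk and from process memory) is left out.

```python
from typing import Dict, List

CLEAN_PROLOGUE = b"\x4c\x8b\xd1\xb8"      # mov r10, rcx ; mov eax, imm32


def looks_hooked(stub: bytes) -> bool:
    """True if the stub starts with a jump instead of the classic prologue."""
    if stub[:4] == CLEAN_PROLOGUE:
        return False
    # E9 = jmp rel32, FF 25 = jmp qword ptr [rip+disp32]: common inline-hook forms.
    return stub[:1] == b"\xe9" or stub[:2] == b"\xff\x25"


def modified_stubs(on_disk: Dict[str, bytes], in_memory: Dict[str, bytes]) -> List[str]:
    """Names whose in-memory bytes differ from the on-disk copy, sorted."""
    return sorted(
        name for name, clean in on_disk.items()
        if name in in_memory and in_memory[name][:len(clean)] != clean
    )


# Synthetic stub bytes with hypothetical SSNs, not read from a real system.
disk = {
    "NtOpenProcess": bytes.fromhex("4c8bd1b826000000"),
    "NtClose":       bytes.fromhex("4c8bd1b80f000000"),
}
memory = {
    "NtOpenProcess": bytes.fromhex("e9fbffff7f000000"),   # jmp rel32 placed by a hook
    "NtClose":       bytes.fromhex("4c8bd1b80f000000"),
}

print(modified_stubs(disk, memory))
```

A difference alone does not say who placed the hook; an EDR and an implant produce the same byte pattern.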

7. Wow64 and Heaven’s Gate

A 32-bit process on 64-bit Windows still ultimately issues a 64-bit SYSCALL, because the only kernel entry the CPU honors from a 64-bit process is KiSystemCall64. The Wow64 layer bridges this:

32-bit ntdll stub -> call through Wow64Transition (into wow64cpu.dll)
           -> far jmp 0x33:<addr>          ; CS=0x23 (32-bit) -> CS=0x33 (64-bit)
           -> wow64cpu!CpupReturnFromSimulatedCode -> wow64.dll / 64-bit ntdll
           -> SYSCALL

The 0x33 / 0x23 CS selector switch is the so-called Heaven’s Gate (community label, not an official Microsoft term). Malware abuses it to:

Execute 64-bit shellcode from a process that defenders are monitoring as a 32-bit target.
Issue syscalls that bypass 32-bit ntdll hooks if the EDR only instruments the Wow64 layer.

Analysts should treat any unexpected far jmp to CS=0x33 in 32-bit code as a strong IOC.
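
One rough way to hunt for that transition is to scan 32-bit code bytes for the direct far-jump and far-call encodings (opcode EA or 9A, a 32-bit offset, then a 16-bit selector). This is a minimal sketch under stated assumptions: it is a raw byte scan, not a disassembler, so it can produce false positives, and it will miss indirect variants such as pushing the selector and executing retf. The function name and the sample bytes are hypothetical.

```python
import struct
from typing import List, Tuple


def find_far_transfers(code: bytes, selector: int = 0x33) -> List[Tuple[int, str, int]]:
    """Return (offset, kind, target) for each far jmp/call ptr16:32 using `selector`."""
    hits = []
    # The encoding is 7 bytes: opcode, 4-byte offset, 2-byte selector.
    for i in range(len(code) - 6):
        opcode = code[i]
        if opcode not in (0xEA, 0x9A):        # EA = jmp far, 9A = call far
            continue
        target, sel = struct.unpack_from("<IH", code, i + 1)
        if sel == selector:
            hits.append((i, "jmp far" if opcode == 0xEA else "call far", target))
    return hits


# Synthetic buffer: two NOPs, then jmp far 0x33:0x00401000, then ret.
sample = b"\x90\x90" + b"\xea" + struct.pack("<IH", 0x00401000, 0x33) + b"\xc3"
print(find_far_transfers(sample))
```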

8. SSDT Hooking: The Classic Rootkit Technique

Before Kernel Patch Protection (and on 32-bit Windows, where it never applied), kernel rootkits manipulated KiServiceTable directly:

Locate the descriptor (KeServiceDescriptorTable is exported on x86 but not on x64; the Shadow descriptor was pattern-scanned).
Disable write protection (clear CR0.WP) or remap the page as writable.
Save the original entry for the target SSN (e.g., NtQueryDirectoryFile, NtEnumerateValueKey).
Overwrite the entry with a pointer to attacker code.
The hook calls the original after filtering results — hiding files, registry keys, processes, or network connections.

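An illustrative read-only inspection (do not modify the table) inside a signed test driver:

// KSERVICE_TABLE_DESCRIPTOR is undocumented, so the driver declares it itself.
// The symbol below is not exported on x64; there it must be located another way.
extern PKSERVICE_TABLE_DESCRIPTOR KeServiceDescriptorTable;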

VOID DumpSsdtSizeAndSample(VOID)
{
    PKSERVICE_TABLE_DESCRIPTOR d = KeServiceDescriptorTable;
    PULONG table = (PULONG)d->ServiceTable;

    DbgPrint("[SSDT] TableSize = %lu\n", d->TableSize);

    for (ULONG i = 0; i < 4 && i < d->TableSize; i++) {
        LONG      enc  = (LONG)table[i];
        ULONG_PTR addr = (ULONG_PTR)table + (enc >> 4);
        DbgPrint("[SSDT] [%lu] encoded=0x%08x -> 0x%p\n", i, enc, (PVOID)addr);
    }
}

// Reading LSTAR to confirm KiSystemCall64[Shadow]
VOID DumpLstar(VOID)
{
    ULONG64 lstar = __readmsr(0xC0000082);
    DbgPrint("[MSR] IA32_LSTAR = 0x%llx (KiSystemCall64[Shadow])\n", lstar);
}

Live inspection from WinDbg on a kernel-debugged target:

0: kd> dt nt!_KSERVICE_TABLE_DESCRIPTOR nt!KeServiceDescriptorTable
0: kd> dq  nt!KeServiceDescriptorTable L4
0: kd> dd  nt!KiServiceTable L20
0: kd> u   poi(nt!KiServiceTable) L5
0: kd> rdmsr c0000082

9. PatchGuard (KPP) and Why SSDT Hooking Died

Since the x64 editions of Windows XP and Windows Server 2003 SP1, Kernel Patch Protection periodically validates a set of protected structures, including KiServiceTable, IDT, GDT, MSR_LSTAR, kernel image code sections, and several driver objects. On mismatch, KPP issues bugcheck 0x109 (CRITICAL_STRUCTURE_CORRUPTION). The checks run from randomized timers and contexts to resist disablement.

The practical result:

SSDT hooking is no longer a viable persistence or hiding primitive on supported 64-bit Windows. Any survival window is short and ends in a BSOD.
Modern kernel-mode attackers use driver callbacks (PsSetCreateProcessNotifyRoutine, ObRegisterCallbacks, minifilters) rather than SSDT patching, because those are the supported extension points and are not policed by KPP.
With HVCI/Memory Integrity enabled, even loading the malicious driver is gated: kernel pages cannot be both writable and executable, and unsigned kernel code cannot enter ring 0 at all. The hypervisor enforces this at the EPT level — PatchGuard becomes a second line, not the first.

10. Direct and Indirect Syscalls (Modern Red Team TTPs)

Because KPP closed the kernel-side door, evasion moved into user mode. Many EDRs hook the Nt* stubs in ntdll.dll by overwriting the first bytes with a JMP into their inspection DLL. Two techniques bypass that:

Direct syscalls. The loader embeds its own mov eax, ssn; syscall; ret stub in attacker memory and calls it instead of ntdll!NtXxx. The hooked ntdll is never touched. SSNs are resolved at runtime, either by parsing stub bytes in the mapped ntdll (the “Hell’s Gate” / “Halo’s Gate” patterns) or by sorting the syscall exports by address.
Indirect syscalls. The mov eax, ssn happens in attacker memory, but the syscall instruction itself is reached by jumping to the syscall byte sequence inside ntdll.dll. The kernel-side return address therefore points back into ntdll, matching what legitimate code looks like in stack-walk telemetry.
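
The address-sorting approach to SSN resolution mentioned above can be sketched without touching a real ntdll. It relies on the observation that syscall stubs are laid out in SSN order, which is a convention rather than a guarantee. The sketch filters on the Zw* names (they share addresses with their Nt* twins and avoid non-syscall exports that merely start with "Nt"); the export names and RVAs are hypothetical.

```python
from typing import Dict


def ssns_by_address(exports: Dict[str, int]) -> Dict[str, int]:
    """Map Nt* names to SSNs by sorting the Zw* exports by RVA."""
    zw = sorted((rva, name) for name, rva in exports.items() if name.startswith("Zw"))
    # The position in address order is taken as the system service number.
    return {"Nt" + name[2:]: ssn for ssn, (_rva, name) in enumerate(zw)}


# Hypothetical export table: name -> RVA.
fake_exports = {
    "ZwBeta": 0x1020,
    "ZwAlpha": 0x1000,
    "ZwGamma": 0x1040,
    "RtlSomethingElse": 0x0500,
}

print(ssns_by_address(fake_exports))
```

Because nothing in the stub bytes is read, this approach keeps working when an EDR has overwritten the prologue, which is why defenders cannot rely on hook integrity alone.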

The detection signal flips between the two:

Technique	What it bypasses	What still sees it
Direct syscall	ntdll user-mode hooks	Stack walk shows `syscall` from unbacked / private memory.
Indirect syscall	ntdll hooks and naive stack-walk checks	Kernel ETW (`Microsoft-Windows-Threat-Intelligence`) sees the syscall regardless of where it was issued from.

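ETW-TI is the answer to indirect syscalls: its events are emitted by the kernel routines that service the request, after the SYSCALL has already landed in KiSystemCall64, so where in user mode the instruction was issued from is irrelevant. It covers the specific operations the provider instruments (listed under ETW Providers below), not every syscall.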

Graph diagram contrasting direct and indirect syscall evasion paths against EDR user-mode hooks, Sysmon CallTrace detection, and kernel-level ETW-TI telemetry firing after the syscall transition — Direct syscalls skip ntdll entirely while indirect syscalls camouflage the return address; ETW-TI catches both because it fires inside the kernel after the ring transition.

11. Common Attacker Techniques

Technique	Description
SSDT hook (legacy)	Overwrite `KiServiceTable[SSN]` to filter results for hiding rootkit artifacts; killed by PatchGuard on x64.
Shadow SSDT hook	Same against `W32pServiceTable` to intercept GUI/keyboard/clipboard syscalls.
Direct syscall stub	Embedded `mov eax, ssn; syscall` in attacker memory to bypass ntdll hooks.
Indirect syscall	Jump to the `syscall` gadget inside ntdll so call stacks look legitimate.
Hell’s Gate / Halo’s Gate	Runtime SSN resolution by parsing/sorting `Nt*` exports in mapped ntdll.
Fresh-copy ntdll	Read clean `ntdll.dll` from disk to re-derive unhooked stubs and SSNs.
Heaven’s Gate	Far jump from 32-bit (`CS=0x23`) to 64-bit (`CS=0x33`) to execute 64-bit syscalls from a Wow64 process.
Driver-based hooking	Where HVCI is off, signed-but-vulnerable drivers (“BYOVD”) are used to write to MSRs or protected pages.

12. Defensive Strategies & Detection

The detection model has shifted from “watch the SSDT” (PatchGuard already does that) to watch how syscalls are issued from user mode and consume kernel ETW.

Sysmon

Event ID	Field	Why it matters
`1`	`ParentImage`, `CommandLine`	Baseline; correlates injection target lineage.
`10`	`GrantedAccess`, `CallTrace`	The `CallTrace` field is the primary direct-syscall tell — legitimate stacks contain `ntdll.dll`; direct syscalls show `UNKNOWN(...)` or RWX private memory regions.
`25`	—	Process image tampering / hollowing.
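
Triage of the CallTrace field can be sketched as a small parser. The sketch assumes the field is a pipe-separated list of module+offset frames, innermost first, with unbacked frames rendered as UNKNOWN(address); the sample strings are illustrative, not captured telemetry, and the function names are hypothetical.

```python
from typing import List, Optional, Tuple

Frame = Tuple[Optional[str], str]     # (module path or None, offset/address text)


def parse_call_trace(call_trace: str) -> List[Frame]:
    """Split a CallTrace string into frames; module is None for unbacked frames."""
    frames: List[Frame] = []
    for raw in call_trace.split("|"):
        raw = raw.strip()
        if not raw:
            continue
        if raw.upper().startswith("UNKNOWN"):
            # Keep the address between the parentheses.
            frames.append((None, raw[raw.find("(") + 1:].rstrip(")")))
        else:
            module, _, offset = raw.rpartition("+")
            frames.append((module, offset))
    return frames


def unbacked_addresses(call_trace: str) -> List[str]:
    return [addr for module, addr in parse_call_trace(call_trace) if module is None]


def innermost_is_ntdll(call_trace: str) -> bool:
    frames = parse_call_trace(call_trace)
    return bool(frames) and frames[0][0] is not None \
        and frames[0][0].lower().endswith("\\ntdll.dll")


SAMPLE_NORMAL = (r"C:\Windows\SYSTEM32\ntdll.dll+9d4c4"
                 r"|C:\Windows\System32\KERNELBASE.dll+2c13e"
                 r"|C:\Tools\app.exe+1a2b")
SAMPLE_DIRECT = r"UNKNOWN(0000021F3C2A1B3F)|C:\Tools\loader.exe+44f0"

print(innermost_is_ntdll(SAMPLE_NORMAL), unbacked_addresses(SAMPLE_NORMAL))
print(innermost_is_ntdll(SAMPLE_DIRECT), unbacked_addresses(SAMPLE_DIRECT))
```

Checking both conditions matters: an indirect syscall keeps ntdll as the innermost frame, so the unbacked-frame test further up the stack is the one that may still fire.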

Sigma — direct-syscall `NtOpenProcess` against LSASS

title: Process Access to LSASS via Direct Syscall (Unbacked Call Stack)
id: 8d0c2a4e-syscall-lsass-unbacked
status: experimental
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10
    TargetImage|endswith: '\lsass.exe'
    GrantedAccess:
      - '0x1010'
      - '0x1410'
      - '0x1fffff'
  unbacked:
    CallTrace|contains: 'UNKNOWN'
  filter_legit:
    SourceImage|endswith:
      - '\MsMpEng.exe'
      - '\MsSense.exe'
  condition: selection and unbacked and not filter_legit
level: high
tags:
  - attack.credential_access
  - attack.t1003.001
  - attack.t1106
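
The GrantedAccess values in the rule are easier to reason about once decoded into individual process access rights. The decoder below is a small sketch using the standard PROCESS_* bit values from the Windows SDK; only a subset of rights is named, and anything else is reported as a leftover mask.

```python
from typing import List

PROCESS_RIGHTS = {
    0x0001: "PROCESS_TERMINATE",
    0x0002: "PROCESS_CREATE_THREAD",
    0x0008: "PROCESS_VM_OPERATION",
    0x0010: "PROCESS_VM_READ",
    0x0020: "PROCESS_VM_WRITE",
    0x0040: "PROCESS_DUP_HANDLE",
    0x0400: "PROCESS_QUERY_INFORMATION",
    0x1000: "PROCESS_QUERY_LIMITED_INFORMATION",
}
PROCESS_ALL_ACCESS = 0x1FFFFF


def decode_granted_access(value: str) -> List[str]:
    """Turn a Sysmon GrantedAccess hex string into a list of right names."""
    mask = int(value, 16)
    if mask == PROCESS_ALL_ACCESS:
        return ["PROCESS_ALL_ACCESS"]
    names = []
    for bit, name in sorted(PROCESS_RIGHTS.items()):
        if mask & bit:
            names.append(name)
            mask &= ~bit
    if mask:
        names.append(f"other(0x{mask:x})")   # bits this sketch does not name
    return names


for granted in ("0x1010", "0x1410", "0x1fffff"):
    print(granted, decode_granted_access(granted))
```

All three masks in the rule include PROCESS_VM_READ, the right a credential dumper needs to read LSASS memory.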

ETW Providers Worth Subscribing To

Provider	Use
`Microsoft-Windows-Threat-Intelligence`	Kernel ETW provider exposing `AllocVm`, `ProtectVm`, `MapViewOfSection`, `ReadVm`/`WriteVm` events. Fires from inside the kernel dispatcher, so direct and indirect syscalls are still visible. Consumer must run as PPL.
`Microsoft-Windows-Kernel-Process`	Process and thread creation, image loads.
`Microsoft-Windows-Kernel-Audit-API-Calls`	Audits selected Nt API calls (verify against current SDK).

Audit Policy

Audit Sensitive Privilege Use — catches SeDebugPrivilege enabling, a near-universal precursor to syscall-based cross-process injection.
Audit Process Creation with command-line capture.
Audit Handle Manipulation with object SACLs on lsass.exe.

Hardening

HVCI / Memory Integrity — single highest-value control. Blocks unsigned and W^X-violating kernel code; defeats BYOVD primitives that try to disable PatchGuard, patch the SSDT, or clear CR0.WP.
VBS + Credential Guard — keeps LSASS secrets off the path even if a syscall reaches NtOpenProcess.
KVA Shadow (the Windows implementation of KPTI) — Meltdown mitigation; when it is enabled, KiSystemCall64Shadow is the LSTAR target.
Driver Signature Enforcement + Microsoft vulnerable-driver blocklist — limits BYOVD options.
EDR ntdll instrumentation — still valuable as a low-cost filter against commodity malware; layer with kernel ETW for the sophisticated cases.

13. Tools for Syscall and SSDT Analysis

Tool	Description	Link
WinDbg	Kernel debugger; resolves `nt!KeServiceDescriptorTable`, `nt!KiServiceTable`, reads MSRs via `rdmsr`.	learn.microsoft.com
Process Hacker	Live handle, thread, and module inspection; surfaces RWX private memory regions.	processhacker.sourceforge.io
Process Monitor	Boot-time and runtime `Nt*` activity captured via minifilter.	learn.microsoft.com
SysmonView / Sysmon	EID 10 `CallTrace`, EID 25 telemetry.	learn.microsoft.com
HollowsHunter / pe-sieve	Detects unbacked / hollowed / patched modules — strong correlator for direct-syscall loaders.	github.com/hasherezade
SwishDbgExt	WinDbg extension with SSDT dumping and decode of the encoded-offset format.	github.com
Volatility 3	Memory forensics; `windows.ssdt` plugin walks the descriptor and decodes entries.	volatilityfoundation.org
j00ru syscall tables	Authoritative per-version SSN reference.	j00ru.vexillium.org
SilkETW / SealighterTI	User-friendly consumers for ETW providers including `Microsoft-Windows-Threat-Intelligence`.	github.com

14. MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Native API	T1106	EID 10 `CallTrace` containing `UNKNOWN`; ETW-TI `AllocVm`/`ProtectVm` from unbacked memory.
Process Injection	T1055	Cross-process `NtAllocateVirtualMemory` + `NtWriteVirtualMemory` + `NtCreateThreadEx` chain via ETW-TI.
DLL Injection	T1055.001	EID 7/8 plus ETW-TI write/protect events into a remote PID.
PE Injection	T1055.002	RWX private allocations followed by remote thread creation.
Process Hollowing	T1055.012	`NtUnmapViewOfSection` followed by `NtWriteVirtualMemory` into the primary image base.
Rootkit	T1014	PatchGuard `0x109` bugchecks; SSDT integrity scans in memory forensics.
Impair Defenses: Disable/Modify Tools	T1562.001	Driver loads with revoked or vulnerable signatures; HVCI/DSE violations.

Summary

Every Windows syscall is a SYSCALL instruction that lands at KiSystemCall64 via MSR_LSTAR and is dispatched through KiServiceTable using the EAX SSN.
The SSDT on x64 stores encoded offsets, not raw pointers — base + (entry >> 4) — and the EAX bit 12 selects between the core and Win32k Shadow tables.
PatchGuard killed SSDT hooking on x64; modern offense has moved to direct and indirect syscalls in user mode and to BYOVD when ring 0 is required.
HVCI/VBS is the strongest defense against the kernel half; kernel ETW (Microsoft-Windows-Threat-Intelligence) is the strongest defense against direct/indirect syscalls because it fires after the transition.
Detect with Sysmon EID 10 CallTrace (unbacked memory in the stack), enrich with ETW-TI, and map to MITRE T1106 / T1055 for response.


HAL and Ntoskrnl: The Kernel Core Components

Objective: Understand the architecture and division of labor between hal.dll (the Hardware Abstraction Layer) and ntoskrnl.exe (the NT kernel and Executive), how they are loaded during boot, the structures and routines each exposes, and how defenders inspect, detect tampering against, and harden these Ring 0 core components.

1. HAL and Ntoskrnl Overview

Two binaries sit at the bottom of Windows kernel mode and everything else builds on them. ntoskrnl.exe is the NT kernel plus the Executive — the policy and service layer of the OS. hal.dll is the Hardware Abstraction Layer — a thin platform shim that hides interrupt controllers, bus topology, timers, and DMA behind a uniform interface so the rest of the kernel stays hardware-independent.

Binary	Full name	Loaded by	Ring
`ntoskrnl.exe`	NT OS Kernel + Executive	`winload.efi`	Ring 0
`hal.dll`	Hardware Abstraction Layer	`winload.efi`	Ring 0

Both reside in %SystemRoot%\System32\. On multiprocessor systems the SMP-aware image ntkrnlmp.exe is selected by the loader and presented as ntoskrnl.exe; modern Windows 10/11 ships only the SMP variant. Verify image identity and signature on a live host with sigcheck, dumpbin /headers, or the WinDbg lm command. The separation exists for portability (HAL absorbs platform differences) and layering (the kernel implements scheduling and policy, not chipset quirks).

2. Boot Handoff: From Bootloader to KiSystemStartup

winload.efi loads ntoskrnl.exe and hal.dll into memory, then transfers control to the kernel entry point KiSystemStartup, passing a pointer to a LOADER_PARAMETER_BLOCK. That structure carries the memory descriptor list, the ARC hardware tree, NLS data, and other boot-time state the kernel needs before it can manage its own memory.

winload.efi
  └─ loads ntoskrnl.exe + hal.dll
       └─ ntoskrnl!KiSystemStartup(PLOADER_PARAMETER_BLOCK)
            ├─ HalInitializeProcessor()    ; HAL brings up per-CPU hardware
            ├─ KiInitializeKernel()        ; KPCR/KPRCB, IDT, GDT
            ├─ Executive phase init:
            │    Mm/Ob/Se/Io/Cm/Ps InitSystem()
            └─ PsInitialSystemProcess()    ; System process (PID 4)
                 └─ Phase 1: smss.exe launched

HAL initializes the processor before the Executive runs a single line of policy code. Secure Boot validates the winload.efi → ntoskrnl.exe / hal.dll chain in firmware, so tampering with either binary on disk breaks the boot chain on a properly configured machine.

Boot sequence flow diagram showing UEFI firmware validating winload.efi which loads hal.dll and ntoskrnl.exe passing a LOADER_PARAMETER_BLOCK before the Executive initializes — Secure Boot validates each link in the chain; winload.efi loads both HAL and the kernel before handing off control to KiSystemStartup.

3. The HAL: Abstracting the Hardware

The HAL translates abstract requests into platform-specific operations: programming the APIC, translating bus-relative addresses, allocating DMA-coherent buffers, and calibrating the stall timer. Drivers and the kernel call HAL routines instead of touching hardware registers directly.

Routine	Purpose
`HalGetInterruptVector`	Translate a bus IRQ to a system interrupt vector and required IRQL
`HalTranslateBusAddress`	Convert a bus-relative address to a logical address
`HalAllocateCommonBuffer`	Allocate DMA-coherent memory visible to CPU and device
`KeStallExecutionProcessor`	Calibrated busy-wait (HAL-implemented on most platforms)
`HalRequestSoftwareInterrupt`	Request a software interrupt at a given IRQL to trigger DPC delivery

On modern ACPI systems the HAL is far thinner than in the NT 4 era. Many classic Hal* exports such as HalGetInterruptVector are deprecated; the PnP/ACPI stack and IoConnectInterruptEx now handle interrupt wiring. Since Windows 8, HAL Extensions (halextpcat.dll, halextintc.dll, and similar PE images loaded by HAL itself) carry SoC- and OEM-specific code without replacing the whole HAL.

4. IRQL: The Kernel’s Preemption Ladder

Interrupt Request Level (IRQL) is the central arbitration mechanism shared by HAL and the kernel. The HAL programs the interrupt controller to enforce IRQL in hardware; running at an IRQL masks all interrupts at or below that level on the current CPU.

IRQL (x64)	Symbolic name	Used for
0	`PASSIVE_LEVEL`	Normal thread execution
1	`APC_LEVEL`	APC delivery; paging allowed
2	`DISPATCH_LEVEL`	Scheduler, spin locks; no paging, no blocking
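3–11	Device IRQLs	Hardware ISRs
12	`SYNCH_LEVEL`	Scheduler synchronization
13	`CLOCK_LEVEL`	Clock interrupt
14	`IPI_LEVEL` / `POWER_LEVEL`	Inter-processor interrupts, power failure
15	`PROFILE_LEVEL` / `HIGH_LEVEL`	Profiling interrupt, NMI, machine check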

The cardinal rule: at DISPATCH_LEVEL or above you may not touch pageable memory or block, because the scheduler and page fault handler cannot run. A driver that dereferences paged-out memory at elevated IRQL produces the classic IRQL_NOT_LESS_OR_EQUAL bug check. Query the current level with KeGetCurrentIrql(). IRQL numeric values are architecture-specific; the table above is the canonical x64 mapping.

Hierarchy diagram of Windows x64 IRQL levels from PASSIVE at 0 up through APC, DISPATCH, CLOCK, IPI, POWER to HIGH at 15 showing preemption priority — Running at DISPATCH_LEVEL or above masks the scheduler and page-fault handler; touching paged-out memory at this level triggers an IRQL_NOT_LESS_OR_EQUAL bug check.

5. The Kernel Layer (Ke): Scheduling and Synchronization

The Ke layer sits directly above HAL and implements thread scheduling, interrupt and exception dispatch, and the low-level synchronization primitives the rest of the system depends on.

Routine	What it does
`KeInitializeSpinLock`	Initialize a spin-lock object
`KeAcquireSpinLock`	Raise IRQL to `DISPATCH_LEVEL` and acquire the lock
`KeReleaseSpinLock`	Release the lock and restore the saved IRQL
`KeInsertQueueDpc`	Queue a Deferred Procedure Call
`KeWaitForSingleObject`	Wait on a dispatcher object (event, mutex, timer, thread)
`KeSetEvent`	Set a kernel event to the signaled state

Dispatcher objects — events, mutexes, semaphores, timers, threads — share a common DISPATCHER_HEADER carrying Type, SignalState, and WaitListHead. The wait machinery keys off that header. The synchronization pattern below runs at PASSIVE_LEVEL, where blocking is legal:

KEVENT readyEvent;
KeInitializeEvent(&readyEvent, NotificationEvent, FALSE);

// ... another thread eventually calls KeSetEvent(&readyEvent, IO_NO_INCREMENT, FALSE);

NTSTATUS status = KeWaitForSingleObject(
    &readyEvent,        // dispatcher object
    Executive,          // wait reason
    KernelMode,         // processor mode
    FALSE,              // non-alertable
    NULL);              // no timeout

Per-CPU scheduler state lives in the KPCR (Kernel Processor Control Region), reachable via gs:[0] on x64, with an embedded KPRCB holding CurrentThread, NextThread, IdleThread, and the DPC queue.

6. The Executive Layer (Ex and Friends)

The Executive comprises the higher-level managers, each identified by a two-letter prefix. They build on Ke primitives and HAL services.

Manager	Prefix	Responsibilities
Object Manager	`Ob`	Object lifecycle, handles, reference counting
Process/Thread Manager	`Ps`	`EPROCESS`/`ETHREAD` creation and teardown
Memory Manager	`Mm`	VAD trees, PTEs, page faults, pool
I/O Manager	`Io`	IRP lifecycle, driver loading
Security Reference Monitor	`Se`	Access checks, tokens, privileges
Configuration Manager	`Cm`	Registry hive management
Executive Support	`Ex`	Pool allocation, lookaside lists, callbacks

Correct pool usage on modern Windows uses ExAllocatePool2, the successor to ExAllocatePoolWithTag (which was deprecated in Windows 10 version 2004, build 19041), paired with ExFreePoolWithTag:

// Allocate non-paged pool with a 4-byte tag (read in WinDbg as 'XgAT').
PVOID buffer = ExAllocatePool2(POOL_FLAG_NON_PAGED, 0x1000, 'TAgX');
if (buffer != NULL) {
    // ... use buffer at IRQL <= DISPATCH_LEVEL ...
    ExFreePoolWithTag(buffer, 'TAgX');
}
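
The reversed tag in the comment above follows from byte order. The sketch below assumes the MSVC convention that a four-character constant places its first character in the most significant byte, and that the resulting 32-bit value is stored little-endian; the helper name is illustrative.

```python
import struct


def tag_as_stored(source_literal: str) -> str:
    """Given the tag as written in C source, return the byte order seen in memory."""
    if len(source_literal) != 4:
        raise ValueError("pool tags are exactly four characters")
    # Value of the multi-character constant: first character in the top byte.
    value = int.from_bytes(source_literal.encode("ascii"), "big")
    # A little-endian store puts the last source character first in memory.
    return struct.pack("<I", value).decode("ascii")


print(tag_as_stored("TAgX"))
```

This is why driver authors conventionally write tags backwards in source: the tag then reads forwards in pool-tracking tools.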

The Object Manager exposes ObReferenceObjectByHandle to convert a handle into a referenced kernel object pointer — the gateway every component crosses when validating access.

7. Key Kernel Structures

A handful of structures are the backbone of process, thread, and CPU state. Defenders and rootkit authors alike walk these every day.

Structure	Key fields
`EPROCESS`	`UniqueProcessId`, `ActiveProcessLinks`, `Token`, `VadRoot`, `Peb`, `ImageFileName[15]`, `ThreadListHead`
`ETHREAD`	`Cid` (CLIENT_ID), `ThreadListEntry`, `Win32StartAddress`, embedded `KTHREAD`
`KTHREAD`	`Header` (DISPATCHER_HEADER), `KernelStack`, `State`, `WaitIrql`, `Teb`
`KPCR`	Per-CPU; IRQL, IDT/GDT pointers, pointer to `KPRCB`
`KPRCB`	`CurrentThread`, `NextThread`, `IdleThread`, DPC queue
`KDPC`	`DeferredRoutine`, `DeferredContext`, `DpcListEntry`

ActiveProcessLinks is a doubly linked LIST_ENTRY chaining every EPROCESS. The Task Manager view of “all processes” is, at bottom, a walk of this list. That makes it a prime DKOM target: unlinking an EPROCESS hides the process from list-based enumeration while it continues to run and be scheduled — covered in Section 10.
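
The effect of unlinking can be modeled with an ordinary circular doubly linked list. This is a user-mode simulation only: the Proc class stands in for an EPROCESS, the sentinel stands in for the list head, and the process names and PIDs are hypothetical.

```python
class Proc:
    """Stand-in for an EPROCESS with an embedded ActiveProcessLinks LIST_ENTRY."""

    def __init__(self, pid: int, name: str):
        self.pid, self.name = pid, name
        self.flink = self.blink = self   # an empty LIST_ENTRY points at itself


def insert_tail(head: Proc, node: Proc) -> None:
    node.blink, node.flink = head.blink, head
    head.blink.flink = node
    head.blink = node


def unlink(node: Proc) -> None:
    """DKOM-style unlink: the neighbors skip the node, the node stays allocated."""
    node.blink.flink = node.flink
    node.flink.blink = node.blink
    node.flink = node.blink = node


def walk(head: Proc) -> list:
    pids, cur = [], head.flink
    while cur is not head:
        pids.append(cur.pid)
        cur = cur.flink
    return pids


head = Proc(-1, "<list head>")
procs = [Proc(4, "System"), Proc(612, "lsass.exe"), Proc(4242, "implant.exe")]
for p in procs:
    insert_tail(head, p)

all_objects = list(procs)        # what an allocation scan would still find
unlink(procs[2])                 # hide the implant from list walkers

listed = walk(head)
scanned = [p.pid for p in all_objects]
print("list walk:", listed)
print("scan:     ", scanned)
print("hidden:   ", sorted(set(scanned) - set(listed)))
```

The unlinked node still exists and, in the real kernel, its threads are still scheduled, because the scheduler does not consult this list.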

8. The SSDT and System Call Dispatch

A user-mode SYSCALL instruction transfers Ring 3 → Ring 0 and lands in ntoskrnl!KiSystemCall64. The dispatcher indexes the System Service Dispatch Table via KeServiceDescriptorTable, which points at KiServiceTable (an array of service routine offsets) and KiArgumentTable (argument byte counts). GUI calls into win32k.sys route through the shadow table KeServiceDescriptorTableShadow.

Patching KiServiceTable so a service index points at attacker code is the classic SSDT hook, historically used by rootkits to intercept NtQuerySystemInformation, NtOpenProcess, and similar. On x64 this is exactly the kind of structure modification PatchGuard validates, so SSDT hooking is loud and largely obsolete on modern systems — but understanding the dispatch path is essential for reading both live disassembly and integrity-check telemetry.

Flow diagram of the Windows system call dispatch path from user-mode SYSCALL instruction through KiSystemCall64 and KeServiceDescriptorTable to the target Nt service routine — The SYSCALL instruction transfers execution to KiSystemCall64, which uses the service index to look up the target routine in KiServiceTable — the structure SSDT hooks manipulate and PatchGuard protects.

9. Live Analysis with WinDbg and Volatility

Load Microsoft symbols and the entire layout becomes navigable. List the core modules and dump structures directly:

0: kd> lm m nt              ; ntoskrnl base, range, symbols
0: kd> lm m hal             ; hal.dll base and range
0: kd> dt nt!_EPROCESS      ; full EPROCESS field layout
0: kd> !process 0 0         ; enumerate processes via ActiveProcessLinks
0: kd> !pcr 0               ; KPCR for CPU 0
0: kd> !prcb 0              ; KPRCB: CurrentThread / IdleThread
0: kd> dps nt!KeServiceDescriptorTable   ; SSDT pointer + service count
0: kd> !idt                 ; IDT vectors (HAL-programmed interrupt routing)

For dead-box memory forensics, Volatility 3 reconstructs the same view from a dump and is the natural cross-check against a possibly compromised live host:

# Enumerate processes and loaded kernel modules from a memory image.
vol -f memory.dmp windows.pslist
vol -f memory.dmp windows.modules

# psscan walks pool tags instead of ActiveProcessLinks; a process that
# appears in psscan but NOT in pslist is a candidate DKOM-unlinked process.
vol -f memory.dmp windows.psscan

A delta between windows.pslist (list-based) and windows.psscan (pool-scan-based) is a strong indicator of ActiveProcessLinks tampering once terminated processes, which psscan also recovers from freed pool, are excluded.
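
The comparison can be automated once both plugin outputs are available as rows. The sketch below assumes each row is a dict with PID and ExitTime keys, as you might get after exporting the plugin output to JSON; those field names and the sample rows are assumptions, so check them against your Volatility version. Because psscan also recovers terminated processes, rows with an exit time are filtered out.

```python
from typing import Dict, Iterable, List


def dkom_candidates(pslist_rows: Iterable[Dict], psscan_rows: Iterable[Dict]) -> List[Dict]:
    """Rows seen by the pool scan, absent from the list walk, and not yet exited."""
    listed = {row["PID"] for row in pslist_rows}
    return [
        row for row in psscan_rows
        if row["PID"] not in listed and not row.get("ExitTime")
    ]


# Hypothetical rows, for illustration only.
pslist = [
    {"PID": 4, "ImageFileName": "System", "ExitTime": None},
    {"PID": 612, "ImageFileName": "lsass.exe", "ExitTime": None},
]
psscan = pslist + [
    {"PID": 3100, "ImageFileName": "notepad.exe", "ExitTime": "2024-01-01T10:00:00"},
    {"PID": 4242, "ImageFileName": "implant.exe", "ExitTime": None},
]

for row in dkom_candidates(pslist, psscan):
    print(row["PID"], row["ImageFileName"])
```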

10. Common Attacker Techniques

Kernel-core abuse turns on either modifying ntoskrnl structures from a loaded driver or exploiting a vulnerability to reach Ring 0 in the first place.

Technique	Description
SSDT hooking	Patch `KiServiceTable` entries to intercept syscalls
DKOM unlinking	Splice an `EPROCESS` out of `ActiveProcessLinks` to hide a process
Kernel callback removal	Strip `PsSetCreateProcessNotifyRoutine` entries to blind EDR
BYOVD	Load a vulnerable signed driver to gain a Ring 0 primitive
Kernel exploitation	Abuse an ntoskrnl/HAL bug to escalate Ring 3 → Ring 0
In-memory image patch	Patch `ntoskrnl.exe` code pages at runtime

A malicious driver is still loaded through the documented path — a Services registry key of Type = 1 followed by a load — which is exactly where detection begins. Bring-Your-Own-Vulnerable-Driver remains popular precisely because it sidesteps the need to find a fresh kernel bug.
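
Because the load path runs through the Services key, a baseline diff of that key is a cheap first detection. The sketch below works on plain dictionaries standing in for a registry snapshot (subkey name mapped to its values); collecting the snapshot, the service names, and the paths shown are all hypothetical.

```python
from typing import Dict, List, Tuple

SERVICE_KERNEL_DRIVER = 0x1     # Type = 1 in the Services key


def new_kernel_drivers(baseline: Dict[str, dict],
                       current: Dict[str, dict]) -> List[Tuple[str, str]]:
    """(name, ImagePath) for kernel-driver services present now but not in the baseline."""
    return sorted(
        (name, values.get("ImagePath", ""))
        for name, values in current.items()
        if values.get("Type") == SERVICE_KERNEL_DRIVER and name not in baseline
    )


# Hypothetical snapshots of HKLM\SYSTEM\CurrentControlSet\Services.
baseline = {
    "disk": {"Type": 1, "ImagePath": r"System32\drivers\disk.sys"},
    "Spooler": {"Type": 0x110, "ImagePath": r"%SystemRoot%\System32\spoolsv.exe"},
}
current = dict(baseline)
current["vulndrv"] = {"Type": 1, "ImagePath": r"\??\C:\Users\Public\vulndrv.sys"}
current["Updater"] = {"Type": 0x10, "ImagePath": r"C:\Program Files\App\upd.exe"}

print(new_kernel_drivers(baseline, current))
```

A new Type = 1 entry whose image lives outside the usual driver directories is a reasonable first thing to triage.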

Graph diagram showing attacker path from BYOVD through Ring 0 code execution branching into DKOM process unlinking, SSDT hooking, and callback removal all leading to hidden process or driver impact — BYOVD is the most common Ring 0 entry point; once there, attackers choose between DKOM, SSDT hooks, or callback removal to achieve persistence and evasion.

11. Defensive Strategies & Detection

Detection centers on driver loads, integrity events, and kernel structure cross-checks.

Sysmon Event ID	Name	Relevance
`6`	Driver Loaded	Kernel driver load with `Signed`, `Hashes`, `Signature` fields
`7`	Image Loaded	Module loads in unusual contexts
`13`	Registry Value Set	New `Services` driver entries

Pair Sysmon with Windows event sources: System Event ID 7045 (a new service was installed; filter on the kernel-mode driver service type), Security Event ID 5038 (image hash invalid, a code integrity failure), and Event ID 6281 (page hash mismatch). The Microsoft-Windows-Kernel-Memory ETW provider surfaces pool allocations useful for hunting pool-based implants.

title: Suspicious Unsigned Kernel Driver Load
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 6
    Signed: 'false'
  filter_legit:
    ImageLoaded|startswith:
      - 'C:\Windows\System32\drivers\'
      - 'C:\Windows\System32\DriverStore\'
  condition: selection and not filter_legit
level: high

Mechanism	Description
PatchGuard (KPP)	Validates SSDT, IDT, GDT, `KPCR`, and kernel code; bug check `0x109` on tampering
Driver Signature Enforcement	`ci.dll` requires Authenticode-signed drivers
HVCI	VTL1 enforces signed Ring 0 code; blunts BYOVD and runtime patching
Secure Boot	Validates the `winload → ntoskrnl/hal` chain in firmware

Operational hardening: enable HVCI (Core Isolation → Memory Integrity), confirm Secure Boot in msinfo32, audit SeLoadDriverPrivilege use, deploy the Microsoft Vulnerable Driver Blocklist (DriverSiPolicy.p7b), monitor HKLM\SYSTEM\CurrentControlSet\Services\ for new Type = 1 entries, and baseline loaded-module hashes against periodic WinPmem/Volatility snapshots.

12. MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Rootkit	`T1014`	Volatility pslist/psscan delta; PatchGuard bug check `0x109`
Kernel Modules and Extensions	`T1547.006`	Sysmon EID 6; Event ID 7045; Services key writes
Exploitation for Privilege Escalation	`T1068`	Crash telemetry, anomalous Ring 0 transitions
Impair Defenses	`T1562.001`	Missing kernel callbacks; EDR self-protection alerts
Process Injection	`T1055`	Kernel `KeStackAttachProcess`/`MmCopyVirtualMemory` use
Modify System Image	`T1601.001`	Code integrity Event ID 5038/6281; PatchGuard

13. Tools for Kernel Analysis

Tool	Description	Link
WinDbg	Live and dump kernel debugging, structure walks	`microsoft.com`
Volatility 3	Memory forensics, pslist/psscan/modules	`volatilityfoundation.org`
WinPmem	Live memory acquisition	`github.com`
Process Hacker	Driver and handle inspection	`processhacker.sourceforge.io`
Sysmon	Driver-load and registry telemetry	`sysinternals.com`
sigcheck	Image signature and hash verification	`sysinternals.com`
Ghidra	Static analysis of drivers and ntoskrnl	`ghidra-sre.org`

14. Summary

HAL and ntoskrnl are the two Ring 0 binaries every other Windows component is built on — HAL abstracts hardware, ntoskrnl implements the kernel and Executive policy layers.
The kernel layer (Ke) supplies scheduling and synchronization; the Executive (Ob, Ps, Mm, Io, Se, Cm, Ex) builds managers on top, all arbitrated by IRQL that the HAL enforces in hardware.
Core structures — EPROCESS, ETHREAD, KPCR, the SSDT — are the backbone of process and CPU state and the prime targets for SSDT hooks, DKOM unlinking, and callback removal.
Detect kernel tampering via Sysmon Event ID 6, Event IDs 7045/5038/6281, and Volatility pslist-vs-psscan deltas; prevent it with HVCI, DSE, Secure Boot, and the vulnerable-driver blocklist.
