Emulation Plan Architecture: Structuring Phases, Objectives, Scenarios, and Success Criteria
There’s a sentence I’ve learned to dread in a post-engagement readout: “We ran a red team and got domain admin in four hours.” Great, but against whose tradecraft? Through which delivery vector, and does an actual adversary use it? A generic red team answers “can we break in.” An adversary emulation plan (AEP) answers a harder, more useful question: “If FIN6 walked into this environment tomorrow, would we see them, and where exactly would we go blind?”
That difference is architectural. You can’t measure detection coverage against a threat you didn’t model on purpose. This tutorial walks through how to build an AEP the way MITRE’s Center for Threat-Informed Defense (CTID) structures theirs — intelligence summary, phases, operational flow, scenarios, objectives, success criteria — and how to wire it into CALDERA for repeatable, scorable execution. Everything here is for defenders, purple teamers, and authorized red teams instrumenting their own networks.
Contents
- 1 What Separates Emulation From a Generic Red Team
- 2 The Intelligence Foundation
- 3 Decomposing the Campaign Into Phases
- 4 Designing the Operational Flow
- 5 Writing Scenarios: From Flow to Procedures
- 6 Defining Objectives and Success Criteria
- 7 Rules of Engagement, Scope, and Authorization
- 8 Machine-Readable Plans: YAML and CALDERA
- 9 Measuring Outcomes and the Purple Team Loop
- 10 Defensive Strategies & Detection
- 11 Tools for Emulation Plan Architecture
- 12 MITRE ATT&CK Mapping
- 13 Summary
- 14 Related Tutorials
- 15 References
What Separates Emulation From a Generic Red Team
A penetration test hunts for any exploitable weakness. A generic red team chases an objective — usually domain admin or a crown-jewel dataset — by whatever path works. Adversary emulation is narrower and more disciplined: operators replicate the documented behavior of a specific named actor or a specific compound technique, sticking to that actor’s known TTPs while allowing latitude in implementation.
MITRE built AEPs as prototype documents assembled from public threat reporting and ATT&CK, so red teams can model adversary behavior and defenders can test their networks against it. The design goal is subtle but load-bearing: you’re building analytics for ATT&CK behaviors, not signatures for one IOC or one tool binary. Catch the behavior and you catch the next actor who reuses it.
CTID publishes two flavors:
| Plan Type | Scope |
|---|---|
| Full emulation | End-to-end replication of one adversary (initial access → exfiltration), e.g. FIN6 or APT29 |
| Micro emulation | A single compound behavior reused across many actors, decoupled from attribution |
Pick full emulation when you’re validating against a named threat in your intel picture. Pick micro emulation for continuous, lightweight control validation between the big set-pieces.
The Intelligence Foundation
Every AEP starts with CTI research, and skipping it is the most common way to produce a plan that’s really just your favorite tooling wearing an actor’s name. The four operational steps of building an AEP are CTI research → technique selection → offensive development → emulation execution, and the first one feeds everything downstream.
Research a candidate using public sources: the actor’s ATT&CK Group page, CISA advisories, and vendor threat reports. Confirm the actor is both relevant to your sector and a significant or growing threat before committing. Then extract attributed techniques across a wide range of tactics and cite every one.
The Intelligence Summary is the first canonical AEP component. It carries the adversary overview — objectives, targets, tools — plus the attributed TTPs and the sources behind each claim.
# Intelligence Summary — <Actor Name> (<Group ID — verify on attack.mitre.org/groups>)
## Actor Overview
One paragraph of attribution and history. Cite every factual claim.
## Motivations
Financial | Espionage | Destruction | Hacktivism
## Target Sectors & Geography
...
## Attributed TTPs
| Tactic | Technique | ATT&CK ID | Source |
|-------------------|----------------------------|-----------|--------|
| Initial Access | Spearphishing Attachment | T1566.001 | [1] |
| Execution | PowerShell | T1059.001 | [2] |
| Credential Access | LSASS Memory | T1003.001 | [3] |
## Cited Sources
1. Vendor report ...
2. CISA advisory ...
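The "cite every one" rule is easy to check mechanically once the TTP table lives in a structured form. The following is a minimal sketch, assuming a hypothetical `AttributedTTP` record and `find_problems` helper (neither comes from CTID tooling). It only checks the shape of each technique ID and the presence of a source, not whether the technique exists in the current ATT&CK release.

```python
import re
from dataclasses import dataclass, field

# Matches T1566 or T1566.001; shape check only, not a lookup against ATT&CK.
TECHNIQUE_ID = re.compile(r"^T\d{4}(\.\d{3})?$")

@dataclass
class AttributedTTP:  # hypothetical record type for this sketch
    tactic: str
    technique: str
    attack_id: str
    sources: list = field(default_factory=list)

def find_problems(ttps):
    """Return (attack_id, reason) pairs for rows that fail basic hygiene."""
    problems = []
    for ttp in ttps:
        if not TECHNIQUE_ID.match(ttp.attack_id):
            problems.append((ttp.attack_id, "malformed technique ID"))
        if not ttp.sources:
            problems.append((ttp.attack_id, "no cited source"))
    return problems

ttps = [
    AttributedTTP("Initial Access", "Spearphishing Attachment", "T1566.001", ["[1]"]),
    AttributedTTP("Execution", "PowerShell", "T1059.001", []),           # uncited
    AttributedTTP("Credential Access", "LSASS Memory", "T1003-001", ["[3]"]),  # typo
]
for attack_id, reason in find_problems(ttps):
    print(f"{attack_id}: {reason}")
```

Running a check like this before the summary is published catches uncited rows while the sources are still fresh.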
A word of caution: ATT&CK group entries are revised between versions, and a group can be deprecated, revoked, or merged into another entry. Confirm the actor’s current G-ID at attack.mitre.org/groups before you publish a plan. I’ve seen a stale G-ID survive three engagements because nobody re-checked.
Decomposing the Campaign Into Phases
Phases are the foundational structural unit of an AEP. A phase is an ordered cluster of ATT&CK tactics that represents a logical stage of the operation, described in terms of the adversary’s goal and how they achieve it.
The original MITRE APT3 plan used three phases:
- Initial compromise / setup
- Network propagation
- Collection / exfiltration
CTID’s FIN6 plan compresses to two: Phase 1 covers initial access and placement, then exfiltration of the data identified along the way; Phase 2 covers the actor’s follow-on operational objectives. APT29, the canonical CTID reference, is organized differently again: an infrastructure section that prepares the environment, plus two scenarios defined in the operations flow.
The lesson: there is no fixed phase count. To build yours, identify the tactics the actor uses, then the techniques and procedures for each, and group them where natural boundaries exist. Split a phase when the strategic objective changes (foothold → internal pivot). Merge phases when the actor’s tradecraft blurs the line. Write each phase as a self-contained block with explicit entry and exit conditions:
## Phase 2 — Internal Pivot & Credential Access
- **Strategic Objective:** Move from the initial foothold to a domain-joined
  host and obtain reusable credentials.
- **ATT&CK Tactics Covered:** TA0007 Discovery, TA0006 Credential Access,
  TA0008 Lateral Movement
- **Entry Condition:** Stable C2 on at least one workstation (Phase 1 exit).
- **Exit Condition:** Operator holds valid domain credentials usable on a
  second host.
- **Success Criteria (binary):**
  - [ ] LSASS credential material extracted from one host
  - [ ] Lateral authentication to a second host succeeds
The first AEP I shipped had beautiful phases and no exit conditions. Operators kept asking “are we done with Phase 2 yet?” over Signal because nobody had written down what “done” meant. Entry/exit conditions are not bureaucracy — they’re the only thing that makes a phase scorable.
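One way to keep "are we done yet?" out of the chat channel is to treat a phase as data. The following is a minimal sketch, assuming a hypothetical `Phase` structure (not a CTID schema) whose success criteria are a mapping of criterion text to a boolean. A phase is done only when every criterion is met, and a phase missing conditions or criteria is reported as unscorable.

```python
from dataclasses import dataclass, field

@dataclass
class Phase:  # hypothetical structure for this sketch, not a CTID schema
    name: str
    entry_condition: str
    exit_condition: str
    criteria: dict = field(default_factory=dict)  # criterion text -> met (bool)

    def is_done(self):
        # A phase with no criteria can never be scored, so it is never "done".
        return bool(self.criteria) and all(self.criteria.values())

def unscorable(phases):
    """Names of phases missing an entry condition, an exit condition, or criteria."""
    return [p.name for p in phases
            if not (p.entry_condition and p.exit_condition and p.criteria)]

phase2 = Phase(
    name="Phase 2",
    entry_condition="Stable C2 on at least one workstation",
    exit_condition="Valid domain credentials usable on a second host",
    criteria={
        "LSASS credential material extracted from one host": True,
        "Lateral authentication to a second host succeeds": False,
    },
)
phase3 = Phase(name="Phase 3", entry_condition="Phase 2 exit", exit_condition="")

print(phase2.is_done())              # stays False until every criterion is met
print(unscorable([phase2, phase3]))  # phases that still need conditions written
```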

Designing the Operational Flow
The Operational Flow is the second canonical component. It chains the individual techniques into a logical narrative — the major steps that commonly occur across the actor’s real operations — and becomes the authoritative sequencing reference every downstream scenario step must obey.
Think of it as a directed graph: nodes are techniques, edges are “this enables that.” Spearphishing Attachment (T1566.001) enables PowerShell execution (T1059.001), which enables Discovery (TA0007), which enables LSASS dumping (T1003.001), which enables SMB lateral movement (T1021.002). Render it as a diagram in the human-readable plan and as an ATT&CK Navigator layer for coverage visualization. The flow is where you sanity-check that your phases actually connect — if a technique has no enabling predecessor, your intelligence has a gap or your sequencing is wrong.
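The directed-graph view translates directly into code. The following is a minimal sketch using the standard library's `graphlib` (Python 3.9+): the `flow` mapping encodes the chain described above, with each node pointing at the set of nodes that must succeed first. The helper names `orphans` and `execution_order` are illustrative, not part of any emulation framework.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# node -> set of nodes that must succeed first ("this enables that")
flow = {
    "T1566.001": set(),          # Spearphishing Attachment (declared entry point)
    "T1059.001": {"T1566.001"},  # PowerShell
    "TA0007":    {"T1059.001"},  # Discovery (tactic-level node)
    "T1003.001": {"TA0007"},     # LSASS Memory
    "T1021.002": {"T1003.001"},  # SMB/Windows Admin Shares
}

def orphans(flow, entry_points):
    """Nodes with no enabling predecessor that are not declared entry points."""
    return sorted(n for n, preds in flow.items()
                  if not preds and n not in entry_points)

def execution_order(flow):
    """One valid sequencing of the flow; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(flow).static_order())

print(execution_order(flow))
print(orphans(flow, entry_points={"T1566.001"}))

# Adding a technique with nothing that enables it exposes the gap immediately.
with_gap = {**flow, "T1041": set()}
print(orphans(with_gap, entry_points={"T1566.001"}))
```

An orphan in the output means either the intelligence is missing a step or the sequencing is wrong, which is the sanity check the flow exists to perform.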

Writing Scenarios: From Flow to Procedures
A scenario translates one stretch of the operational flow into TTP-by-TTP, command-by-command operator instructions. This is the third canonical component — the Emulation Plan itself — the walkthrough that implements the actor’s tradecraft.
Each step should carry: tactic, technique ID, procedure description, the concrete command, and expected output. Scenarios can run end-to-end or as isolated behaviors, and teams routinely customize them to fit their environment or fresh intelligence.
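A scenario step that lacks any of those five fields cannot be executed or scored consistently. The following is a minimal sketch of a completeness check; the field names in `REQUIRED_FIELDS` are assumptions chosen for this example, so rename them to match your own plan format. The sample commands are ordinary built-in Windows discovery utilities.

```python
# Field names are assumptions for this sketch; rename them to match your plan.
REQUIRED_FIELDS = ("tactic", "technique_id", "procedure", "command", "expected_output")

def missing_fields(step):
    """Return the required fields that are absent or blank in a scenario step."""
    return [name for name in REQUIRED_FIELDS
            if not str(step.get(name) or "").strip()]

steps = [
    {
        "tactic": "discovery",
        "technique_id": "T1033",  # System Owner/User Discovery
        "procedure": "Enumerate the current user and group memberships.",
        "command": "whoami /groups",
        "expected_output": "Group list for the logged-on user",
    },
    {
        "tactic": "discovery",
        "technique_id": "T1082",  # System Information Discovery
        "procedure": "Collect OS and patch-level details.",
        "command": "systeminfo",
        "expected_output": "",    # left blank, so the step cannot be scored
    },
]

for index, step in enumerate(steps, start=1):
    gaps = missing_fields(step)
    print(f"step {index}: {'ok' if not gaps else 'missing ' + ', '.join(gaps)}")
```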
Scenario 1 vs. Scenario 2
APT29’s two-scenario structure is the pattern worth stealing. In CTID’s plan, Scenario 1 is the broader, noisier path: a widely sprayed spearphishing payload, a rapid smash-and-grab collection and exfiltration, then a switch to a stealthier toolkit. It is the version that tests breadth of telemetry. Scenario 2 is the targeted, methodical path: low-and-slow and tradecraft-faithful, the version that tests whether your quiet detections fire. Run Scenario 2 to find subtle blind spots; run Scenario 1 to confirm you catch the obvious. Both draw from the same operational flow; they differ in volume and stealth, not in attribution.
Defining Objectives and Success Criteria
Phases are described by the adversary’s intended goal. Objectives make that goal measurable and binary. “Improve lateral movement detection” is unscorable; write “Operator achieves SYSTEM on at least one domain-joined host” or “Operator exfiltrates a 10 MB staged archive over the C2 channel” instead.
Success criteria split into three categories, and a mature plan scores all three independently:
| Category | Question it answers |
|---|---|
| Offensive | Did the operator achieve the objective? |
| Defensive | Did an alert fire and did an analyst respond? |
| Coverage | Was the technique executed and logged, even if not alerted? |
That third row is the one teams forget, and it’s the most valuable. A technique can succeed offensively, generate zero alerts, and still leave telemetry — which means the detection content is the gap, not the sensor. Map each criterion to an ATT&CK Navigator coverage layer so the output is a heat map, not a paragraph.
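Scoring the three categories independently makes it possible to name the kind of gap, not just its existence. The following is a minimal sketch, assuming four booleans per step; the function name `classify_gap` and the gap labels are illustrative and not taken from any standard.

```python
def classify_gap(executed, logged, alerted, responded):
    """Name the first point at which a step fell out of the detection pipeline."""
    if not executed:
        return "not tested"             # offensive criterion failed; nothing to score
    if not logged:
        return "sensor gap"             # no telemetry, so no detection was possible
    if not alerted:
        return "detection content gap"  # telemetry exists, but no rule fired
    if not responded:
        return "response gap"           # alert fired, but no analyst acted on it
    return "covered"

# Hypothetical results for three steps of a run.
results = {
    "T1566.001": classify_gap(executed=True, logged=False, alerted=False, responded=False),
    "T1059.001": classify_gap(executed=True, logged=True, alerted=False, responded=False),
    "T1003.001": classify_gap(executed=True, logged=True, alerted=True, responded=True),
}
for technique, gap in results.items():
    print(f"{technique}: {gap}")
```

The ordering of the checks matters: a missing sensor makes every later question moot, so it is tested first.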

Rules of Engagement, Scope, and Authorization
None of the above runs until the planning and scoping phase is in writing. Red teamers work with stakeholders to clarify objectives, set boundaries, define success criteria, and establish legal and compliance parameters. The authorization document must contain, at minimum:
- Target system inventory and explicitly out-of-scope assets (map these back to specific plan phases — if Phase 3 touches a system that’s off-limits, you know before execution, not during)
- Permitted vs. prohibited techniques (e.g., LSASS dumping allowed in audit only; no destructive TA0040 actions)
- Emergency stop / rollback procedure and who can invoke it
- Communications plan — out-of-band channel, escalation contacts
- Legal authorization signed by someone empowered to grant it
Tie every scope limit to a phase. “No production database access” is an exit-condition modifier for whatever phase reaches collection.
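Mapping scope limits to phases can be checked before execution instead of discovered during it. The following is a minimal sketch, assuming a hypothetical `PlannedStep` shape with a phase number, an ATT&CK tactic ID, and a target hostname; the hostnames and step IDs are invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PlannedStep:  # hypothetical shape for one plan step
    step_id: str
    phase: int
    tactic: str   # ATT&CK tactic ID, e.g. "TA0006"
    target: str   # hostname the step touches

def scope_violations(steps, out_of_scope, prohibited_tactics):
    """Return (step_id, phase, reason) for every step that breaks the ROE."""
    violations = []
    for s in steps:
        if s.target in out_of_scope:
            violations.append((s.step_id, s.phase, f"target {s.target} is out of scope"))
        if s.tactic in prohibited_tactics:
            violations.append((s.step_id, s.phase, f"tactic {s.tactic} is prohibited"))
    return violations

plan_steps = [
    PlannedStep("2.A.1", 2, "TA0006", "ws-014"),      # Credential Access
    PlannedStep("3.A.1", 3, "TA0009", "db-prod-01"),  # Collection on a protected host
    PlannedStep("3.B.1", 3, "TA0040", "fs-02"),       # Impact, prohibited by the ROE
]
for violation in scope_violations(plan_steps,
                                  out_of_scope={"db-prod-01"},
                                  prohibited_tactics={"TA0040"}):
    print(violation)
```

Because each violation carries its phase number, the output shows which phase's exit condition needs a modifier.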
Machine-Readable Plans: YAML and CALDERA
The human-readable plan is paired with a machine-readable plan in YAML, designed so each step couples directly to its human-readable equivalent. CTID’s schema started from Red Canary’s Atomic Red Team format and extended it to capture threat intelligence. That YAML is what CALDERA ingests for automated execution. CALDERA lets you define adversary profiles by ATT&CK technique ID, deploy agents, and run operations that either follow the playbook or autonomously chain techniques based on what they discover.
A single annotated step:
- id: 1.A.1
  name: Spearphishing Attachment
  description: >
    Deliver a weaponized document to a target inbox to gain initial code
    execution, mirroring the actor's documented delivery TTP.
  tactic: initial-access        # TA0001
  technique:
    attack_id: T1566.001
    name: "Phishing: Spearphishing Attachment"
  platforms: [windows]
  executor:
    name: manual                # operator action; no payload shown
    command: >
      # Send crafted document from staging mailbox per ROE
    cleanup: >
      Remove delivered artifact from target host and mailbox.
Because each YAML step mirrors a human-readable step, you can generate an operator checklist mechanically — useful for keeping the two representations in sync:
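import yaml  # PyYAML

def load_plan(path):
    """Parse the machine-readable plan into plain Python objects."""
    with open(path, encoding="utf-8") as f:
        return yaml.safe_load(f)

def build_checklist(steps):
    """Print one operator checklist entry per plan step."""
    for step in steps:
        tech = step["technique"]
        print(f"[{step['id']}] {step['name']}")
        print(f"  ATT&CK : {tech['attack_id']} ({tech['name']})")
        print(f"  Tactic : {step['tactic']}")
        print(f"  Action : {step['executor']['command'].strip()}")
        # expected_artifact is optional; fall back to the human-readable plan
        print(f"  Expect : {step.get('expected_artifact', 'see human-readable plan')}")
        print("-" * 60)

if __name__ == "__main__":
    # Assumes a top-level "steps" list whose entries look like the step above
    plan = load_plan("emulation_plan.yaml")
    build_checklist(plan["steps"])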
Measuring Outcomes and the Purple Team Loop
After execution, score every step. Keep the rubric flat and binary so two analysts produce the same numbers:
| Step Name | ATT&CK ID | Executed (Y/N) | Alert Generated (Y/N) | Analyst Notified (Y/N) | Blocked (Y/N) | Notes |
|---|---|---|---|---|---|---|
| Spearphishing delivery | T1566.001 | Y | N | N | N | No mail-sandbox detonation |
| PowerShell execution | T1059.001 | Y | Y | Y | N | ScriptBlock 4104 fired |
| LSASS dump | T1003.001 | Y | Y | N | Y | EDR blocked; no analyst page |
Roll those rows into an ATT&CK Navigator layer to visualize coverage — Phase 1 in one color band, Phase 2 in another:
{
"name": "AEP Coverage — Phase 1 vs Phase 2",
"domain": "enterprise-attack",
"techniques": [
{ "techniqueID": "T1566.001", "score": 1, "color": "#66b1ff",
"comment": "Phase 1 — Initial Access (executed, not detected)" },
{ "techniqueID": "T1003.001", "score": 2, "color": "#ff9f40",
"comment": "Phase 2 — Credential Access (blocked)" }
]
}
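A layer like this can be generated from the scoring rows instead of being typed by hand. The following is a minimal sketch that emits only the fields shown in the layer above; a layer imported into ATT&CK Navigator may need additional metadata beyond these. The row keys, the `score_row` mapping (0 = not executed, 1 = executed without alert or block, 2 = alerted or blocked), and the phase assignments are assumptions made for this example.

```python
import json

PHASE_COLORS = {1: "#66b1ff", 2: "#ff9f40"}  # same colors as the layer above

def score_row(row):
    """0 = not executed, 1 = executed with no alert or block, 2 = alerted or blocked."""
    if not row["executed"]:
        return 0
    return 2 if (row["alert"] or row["blocked"]) else 1

def build_layer(name, rows):
    """Build a dict with the same top-level keys as the hand-written layer."""
    return {
        "name": name,
        "domain": "enterprise-attack",
        "techniques": [
            {
                "techniqueID": r["attack_id"],
                "score": score_row(r),
                "color": PHASE_COLORS[r["phase"]],
                "comment": r["notes"],
            }
            for r in rows
        ],
    }

rows = [
    {"attack_id": "T1566.001", "phase": 1, "executed": True,
     "alert": False, "blocked": False, "notes": "No mail-sandbox detonation"},
    {"attack_id": "T1003.001", "phase": 2, "executed": True,
     "alert": True, "blocked": True, "notes": "EDR blocked; no analyst page"},
]
layer = build_layer("AEP Coverage, Phase 1 vs Phase 2", rows)
print(json.dumps(layer, indent=2))
```

Generating the layer from the score sheet keeps the heat map and the rubric from drifting apart between re-runs.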
The debrief is where value compounds. Map identified TTPs back to specific CALDERA abilities, reconstruct the full chain, and feed it into continuous purple teaming — re-running the failed steps after each detection fix until the heat map goes green.

Defensive Strategies & Detection
This is the part defenders own: instrument the environment so each phase produces measurable telemetry before you run it. If the sensor’s blind, your “missed” score is meaningless.
Sysmon coverage per phase
| Phase Category | Key Sysmon Event IDs |
|---|---|
| Initial Access / Execution | EID 1 (Process Create), EID 3 (Network Connection), EID 7 (Image Load), EID 11 (File Create) |
| Persistence | EID 13 (Registry Set), EID 12 (Registry Create/Delete), EID 1 (new service/task process) |
| Privilege Escalation | EID 1 (parent/child token anomalies), EID 10 (Process Access — LSASS reads) |
| Lateral Movement | EID 3 (outbound SMB/WinRM), EID 1 (PsExec/WMIC children), EID 25 (Process Tampering) |
| Collection / Exfiltration | EID 11 (staging writes), EID 3 (outbound to C2) |
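The table above doubles as a pre-flight check: compare the event IDs actually arriving from a test host with what each phase needs. The following is a minimal sketch, assuming you have already collected the set of Sysmon event IDs observed during a baseline window. How that set is gathered (SIEM query, event log export) is left out, and `missing_by_phase` is a hypothetical helper name.

```python
# Required Sysmon event IDs per phase, copied from the table above.
REQUIRED_EIDS = {
    "Initial Access / Execution": {1, 3, 7, 11},
    "Persistence": {1, 12, 13},
    "Privilege Escalation": {1, 10},
    "Lateral Movement": {1, 3, 25},
    "Collection / Exfiltration": {3, 11},
}

def missing_by_phase(observed_eids):
    """Map each phase to the required event IDs that were never observed."""
    gaps = {}
    for phase, required in REQUIRED_EIDS.items():
        missing = required - observed_eids
        if missing:
            gaps[phase] = sorted(missing)
    return gaps

# Hypothetical baseline: the host only emitted these event IDs.
observed = {1, 3, 11, 13}
for phase, eids in missing_by_phase(observed).items():
    print(f"{phase}: missing EID {', '.join(map(str, eids))}")
```

Any phase listed in the output would produce "missed" scores that reflect the sensor configuration and not the detection content.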
ETW providers and audit policy
Enable Microsoft-Windows-Security-Auditing (4624/4625/4648 logon, 4672 special privileges assigned at logon, 4688 with command line), Microsoft-Windows-PowerShell/Operational (4104 ScriptBlock, 4103 module), and Microsoft-Windows-WMI-Activity/Operational (5857–5861). Turn on Audit Process Creation, Audit Logon, Audit Special Logon, Audit Object Access, Audit Privilege Use, and Audit Detailed File Share. Event 4688 only carries the command line when the “Include command line in process creation events” policy is also enabled.
A Sigma sketch to validate a credential-access phase objective:
title: LSASS Memory Access — AEP Credential-Access Phase Validation
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 10
    TargetImage|endswith: '\lsass.exe'
    GrantedAccess: '0x1410'
  condition: selection
level: high
Before you score any “Blocked” criterion, confirm EDR is in blocking mode, not detect-only. Run ASR rules in audit mode during emulation so the plan captures what would have been blocked, and baseline your Sysmon config on a known-good template (SwiftOnSecurity or Olaf Hartong) so missing event IDs don’t masquerade as missing attacks.
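To dry-run the Sigma sketch against exported Sysmon events before a SIEM rule exists, the selection can be evaluated by hand. The following is a minimal sketch, not a Sigma engine: the hypothetical `matches` helper supports only exact match and the `endswith` modifier, and it compares case-insensitively on the assumption that this mirrors Sigma's default string handling. The sample events are invented.

```python
def matches(selection, event):
    """True if every field in the selection matches the event (AND semantics)."""
    for key, expected in selection.items():
        field_name, _, modifier = key.partition("|")
        actual = str(event.get(field_name, "")).lower()
        wanted = str(expected).lower()
        if modifier == "endswith":
            ok = actual.endswith(wanted)
        elif modifier == "":
            ok = actual == wanted
        else:
            raise ValueError(f"unsupported modifier: {modifier}")
        if not ok:
            return False
    return True

# Same selection as the Sigma sketch above.
selection = {
    "EventID": 10,
    "TargetImage|endswith": "\\lsass.exe",
    "GrantedAccess": "0x1410",
}

lsass_read = {
    "EventID": 10,
    "SourceImage": "C:\\Tools\\emulation-agent.exe",  # invented path
    "TargetImage": "C:\\Windows\\System32\\lsass.exe",
    "GrantedAccess": "0x1410",
}
other_access = dict(lsass_read, GrantedAccess="0x1000")

print(matches(selection, lsass_read))    # the emulated step should match
print(matches(selection, other_access))  # a different access mask does not
```

The second event shows how narrow an exact `GrantedAccess` match is: a tool requesting a different mask would pass unnoticed, which is worth recording as a coverage note.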
Tools for Emulation Plan Architecture
| Tool | Description | Link |
|---|---|---|
| MITRE CALDERA | Ingests YAML plans; runs agent-based, ATT&CK-mapped operations | caldera.readthedocs.io |
| ATT&CK Navigator | Coverage heat maps and per-phase layer overlays | mitre-attack.github.io |
| Atomic Red Team | Compatible YAML test format; per-technique atomics | atomicredteam.io |
| CTID Emulation Library | Reference full/micro plans (APT29, FIN6, APT3) | ctid.mitre.org |
| Sysmon | Process/network/registry telemetry for outcome scoring | sysinternals.com |
| Sigma | Portable detection rules for validating phase outcomes | sigmahq.io |
MITRE ATT&CK Mapping
| Tactic / Technique | MITRE ID | Detection |
|---|---|---|
| Initial Access (tactic) | TA0001 | Mail sandboxing; EID 11 artifact writes |
| Execution (tactic) | TA0002 | EID 1 + 4688 command line; 4104 ScriptBlock |
| Credential Access (tactic) | TA0006 | EID 10 LSASS reads; 4672 privilege use |
| Lateral Movement (tactic) | TA0008 | EID 3 outbound SMB; PsExec/WMIC children |
| Exfiltration (tactic) | TA0010 | EID 3 to C2; egress volume baselines |
| Spearphishing Attachment | T1566.001 | Mail detonation; EID 1 office-spawn chains |
| PowerShell | T1059.001 | 4104/4103; Constrained Language Mode |
| LSASS Memory Dumping | T1003.001 | EID 10 GrantedAccess 0x1410; EDR block |
| SMB/Admin Shares | T1021.002 | EID 3 + Audit Detailed File Share |
| Exfiltration Over C2 | T1041 | EID 3 outbound connections initiated to known C2 infrastructure |
| Web Protocols (C2) | T1071.001 | Proxy/JA3 anomalies; infra-phase setup |
Reference group profiles for examples: APT3 (G0022, original three-phase plan), APT28 (G0007), APT29 (G0016, canonical CTID two-scenario reference), and FIN6 (verify the current G-ID on attack.mitre.org/groups before citing it).
Summary
- An emulation plan is architecture, not improvisation — intelligence summary, operational flow, and the TTP-by-TTP emulation plan are the three load-bearing components, and every step traces back to cited intelligence and an ATT&CK ID.
- Phases are ordered tactic clusters with explicit entry and exit conditions — APT3 used three, FIN6 uses two; let the actor’s tradecraft and your scoped objectives decide the count, never a template.
- Scenarios turn flow into commands, and objectives turn goals into binary pass/fail — score offensive success, defensive response, and logging coverage as three separate measurements.
- The YAML plan couples human-readable steps to CALDERA execution, making runs repeatable and the purple-team feedback loop continuous.
- Detection coverage is the real deliverable: instrument Sysmon EID 1/3/10/11, ScriptBlock 4104, and per-phase audit policy before execution, then render results as an ATT&CK Navigator heat map to expose blind spots.
Related Tutorials
- Adversary Emulation vs. Adversary Simulation: Definitions, Differences, and Why It Matters
- APT Profiling: How to Build a Comprehensive Adversary Profile from Open-Source Intelligence
- Mapping CTI Reports to ATT&CK TTPs: A Step-by-Step Methodology
- Cyber Threat Intelligence (CTI) Fundamentals: Sources, Types, and the Intelligence Lifecycle
- Navigating ATT&CK Navigator: Building, Annotating, and Exporting Technique Layers