OSINT for People and Credentials: LinkedIn, Breach Data, and Email Harvesting

Objective: Understand how adversaries assemble a pre-engagement targeting package — employee identities, email addresses, and exposed credentials — from public sources such as LinkedIn, breach databases, and email-discovery APIs, and learn the matching detection and hardening guidance that lets defenders run the same playbook against their own organization.

1. What OSINT Reconnaissance Is (and Isn’t)

Open-Source Intelligence (OSINT) is the collection and correlation of information from publicly available sources. In a red team context it forms the Reconnaissance phase that precedes any packet sent to the target.

The critical distinction is passive versus active:

Concept	What It Actually Is
Passive OSINT	Querying third-party databases, search engines, and public records. No packet ever reaches the target, so the target cannot detect you.
Active recon boundary	Direct interaction with target infrastructure — DNS zone transfers, port scans, banner grabbing. The target can log it.
Email format inference	Deriving a standard format from confirmed samples, then extrapolating across all discovered names.
Credential stuffing pipeline	Cross-referencing leaked credential databases against a domain to find reusable passwords for spraying or stuffing.

Everything in this tutorial is passive or queries third-party services — never the target. Even so, all activity must sit inside a signed rules of engagement (RoE) and scope document. You only run breach-domain searches and authenticated harvesting against domains you own or are explicitly authorized to test. Storing breach data carries legal weight; handle it like the regulated material it is.

2. The Adversary’s Goal: Building a Targeting Package

The output of this phase is a structured targeting package. A complete one contains:

Employee list — names, titles, departments, reporting structure.
Email addresses — confirmed or inferred from the corporate format.
Exposed credentials — breach hits tied to those addresses.
Tech stack — EDR, VPN, and cloud platforms gleaned from job postings.
Attack surface — subdomains and employee-facing portals.

This maps directly to ATT&CK Reconnaissance (TA0043): gathering identity information (T1589), org information (T1591), and searching open websites (T1593). The package’s value is leverage — it converts anonymous infrastructure into named humans with reusable passwords and a known authentication portal.

Figure (flow diagram): LinkedIn harvesting, email inference, breach lookups, and certificate transparency logs feed a unified targeting package that drives credential spraying and phishing. All four OSINT streams converge into that single package before any active exploitation begins.

3. LinkedIn People Harvesting

LinkedIn is the richest single source of employee identity data. Automated scraping violates its User Agreement whether or not you are logged in, so red teams stick to passive search-engine methods.

The primary technique is Google dorking — crafted search queries that pull indexed profiles without touching LinkedIn directly:

# Run only against organizations you have written authorization to assess.
# Illustrative dork strings — patterns, not automated scrapers.

site:linkedin.com/in "Target Corp" "Security Engineer"
site:linkedin.com/in "Target Corp" "Cloud Administrator"

Beyond names and titles, job postings leak the tech stack. A listing requiring “experience with CrowdStrike Falcon” confirms the EDR platform; a VPN product name reveals the remote-access surface. Each discovered name feeds two downstream tasks: email-address derivation and lure crafting for later social engineering.

What an adversary derives from purely public profiles:

Technique	Description
Name and title harvesting	Build the employee roster and org chart.
Department structure mapping	Identify privileged roles (IT, finance, HR).
Tech-stack inference	Read EDR/VPN/cloud product names from job ads.
Movement tracking	Spot new hires (weaker awareness) and recent departures.
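Tech-stack inference is simple enough to run against your own organization's job ads as an exposure audit. The sketch below is a minimal illustration: the `STACK_KEYWORDS` map and the `infer_stack` function are hypothetical names, and the product list is an assumption you would replace with the tools you actually care about.

```python
import re

# Hypothetical keyword map; extend it with the products relevant to your own audit.
STACK_KEYWORDS = {
    "EDR": ["CrowdStrike Falcon", "SentinelOne", "Microsoft Defender for Endpoint"],
    "VPN": ["GlobalProtect", "AnyConnect", "FortiClient"],
    "Cloud": ["AWS", "Azure", "Google Cloud"],
}

def infer_stack(posting_text):
    """Return {category: [products mentioned]} for one job-posting text."""
    found = {}
    for category, products in STACK_KEYWORDS.items():
        for product in products:
            # Word boundaries stop short names such as "AWS" matching inside longer words.
            if re.search(rf"\b{re.escape(product)}\b", posting_text, re.IGNORECASE):
                found.setdefault(category, []).append(product)
    return found

ad = "Requires experience with CrowdStrike Falcon and AWS; AnyConnect a plus."
print(infer_stack(ad))
```

Running this over the text of every open role gives a quick list of product names worth removing from public postings.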

4. Email Harvesting with theHarvester

theHarvester is the canonical recon tool for this phase. It gathers names, emails, IPs, subdomains, and URLs from 40+ public resources, determining a domain’s external threat landscape without contacting the target.

theHarvester invocation:

# Authorized engagements only — run against domains in your signed scope.
theHarvester -d example-corp.com -b bing,linkedin,hunter -l 500 -f results.json

Flag breakdown:

Flag	Purpose
`-d <domain>`	Target domain to enumerate.
`-b <source>`	Comma-separated data sources (`bing`, `google`, `linkedin`, `hunter`, `censys`, `certspotter`, `shodan`).
`-l <limit>`	Cap on results retrieved per source.
`-f <file>`	Write structured output (JSON/XML) for later correlation.

Several sources — hunter, censys, shodan — require API keys configured in theHarvester’s api-keys.yaml. The output is a deduplicated set of email addresses, subdomains, and hostnames you carry forward into format inference and breach lookups.

5. Email Format Inference and Verification

A handful of confirmed addresses reveals the corporate email format. Extrapolate that pattern across the LinkedIn roster to generate every employee’s likely address.

The six dominant corporate archetypes:

Pattern	Example
`firstname.lastname`	`jane.doe@domain.com`
`firstnamelastname`	`janedoe@domain.com`
`flastname`	`jdoe@domain.com`
`firstname`	`jane@domain.com`
`f.lastname`	`j.doe@domain.com`
`firstname_lastname`	`jane_doe@domain.com`
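The six patterns in the table can be expressed as small builder functions, which makes inference a matter of checking which builder reproduces the confirmed samples. This is a sketch under stated assumptions: `PATTERNS`, `infer_pattern`, and `apply_pattern` are illustrative names, names are assumed to be non-empty single tokens, and the sample addresses are invented.

```python
# One builder per pattern from the table; f and l are lowercase first and last names.
PATTERNS = {
    "firstname.lastname": lambda f, l: f"{f}.{l}",
    "firstnamelastname": lambda f, l: f"{f}{l}",
    "flastname": lambda f, l: f"{f[0]}{l}",
    "firstname": lambda f, l: f,
    "f.lastname": lambda f, l: f"{f[0]}.{l}",
    "firstname_lastname": lambda f, l: f"{f}_{l}",
}

def infer_pattern(samples):
    """samples: list of (first, last, confirmed_email). Return the best-matching pattern name or None."""
    counts = {}
    for first, last, email in samples:
        local = email.split("@", 1)[0].lower()
        for name, build in PATTERNS.items():
            if build(first.lower(), last.lower()) == local:
                counts[name] = counts.get(name, 0) + 1
    return max(counts, key=counts.get) if counts else None

def apply_pattern(pattern, roster, domain):
    """Generate one candidate address per (first, last) pair in the roster."""
    build = PATTERNS[pattern]
    return [f"{build(f.lower(), l.lower())}@{domain}" for f, l in roster]

confirmed = [("Jane", "Doe", "jane.doe@example-corp.com"),
             ("John", "Roe", "john.roe@example-corp.com")]
pattern = infer_pattern(confirmed)
print(pattern, apply_pattern(pattern, [("Alex", "Kim")], "example-corp.com"))
```

Counting matches across several samples, rather than trusting one, guards against an organization that changed formats or uses aliases for some staff.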

Hunter.io automates detection: its domain-search endpoint returns a pattern field naming the format explicitly, plus per-address confidence scores.

# Authorized scope only. Requires a Hunter.io API key.
import requests

def hunter_domain_search(domain, api_key):
    url = "https://api.hunter.io/v2/domain-search"
    params = {"domain": domain, "api_key": api_key}
    r = requests.get(url, params=params, timeout=20)
    r.raise_for_status()
    data = r.json()["data"]

    print(f"[+] Detected format: {data.get('pattern')}")
    for e in data.get("emails", []):
        print(f"    {e['value']:35} confidence={e['confidence']}")

# hunter_domain_search("example-corp.com", "<API_KEY>")

Validate an inferred format passively by confirming sample addresses in breach databases (next section) rather than actively probing the target’s SMTP server.

6. Breach Data with Have I Been Pwned

Have I Been Pwned (HIBP) aggregates data from hundreds of breaches covering billions of accounts. The v3 API is current; per-account and domain endpoints require the hibp-api-key header and a descriptive User-Agent.

Per-account breach lookup:

# Authorized accounts only (e.g., your own domain's mailboxes).
import requests

def hibp_account(account, api_key):
    url = f"https://haveibeenpwned.com/api/v3/breachedaccount/{account}"
    headers = {"hibp-api-key": api_key, "User-Agent": "RedTeam-Recon-Lab"}
    r = requests.get(url, headers=headers, params={"truncateResponse": "false"}, timeout=20)
    if r.status_code == 404:
        return []          # clean — no breaches
    r.raise_for_status()
    for b in r.json():
        severity = "HIGH" if "Passwords" in b["DataClasses"] else "INFO"
        print(f"[{severity}] {b['Name']} ({b['BreachDate']}) -> {b['DataClasses']}")
    return r.json()

Key breach-metadata fields: Name, BreachDate, DataClasses, IsVerified, and IsFabricated. Treat IsFabricated: true entries with caution — they may be unreliable.
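One way to act on those fields is to triage the returned records before reporting. The sketch below assumes the record shape returned by the lookup above; `triage_breaches` is a hypothetical helper, the sample records are invented, and dropping fabricated entries outright is a design choice rather than HIBP guidance.

```python
def triage_breaches(breaches):
    """Drop fabricated records, then sort so password exposures come first,
    verified breaches before unverified ones, and newer breaches before older."""
    kept = [b for b in breaches if not b.get("IsFabricated", False)]

    def score(b):
        has_password = "Passwords" in b.get("DataClasses", [])
        # BreachDate is an ISO date string, so plain string comparison orders it correctly.
        return (has_password, b.get("IsVerified", False), b.get("BreachDate", ""))

    return sorted(kept, key=score, reverse=True)

sample = [  # invented records for illustration
    {"Name": "ForumA", "BreachDate": "2019-03-01", "DataClasses": ["Email addresses"],
     "IsVerified": True, "IsFabricated": False},
    {"Name": "ShopB", "BreachDate": "2021-07-15", "DataClasses": ["Email addresses", "Passwords"],
     "IsVerified": True, "IsFabricated": False},
    {"Name": "FakeC", "BreachDate": "2022-01-01", "DataClasses": ["Passwords"],
     "IsVerified": False, "IsFabricated": True},
]
print([b["Name"] for b in triage_breaches(sample)])
```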

The /breacheddomain/ endpoint searches an entire domain at once, but it requires a paid plan and verified domain ownership — by design, you can only run it against a domain you control. That same constraint makes it a legitimate blue-team monitoring tool.

Privacy-preserving password check (k-Anonymity):

The /range/ endpoint requires no API key and never sends the full hash. You SHA-1 the candidate password, send only the first 5 characters of the hash, and match the returned suffix list locally.

import hashlib, requests

def pwned_password(password):
    sha1 = hashlib.sha1(password.encode()).hexdigest().upper()
    prefix, suffix = sha1[:5], sha1[5:]
    r = requests.get(f"https://api.pwnedpasswords.com/range/{prefix}", timeout=20)
    r.raise_for_status()
    for line in r.text.splitlines():
        h, count = line.split(":")
        if h == suffix:
            return int(count)          # times seen in breaches
    return 0

The full password never leaves your machine — this is the model defenders should adopt for any internal password-exposure check.

7. Deeper Breach Intelligence: DeHashed, IntelligenceX, and Paste Sites

HIBP confirms that an account was breached; it does not return passwords. For credential investigation, red teams reach for paid platforms.

Service	What It Adds
DeHashed	Plaintext/hashed passwords, usernames, IPs tied to an email; lets you check whether the same hash recurs across accounts (reuse).
IntelligenceX	Indexes paste-site content and leak archives for near-real-time monitoring.
BreachDirectory	Ongoing credential-exposure tracking.
Pastebin / GitHub Gist	Credentials and internal data frequently surface here before removal.

If a target email appears in DeHashed with a known password, that password may have been reused on corporate VPNs, mail portals, or cloud consoles — the basis of the credential-stuffing pipeline. Accessing and storing this material carries real legal constraints: retain only what the engagement requires, encrypt it at rest, and destroy it per the RoE.

8. Certificate Transparency for Subdomain Enumeration

Every TLS certificate issued for a domain is logged in public Certificate Transparency (CT) logs. Querying them discovers subdomains that never appear in DNS brute-forcing — and crucially, this is passive: you query a third-party log, not the target.

# crt.sh CT-log query — passive subdomain enumeration.
import requests

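def crtsh_subdomains(domain):
    # Let requests URL-encode the % wildcard instead of embedding it raw in the URL.
    r = requests.get("https://crt.sh/", params={"q": f"%.{domain}", "output": "json"}, timeout=30)
    r.raise_for_status()
    subs = set()
    for row in r.json():
        # name_value can hold several newline-separated names per certificate.
        for name in row["name_value"].splitlines():
            subs.add(name.removeprefix("*.").lower())
    for s in sorted(subs):
        print(s)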

# crtsh_subdomains("example-corp.com")

Discovered hosts like vpn.example-corp.com or mail.example-corp.com correlate back to the harvested employees — these are the portals where breach credentials get sprayed.

9. Correlating Findings into an Attack Path

Reconnaissance is only useful when chained. The logical flow:

People (LinkedIn) → roster of names and titles.
Email format (Hunter.io) → addresses for every name.
Breach hits (HIBP / DeHashed) → which addresses leaked, and which leaked passwords.
Portals (crt.sh) → where those credentials authenticate.
Spray candidates → privileged accounts without MFA, ranked by exploitability.

Two illustrative correlation helpers — dork construction and authorized format validation:

# Dork strings illustrate patterns only — no automated scraping.
linkedin = 'site:linkedin.com/in "TargetCorp" "engineer"'
github   = 'org:targetcorp filename:.env password'

# Authorized lab/own-domain only: generate candidates and check breach exposure.
def generate_and_check(names, domain, hibp_key):
    candidates = [f"{f.lower()}.{l.lower()}@{domain}" for f, l in names]
    for addr in candidates:
        hits = hibp_account(addr, hibp_key)   # from Section 6
        flag = "EXPOSED" if hits else "clean"
        print(f"{addr:35} {flag}")
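The final step of the flow, ranking accounts by exposure, can be sketched as a scoring function. Everything here is an assumption made for illustration: the `Account` fields, the weights in `risk_score`, and the sample accounts. In a self-assessment the MFA flag would come from your own identity provider, and the output is most useful as a remediation order (which resets and MFA enrolments to do first).

```python
from dataclasses import dataclass

@dataclass
class Account:
    email: str
    privileged: bool        # e.g. IT, finance, or HR role from the roster
    password_exposed: bool  # breach record includes the "Passwords" data class
    mfa_enabled: bool       # taken from the organization's own identity provider

def risk_score(account):
    # Hypothetical weights, chosen only to make the ordering explicit.
    score = 0
    if account.password_exposed:
        score += 3
    if account.privileged:
        score += 2
    if not account.mfa_enabled:
        score += 2
    return score

def rank(accounts):
    """Highest-risk accounts first."""
    return sorted(accounts, key=risk_score, reverse=True)

accounts = [
    Account("john.roe@example-corp.com", privileged=False, password_exposed=False, mfa_enabled=True),
    Account("sam.lee@example-corp.com", privileged=False, password_exposed=True, mfa_enabled=True),
    Account("jane.doe@example-corp.com", privileged=True, password_exposed=True, mfa_enabled=False),
]
for a in rank(accounts):
    print(risk_score(a), a.email)
```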

Deliver the result as a structured artefact, not raw tool dumps:

# OSINT Targeting Report — example-corp.com (AUTHORIZED ENGAGEMENT)

## Employees Found
- Jane Doe — Security Engineer (LinkedIn)
- John Roe — Cloud Administrator (LinkedIn)

## Email Format
- Confirmed pattern: firstname.lastname@example-corp.com (Hunter.io, confidence 95)

## Breach Hits
- jane.doe@... — Breach2021 (Passwords, Emails) — HIGH
- john.roe@...  — no exposure — clean

## Credential Risk Ranking
1. jane.doe@... — admin role + breach password + portal vpn.example-corp.com

## Suggested Next Steps
- Validate MFA status on exposed accounts (authorized phase 2 only)

Figure (sequential attack-chain diagram): LinkedIn people data flows through email format inference, breach credential lookups, and subdomain discovery to a credential-spray attempt against discovered authentication portals. The chain converts public identity data into ranked spray candidates against real portals.

10. Common Attacker Techniques

Technique	Description
Employee-name harvesting	Build rosters from LinkedIn and search engines to derive emails and lures.
Email-format inference	Extrapolate one confirmed format across the entire roster.
Breach-credential mining	Cross-reference addresses against HIBP/DeHashed for reusable passwords.
Paste-site monitoring	Scrape Pastebin/Gist leaks before takedown.
GitHub secret hunting	Search public repos and commit history for `.env` files, API keys, and DB passwords.
CT-log enumeration	Discover forgotten subdomains and shadow IT portals.

Git history is decisive: a secret deleted last month still lives in the commit log unless the repo was scrubbed with git filter-repo — most never are.
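To audit your own repositories for this, the full patch history (for example the output of `git log -p --all`) can be scanned line by line. The sketch below is a deliberately small illustration: `SECRET_PATTERNS` and `scan_text` are hypothetical names, the three regexes are assumptions that will miss many secret formats, and the sample uses AWS's documented example key ID.

```python
import re

# Illustrative patterns only; dedicated scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_header": re.compile(r"-----BEGIN (?:RSA |EC |OPENSSH )?PRIVATE KEY-----"),
    "password_assignment": re.compile(r"(?i)(?:password|passwd|secret)\s*[=:]\s*\S+"),
}

def scan_text(text):
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits

history = "DB_HOST=localhost\nDB_PASSWORD=hunter2\nAWS_KEY=AKIAIOSFODNN7EXAMPLE\n"
print(scan_text(history))
```

Any hit in history means the secret should be rotated, since rewriting history does not recall copies that were already cloned.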

11. Defensive Strategies & Detection

Inbound passive OSINT is largely invisible — there is no packet to log. Defense is therefore exposure reduction plus detecting the downstream use of harvested data and any internal authorized tooling.

What is observable:

Sysmon Event ID 22 (DNSEvent) — internal hosts resolving OSINT API domains (hunter.io, haveibeenpwned.com). Field: QueryName. Relevant to authorized red-team logging, not inbound recon.
Sysmon Event ID 3 (NetworkConnect) — outbound connections to Shodan/Censys/harvesting endpoints. Fields: DestinationIp, DestinationPort, Image.
WAF / CDN logs — high-rate hits on /staff, /team, /about, /sitemap.xml and scraper user-agents.
Certificate Transparency monitoring — alerts when unexpected certs/subdomains appear (shadow IT or forgotten assets).
GitHub secret scanning — Advanced Security flags committed credentials before adversaries find them.

Downstream credential abuse is where SIEM earns its keep. Watch domain controllers for Event ID 4625 failures spread across many accounts from one source IP — SubStatus 0xC000006A (wrong password) and 0xC0000064 (bad username) signal password spraying. In Entra ID, alert on a successful sign-in from a new geolocation immediately after a domain appears in a breach.
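The spray pattern described above (one source, many accounts, a short window) can be sketched as a sliding-window check over parsed 4625 events. This is a minimal illustration, not a SIEM rule: `detect_spray` is a hypothetical helper, the dictionary keys stand in for the event's IpAddress, TargetUserName, and SubStatus fields, and the thresholds are assumptions to tune.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SPRAY_SUBSTATUS = {"0xc000006a", "0xc0000064"}  # wrong password, unknown username

def detect_spray(events, min_accounts=10, window=timedelta(minutes=10)):
    """events: dicts with 'time' (datetime), 'ip', 'account', 'substatus'.
    Return source IPs that failed against >= min_accounts distinct accounts within the window."""
    by_ip = defaultdict(list)
    for e in events:
        if e["substatus"].lower() in SPRAY_SUBSTATUS:
            by_ip[e["ip"]].append((e["time"], e["account"]))

    flagged = set()
    for ip, rows in by_ip.items():
        rows.sort()
        start = 0
        for end in range(len(rows)):
            # Shrink the window from the left until it spans at most `window`.
            while rows[end][0] - rows[start][0] > window:
                start += 1
            if len({account for _, account in rows[start:end + 1]}) >= min_accounts:
                flagged.add(ip)
                break
    return flagged

t0 = datetime(2024, 1, 1, 9, 0, 0)
events = [{"time": t0 + timedelta(seconds=10 * i), "ip": "203.0.113.5",
           "account": f"user{i}", "substatus": "0xC000006A"} for i in range(12)]
events += [{"time": t0 + timedelta(seconds=i), "ip": "198.51.100.7",
            "account": "jane.doe", "substatus": "0xC000006A"} for i in range(3)]
print(detect_spray(events))
```

Counting distinct accounts rather than raw failures is what separates a spray from one user mistyping a password repeatedly.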

Sigma rule (internal OSINT tool execution in a lab/red-team environment):

title: Internal OSINT Recon Tool Execution
logsource:
  product: windows
  service: sysmon
detection:
  selection:
    EventID: 1                 # Sysmon ProcessCreate
    Image|endswith:            # Image is the interpreter binary; the script name shows up in CommandLine
      - '\python.exe'
      - '\python3.exe'
    CommandLine|contains:
      - 'theHarvester'
      - 'hunter.io'
      - 'haveibeenpwned'
  condition: selection
level: medium

This targets authorized internal tooling; it cannot see external recon performed against you.

Hardening priorities:

Mitigation	Description
Employee profile hygiene	Train staff not to list VPN/EDR/tooling names in LinkedIn bios.
Corporate email discipline	Forbid work email for personal SaaS — breaches of those services leak corporate credentials.
DMARC `p=reject`	Stops harvested addresses being trivially spoofed in follow-on phishing.
MFA everywhere	Neutralizes breached passwords; prioritize internet-facing admin panels.
GitHub secret scanning + pre-commit hooks	Block secrets at commit; audit history with `truffleHog` / `git-secrets`.
Periodic HIBP domain search	Verified-owner API run on a schedule; force resets on exposed accounts.

Blue teams should run this entire playbook against themselves — to find leaked credentials, spot typosquatting, identify unauthorized assets, and measure supplier exposure.
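As one concrete self-check from the table above, the DMARC policy can be verified from the TXT record published at `_dmarc.<domain>` (retrievable with `dig TXT _dmarc.example-corp.com +short`). The sketch below only parses a record string you have already fetched; `parse_dmarc` and `dmarc_enforced` are hypothetical helper names, and treating `pct` below 100 as not enforced is a conservative assumption.

```python
def parse_dmarc(txt_record):
    """Split a DMARC TXT value such as 'v=DMARC1; p=reject' into a tag dictionary."""
    tags = {}
    for part in txt_record.strip().strip('"').split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags

def dmarc_enforced(txt_record):
    """True only when the policy is reject and applies to all mail (pct defaults to 100)."""
    tags = parse_dmarc(txt_record)
    try:
        pct = int(tags.get("pct", "100"))
    except ValueError:
        return False
    return (tags.get("v", "").upper() == "DMARC1"
            and tags.get("p", "").lower() == "reject"
            and pct == 100)

print(dmarc_enforced("v=DMARC1; p=reject; rua=mailto:dmarc@example-corp.com"))
print(dmarc_enforced("v=DMARC1; p=none"))
```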

Figure (hierarchy diagram): defensive strategy splits into three branches: exposure reduction, downstream detection via SIEM event IDs and Entra alerts, and hardening controls including universal MFA and DMARC enforcement. Because inbound OSINT leaves no logs, defenders focus on shrinking exposure and detecting the downstream credential abuse it enables.

12. Tools for OSINT Reconnaissance

Tool	Description	Link
theHarvester	Multi-source email/subdomain/IP harvesting	github.com/laramies/theHarvester
Hunter.io	Email discovery + format detection API	hunter.io
Have I Been Pwned	Breach and password-exposure API (v3)	haveibeenpwned.com
DeHashed	Credential investigation (passwords, usernames)	dehashed.com
IntelligenceX	Paste-site and leak indexing	intelx.io
crt.sh	Certificate Transparency log search	crt.sh
truffleHog	Git history secret scanning	github.com/trufflesecurity/trufflehog

13. MITRE ATT&CK Mapping

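Most techniques below sit under Reconnaissance (TA0043). The two downstream rows do not: Compromise Accounts (T1586) belongs to Resource Development (TA0042), and Valid Accounts (T1078) to Initial Access (TA0001), among other tactics.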

Technique	MITRE ID	Detection
Gather Victim Identity Information	`T1589`	Largely undetectable inbound; reduce exposure.
…Credentials	`T1589.001`	HIBP/DeHashed exposure monitoring; force resets.
…Email Addresses	`T1589.002`	Review Hunter.io/theHarvester output for your own domain.
…Employee Names	`T1589.003`	Profile-hygiene training; LinkedIn monitoring.
Search Open Websites/Domains	`T1593`	WAF/CDN scraper detection.
…Social Media	`T1593.001`	Brand/impersonation monitoring.
…Search Engines	`T1593.002`	Dork-leak audits of own indexed content.
…Code Repositories	`T1593.003`	GitHub secret scanning.
Gather Victim Org Information	`T1591`	Public-footprint review.
Search Open Technical Databases	`T1596`	CT-log monitoring (crt.sh, Censys).
Compromise Accounts	`T1586`	Anomalous sign-in correlation.
Valid Accounts	`T1078`	MFA enforcement; 4625 spray detection (shifts to TA0001).

Summary

OSINT reconnaissance converts public data — LinkedIn profiles, breach dumps, and CT logs — into a targeting package of named employees with reusable credentials, all without sending a packet to the target.
Employee names drive email-format inference; Hunter.io’s pattern field and theHarvester’s multi-source output extrapolate addresses across an entire org.
HIBP confirms exposure (use the keyless k-Anonymity /range/ endpoint for safe password checks); DeHashed and paste sites supply the actual reusable passwords.
The attack path chains people → emails → breach credentials → discovered portals → MFA-less spray candidates — mapped to ATT&CK T1589, T1593, and downstream T1586/T1078.
Defenders detect the downstream abuse — Event ID 4625 spray patterns, anomalous Entra sign-ins — and shrink exposure with DMARC p=reject, universal MFA, GitHub secret scanning, and authorized HIBP domain searches.

Active OSINT: DNS, Certificate Transparency, and Subdomain Enumeration

Objective: Understand how an authorized red teamer methodically maps an organization’s external DNS attack surface — from zero-noise passive Certificate Transparency mining to active brute-force resolution — and how defenders detect each technique at the protocol, log, and SIEM level.

1. Why Subdomain Enumeration Matters: The Attack Surface Problem

An organization’s externally reachable footprint is rarely the handful of hostnames it advertises. Missed subdomains mean missed attack surface: forgotten admin panels, staging environments, internal APIs accidentally exposed, and legacy services that were never meant to be public. Each undiscovered host is a node the defender is not monitoring and the operator can pivot through.

Enumeration is a multi-source intelligence-gathering process, not a single tool run. A mature workflow combines passive aggregation, public technical databases, and active resolution to build the most complete asset inventory possible. The skill is sequencing those techniques from quietest to loudest so the operator controls exactly how much signal they generate.

All techniques below fall under MITRE’s Reconnaissance tactic (TA0043). Run them only inside an authorized scope.

2. DNS Primer for Red Teamers: Records, Zones, and Resolvers

DNS resolution flows through a chain: a recursive resolver queries the root, then the TLD nameservers, then the authoritative NS for the zone. The authoritative server holds the records that matter to recon. Each record type leaks distinct intelligence.

Record	Function
`A` / `AAAA`	IPv4 / IPv6 address mapping for a hostname
`CNAME`	Canonical name alias — critical for subdomain takeover identification
`MX`	Mail exchange — reveals mail infrastructure and phishing pivot targets
`NS`	Authoritative nameserver — identifies zone ownership and AXFR targets
`TXT`	Freeform text — SPF (`v=spf1`), DKIM, DMARC (`v=DMARC1`), verification tokens often expose third-party services
`SOA`	Start of Authority — primary NS, contact email, serial, refresh, retry, expire, minimum TTL
`PTR`	Reverse DNS — maps IP → hostname, used in reverse-range sweeps
`SRV`	Service locator — reveals app-layer services (`_ldap._tcp`, `_sip._tcp`)

Enumerate record types directly with dig:

dig A target.com +short
dig NS target.com +short
dig MX target.com +short
dig TXT target.com +short          # SPF/DMARC reveal third-party SaaS
dig SOA @ns1.target.com target.com

TXT recon is high-value: SPF includes (include:_spf.salesforce.com) and verification tokens fingerprint exactly which cloud and SaaS providers an organization uses.
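Extracting the `include:` terms from an SPF record is a small parsing task. The sketch below works on a record string you have already retrieved with dig; `spf_includes` and `fingerprint` are hypothetical helpers, and the `PROVIDER_HINTS` map is a short illustrative assumption rather than a complete provider list.

```python
# Hypothetical mapping from SPF include domains to providers; extend as needed.
PROVIDER_HINTS = {
    "_spf.salesforce.com": "Salesforce",
    "_spf.google.com": "Google Workspace",
    "spf.protection.outlook.com": "Microsoft 365",
}

def spf_includes(txt_record):
    """Return the include: domains from an SPF TXT record value, in order."""
    record = txt_record.strip().strip('"')
    if not record.lower().startswith("v=spf1"):
        return []
    return [term.split(":", 1)[1].lower()
            for term in record.split()
            if term.lower().startswith("include:")]

def fingerprint(txt_record):
    """Map each include domain to a known provider name, or 'unknown'."""
    return {d: PROVIDER_HINTS.get(d, "unknown") for d in spf_includes(txt_record)}

spf = '"v=spf1 include:_spf.google.com include:_spf.salesforce.com include:mail.example.net -all"'
print(fingerprint(spf))
```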

3. Zone Transfer Attacks (AXFR/IXFR): When DNS Gives It All Away

A zone transfer exists so a secondary nameserver can replicate a zone from the primary. A full transfer is DNS query type AXFR; an incremental transfer is IXFR. If an authoritative server answers an AXFR from an unauthorized client, it dumps the entire zone — every record, in one transaction.

dig axfr @ns1.target.com target.com

A correctly hardened server returns Transfer failed. or a refusal. A misconfigured one returns the full record set. dnsrecon automates the test across all discovered nameservers:

dnsrecon -d target.com -t axfr

Most modern configurations restrict AXFR to whitelisted secondary IPs, so success is rare — but the cost of the check is one query, and a hit collapses the entire enumeration phase into a single response.

4. Certificate Transparency: The Unintentional Subdomain Registry

Certificate Transparency (CT), defined in RFC 6962, is an open framework of public append-only logs recording certificates issued by publicly trusted CAs. Chrome and Safari reject publicly trusted certificates that do not carry Signed Certificate Timestamps (SCTs) from multiple CT logs, so in practice every such certificate is logged. The side effect: a near-comprehensive, searchable record of the hostnames certificates have covered (a wildcard certificate reveals only the wildcard, not the names behind it).

Two fields carry the intelligence: the Common Name (CN) and the Subject Alternative Names (SANs). SANs are the modern standard for declaring which domains a certificate covers, and a single certificate can list dozens of subdomains. crt.sh exposes both through its name_value field.

Query the JSON API with a % wildcard prefix and extract uniques:

import requests

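def crtsh_subdomains(domain):
    # params= lets requests URL-encode the % wildcard (as %25) correctly.
    r = requests.get("https://crt.sh/", params={"q": f"%.{domain}", "output": "json"}, timeout=30)
    r.raise_for_status()                  # fail loudly on HTTP errors instead of parsing an error page
    subs = set()
    for entry in r.json():
        # name_value holds the CN and SANs, newline-separated.
        for name in entry["name_value"].splitlines():
            subs.add(name.removeprefix("*.").lower())   # strip wildcard prefix
    return sorted(subs)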

for s in crtsh_subdomains("target.com"):
    print(s)

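For large zones you can query crt.sh's public PostgreSQL interface directly, which is typically faster than the web frontend for big result sets. It is a shared public service with its own limits, so keep queries narrow: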

import psycopg2

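conn = psycopg2.connect(host="crt.sh", port=5432, dbname="certwatch", user="guest")
conn.autocommit = True   # assumption: the guest endpoint is commonly reported to reject explicit transactions
cur = conn.cursor()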
cur.execute("""
    SELECT ci.NAME_VALUE FROM certificate_identity ci
    WHERE ci.NAME_TYPE = 'dNSName'
    AND reverse(lower(ci.NAME_VALUE)) LIKE reverse(lower(%s));
""", ("%.target.com",))

subs = {row[0].lstrip("*.").lower() for row in cur.fetchall()}
print("\n".join(sorted(subs)))

NAME_TYPE = 'dNSName' filters to DNS SANs only. Other CT aggregators include Censys (search.censys.io), Facebook CT (developers.facebook.com/tools/ct/), and the Google Transparency Report. CT logs ingest within minutes of issuance; crt.sh and Certspotter typically surface new certificates within a few hours.
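On the defensive side, the same CT data becomes useful once it is compared against an asset inventory. The sketch below assumes you already hold a list of names from a CT query and a list of known hosts; `new_ct_names` is a hypothetical helper and the hostnames are invented.

```python
def new_ct_names(observed, inventory, base_domain):
    """Return in-scope names seen in CT data but missing from the asset inventory."""
    base = base_domain.lower().rstrip(".")
    known = {n.lower().rstrip(".") for n in inventory}
    unexpected = set()
    for name in observed:
        name = name.lower().rstrip(".")
        if name.startswith("*."):
            name = name[2:]                      # treat *.dev.target.com as dev.target.com
        # The leading dot keeps look-alike domains (eviltarget.com) out of scope.
        in_scope = name == base or name.endswith("." + base)
        if in_scope and name not in known:
            unexpected.add(name)
    return sorted(unexpected)

observed = ["www.target.com", "*.dev.target.com", "old-admin.target.com", "eviltarget.com"]
inventory = ["target.com", "www.target.com"]
print(new_ct_names(observed, inventory, "target.com"))
```

Names that appear here are either shadow IT, forgotten assets, or certificates nobody in the organization requested, and each deserves a follow-up.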

Figure (flow diagram): a certificate request travels from an organization through a CA into a public CT log, is indexed by aggregators like crt.sh, and is queried both by red teamers harvesting subdomains and by defenders receiving Certspotter alerts. CT logs are public by design: every certificate issuance becomes a permanent, searchable record that attackers mine for subdomain discovery and defenders monitor for unauthorized issuance.

5. WHOIS, RDAP, and ASN Enumeration: Mapping the IP Estate

WHOIS data for domain names is held by registries and registrars; WHOIS data for IP addresses and AS numbers is held by the Regional Internet Registries (RIRs) that allocate them. RDAP (Registration Data Access Protocol, RFC 7480 and its companion RFCs) is the modern JSON-based successor. For a domain, both reveal registrar, creation/expiry dates, nameservers, and, where it is not redacted, the registrant organization.

whois target.com                  # registrar, NS, creation date, registrant org
curl -s https://rdap.verisign.com/com/v1/domain/target.com | jq '.nameservers, .entities'

The entities and nameservers arrays in RDAP output map cleanly to the org and infrastructure you correlate elsewhere. From the registrant org you pivot to ASN enumeration via RIPE/ARIN to discover owned IP blocks, then run reverse PTR sweeps across those ranges to recover hostnames not present in any forward record.
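A reverse sweep starts by turning each address in a block into its reverse-lookup name. The standard library's `ipaddress` module does this directly; the sketch below uses a documentation range (192.0.2.0/30) as a stand-in for an owned block, and `ptr_names` is a hypothetical helper name.

```python
import ipaddress

def ptr_names(cidr):
    """Yield (ip, reverse-lookup name) for every usable host address in a CIDR block."""
    for ip in ipaddress.ip_network(cidr).hosts():
        yield str(ip), ip.reverse_pointer

for ip, name in ptr_names("192.0.2.0/30"):
    print(ip, name)
```

Each generated name is then queried for a PTR record (for example with `dig -x <ip>`); the sketch itself performs no DNS lookups.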

6. Passive DNS Aggregation: Intelligence Without Touching the Target

Passive DNS datasets store historical resolution data harvested by third parties. Querying them yields subdomains without your operator ever touching the target’s infrastructure — zero target-side signal.

Tool	Role
`subfinder`	Passive OSINT aggregator across CT logs, passive DNS, APIs
`amass` (`enum`)	Deep multi-source enumeration; passive mode plus ASN enumeration
`theHarvester`	OSINT gathering for emails, names, subdomains, IPs, URLs from public sources
`bbot`	Recon framework that correlates infrastructure relationships, not just names

Primary data sources include PassiveTotal/RiskIQ, VirusTotal, SecurityTrails, Shodan, and Censys. Most require API keys configured in the tool’s provider file.

subfinder -d target.com -all -o subs_passive.txt
amass enum -passive -d target.com -o subs_amass.txt
theHarvester -d target.com -b crtsh,bing,duckduckgo

amass takes more configuration than the other tools (data-source API keys in particular) but returns deeper results once set up; its passive mode remains a valid quiet alternative to active collection.
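Because each tool writes its own list, the outputs usually need merging, normalizing, and scope-filtering before the next stage. A minimal sketch, assuming one hostname per line in each tool's output; `merge_subdomains` is a hypothetical helper and the sample names are invented.

```python
def merge_subdomains(sources, base_domain):
    """sources: iterable of hostname lists, one per tool.
    Return sorted, de-duplicated names that fall under base_domain."""
    base = base_domain.lower()
    merged = set()
    for names in sources:
        for raw in names:
            name = raw.strip().lower().rstrip(".")
            if name == base or name.endswith("." + base):
                merged.add(name)
    return sorted(merged)

subfinder_out = ["www.target.com", "API.target.com", "cdn.othercorp.net"]
amass_out = ["api.target.com.", "mail.target.com", ""]
print(merge_subdomains([subfinder_out, amass_out], "target.com"))

# With real output files, each source would be open("subs_passive.txt").read().splitlines().
```

The scope filter matters as much as the de-duplication: aggregators routinely return third-party names that are outside the signed scope.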

7. Active DNS Brute-Force: Wordlists, Resolvers, and Wildcard DNS

Active techniques directly interact with the target’s DNS infrastructure. The core mechanic: iterate a wordlist, prepend each word as a label (dev.target.com), issue an A/AAAA query, and record responses.

Tool	Primary Mechanic
`massdns`	High-throughput async resolver via custom resolver list
`puredns`	`massdns` wrapper with wildcard detection and deduplication
`shuffledns`	`massdns` brute-forcer with valid-resolver shuffling
`dnsx`	DNS probing and record-type enumeration
`gobuster dns`	Wordlist DNS brute force
`dnsenum`	Zone transfer attempts plus brute-force

The critical hazard is wildcard DNS: if *.target.com resolves to a catch-all IP, every guess returns a positive. Tools must detect and filter this. puredns handles wildcard detection and deduplication natively:

puredns bruteforce wordlist.txt target.com \
  -r resolvers.txt -w resolved.txt

Resolver selection matters — use a curated list of validated public resolvers (e.g., trickest/resolvers) so queries distribute and stay accurate. Wordlists drive coverage: SecLists dns-Jhaddix.txt and Commonspeak2 are standard. Distributing queries across many resolvers also smears per-source detection thresholds.
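The general idea behind wildcard filtering can be sketched as follows: resolve a few random labels that should not exist, and treat whatever they return as the catch-all answer. This illustrates the concept only and is not a description of how puredns implements it; `detect_wildcard_ips` and `filter_wildcards` are hypothetical helpers, and `resolve` is an assumed callable you supply that returns a set of IPs (empty for NXDOMAIN).

```python
import secrets

def detect_wildcard_ips(resolve, domain, probes=3):
    """Resolve random labels; any IPs they return come from a wildcard record."""
    wildcard_ips = set()
    for _ in range(probes):
        label = secrets.token_hex(8)   # 16 random hex characters, very unlikely to be a real host
        wildcard_ips |= resolve(f"{label}.{domain}")
    return wildcard_ips

def filter_wildcards(results, wildcard_ips):
    """Keep only names whose answers include at least one non-wildcard IP."""
    return {name: ips for name, ips in results.items() if ips - wildcard_ips}

def fake_resolve(name):
    # Stand-in resolver for the example: one real host, everything else hits the catch-all.
    real = {"dev.target.com": {"198.51.100.10"}}
    return real.get(name, {"203.0.113.99"})

wild = detect_wildcard_ips(fake_resolve, "target.com")
results = {"dev.target.com": {"198.51.100.10"}, "nope.target.com": {"203.0.113.99"}}
print(filter_wildcards(results, wild))
```

One known limitation of this simple approach: a genuine host that shares the wildcard's IP is filtered out too, so borderline names may need a second check at the HTTP layer.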

8. Permutation and Mutation: Finding What Brute-Force Misses

Brute-force only finds words in your list. Permutation generates variants of already-discovered subdomains — taking api and producing api-dev, api-v2, api-staging, internal-api. altdns and dnsgen perform this mutation.

PATTERNS = ["dev", "staging", "prod", "v2", "internal", "test"]

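def mutate(known_subs, base):
    out = set()
    for host in known_subs:
        # removesuffix trims only the trailing base domain; str.replace would
        # also rewrite the same text if it appeared elsewhere in the name.
        label = host.removesuffix(f".{base}")
        if label == host:                    # not under base (or the apex itself): skip
            continue
        for p in PATTERNS:
            out.add(f"{label}-{p}.{base}")   # api -> api-dev.target.com
            out.add(f"{p}-{label}.{base}")   # api -> dev-api.target.com
    return out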

# feed mutations back into dnsx for resolution

Pipe the generated candidates straight into dnsx to resolve only the survivors. Permutation routinely surfaces staging hosts that follow internal naming conventions no public wordlist contains.

9. Chaining It Together: A Full Enumeration Workflow

The value is in the pipeline. Aggregate names, resolve them, probe live services, then validate. Each stage adds a column of intelligence:

subfinder -d target.com -o subs.txt                       # passive aggregation
dnsx -l subs.txt -a -o resolved.txt                       # keep only names that resolve
httpx -l resolved.txt -title -status-code -tech-detect \
      -o live.txt                                          # live HTTP fingerprint

subfinder supplies the candidate set, dnsx discards names that do not resolve, and httpx confirms which hosts serve HTTP, their titles, status codes, and detected technologies. Adding -resp to dnsx prints the answers beside each name, which is useful for review but produces lines httpx cannot read as a host list, so write that output to a separate file. Downstream, aquatone or gowitness screenshot each live host for triage at scale, and subjack checks for takeover. CT logs and passive DNS feed the top of the funnel; active brute-force and permutation widen it; HTTP probing and screenshotting prioritize what to investigate.

Figure (flow diagram): the full subdomain enumeration pipeline runs from passive CT logs and passive DNS through active brute-force and permutation, into DNS resolution, HTTP probing, and final triage and takeover checks. The pipeline sequences quiet passive sources first, then progressively louder active techniques, before filtering to live hosts for prioritized investigation.

10. Subdomain Takeover: From Dangling CNAME to Claimed Asset

Enumeration frequently uncovers dangling CNAMEs — a subdomain whose CNAME points to a deprovisioned cloud service (GitHub Pages, Heroku, AWS S3, Azure, Fastly). If the operator can re-register that external resource, they serve content from the victim’s trusted subdomain. This is the primary takeover vector.

subjack fingerprints CNAME chains against known-vulnerable service responses:

subjack -w resolved.txt -t 100 -timeout 30 \
        -c fingerprints.json -v

A positive result means a subdomain’s CNAME chain terminates at an unclaimed external resource. In an authorized engagement, validate the finding against the can-i-take-over-xyz reference list and report it through responsible disclosure — do not claim the resource unless the rules of engagement explicitly permit proof-of-concept takeover.
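For a defender auditing their own zone, a first-pass filter for dangling CNAMEs needs only the CNAME target and whether that target still resolves. The sketch below is a simplification under stated assumptions: `dangling_candidates` is a hypothetical helper, the suffix list is a small illustrative sample, and non-resolution is only one takeover signal (some services keep resolving and instead return a "no such app" page).

```python
# Illustrative suffixes of third-party hosting services; not an authoritative list.
THIRD_PARTY_SUFFIXES = (".github.io", ".herokuapp.com", ".s3.amazonaws.com", ".azurewebsites.net")

def dangling_candidates(records):
    """records: iterable of (subdomain, cname_target, target_resolves).
    Flag subdomains whose CNAME points at a third-party service that no longer resolves."""
    flagged = []
    for subdomain, target, target_resolves in records:
        target = target.lower().rstrip(".")
        if target.endswith(THIRD_PARTY_SUFFIXES) and not target_resolves:
            flagged.append((subdomain, target))
    return flagged

records = [  # invented zone data for illustration
    ("docs.target.com", "target-docs.github.io.", True),
    ("promo.target.com", "old-promo.herokuapp.com", False),
    ("mail.target.com", "mx.target.com", False),
]
print(dangling_candidates(records))
```

The remediation for anything flagged is to delete the DNS record, which is cheaper than re-claiming the external resource.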

11. Common Attacker Techniques

Technique	Description
Zone transfer (AXFR)	Dump an entire zone from a misconfigured authoritative NS in one query
CT log mining	Harvest CN/SAN fields to recover the full historical subdomain namespace
Passive DNS query	Recover subdomains from third-party resolution history with zero target contact
DNS brute-force	Resolve a wordlist of guessed labels against the target’s authoritative nameservers, usually via public resolvers
Permutation mutation	Generate naming variants of known hosts to find staging/internal services
Reverse PTR sweep	Map owned ASN/IP blocks back to hostnames
Subdomain takeover	Claim a deprovisioned cloud resource behind a dangling CNAME

The progression matters operationally: CT logs, WHOIS/RDAP, and passive DNS generate zero target-side signal, while AXFR, brute-force, and HTTP probing are increasingly noisy and detectable.

Figure (hierarchy diagram): subdomain reconnaissance techniques split into passive zero-signal methods (CT log mining, WHOIS/RDAP, passive DNS) and active detectable methods (AXFR, DNS brute-force, HTTP probing), each with MITRE ATT&CK technique IDs. Passive techniques leave no trace on target infrastructure, while active techniques generate NXDomain spikes, AXFR refusals, and HTTP access-log entries that defenders can detect.

12. Defensive Strategies & Detection

CT mining, WHOIS/RDAP, and passive DNS queries occur entirely outside the target’s infrastructure and generate no SIEM-visible events at collection time. Detection therefore concentrates on the active phases.

Activity	Signal Generated
AXFR attempt	Single large TCP/53 transaction to authoritative NS; refusals still log
DNS brute-force	High-volume `NXDomain` responses from one source IP in a short window
CT / WHOIS / passive DNS	None — third-party or public registry
Active resolution (`massdns`)	High `NXDomain` rate; resolver-distributed queries may evade per-source detection
HTTP probing (`httpx`)	Web server access logs; WAF hits on rapid host sweeps

Sysmon and ETW

Sysmon Event ID 22 (DNSEvent) logs DNS queries made through the Windows DnsQuery_* API calls in dnsapi.dll, supported on Windows 8.1 and above via ETW. This catches recon tooling run from a compromised Windows host, recording QueryName, QueryStatus, and QueryResults. The underlying provider is Microsoft-Windows-DNS-Client (GUID {1C95126E-7EEA-49A9-A3FE-A378B03DDB4D} — verify against current Windows documentation).

Network and Resolver-Side Detection

Flag source IPs generating more than N NXDomain responses per minute; brute-force tools generate hundreds per second.
Authoritative server query logs capture inbound queries, including refused AXFR attempts; DNS Response Policy Zones (RPZ) on recursive resolvers can additionally log or block lookups that match policy.
Restrict AXFR with allow-transfer (BIND) or transfer ACLs (Windows DNS Server) to whitelisted secondaries only.
Enable Response Rate Limiting (RRL) to slow brute-force resolution.
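
The per-source NXDomain threshold above can be prototyped against exported resolver logs. The sketch below is one possible approach under stated assumptions: the `(timestamp, source_ip, rcode)` tuple layout and the `noisy_sources` name are hypothetical, and real logs would need parsing into that shape first.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, Iterable, Set, Tuple


def noisy_sources(events: Iterable[Tuple[float, str, str]],
                  threshold: int = 200,
                  window_s: float = 60.0) -> Set[str]:
    """Flag source IPs that exceed `threshold` NXDOMAIN answers in any window.

    `events` must be (unix_time, source_ip, rcode) tuples sorted by time.
    """
    recent: Dict[str, Deque[float]] = defaultdict(deque)
    flagged: Set[str] = set()
    for ts, src, rcode in events:
        if rcode != "NXDOMAIN":
            continue
        q = recent[src]
        q.append(ts)
        # Drop timestamps that have fallen out of the sliding window.
        while q and ts - q[0] >= window_s:
            q.popleft()
        if len(q) > threshold:
            flagged.add(src)
    return flagged
```

The default of 200 per minute mirrors the Sigma rule in this section; tune both values against a baseline of normal resolver traffic before alerting on them.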

Sigma Rule (DNS brute-force via Sysmon EID 22)

title: DNS Subdomain Brute-Force (High NXDomain Rate)
logsource:
  product: windows
  category: dns_query          # maps to Sysmon EventID 22
detection:
  selection:
    QueryStatus: '9003'        # DNS_ERROR_RCODE_NAME_ERROR, i.e. NXDOMAIN
  timeframe: 1m
  condition: selection | count() by Image > 200   # EID 22 has no source-IP field; group per process
fields:
  - QueryName
  - QueryStatus
  - QueryResults
  - Image
level: medium

CT Log Monitoring (Defensive)

Defenders can flip CT against the attacker: subscribe to Certspotter (SSLMate), crt.sh alerts, or the Facebook CT monitoring API to receive near-real-time alerts on certificates newly issued for your domain tree. Combined with regular self-enumeration to detect unauthorized subdomain creation, dangling-CNAME audits, and accurate published SPF/DMARC/DKIM TXT records, this closes most of the gaps recon exploits.
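
Self-enumeration becomes actionable once CT-derived names are diffed against an asset inventory. The following is a rough sketch: `unexpected_hosts` is a hypothetical helper, and both inputs are assumed to be plain lists of hostnames exported from your CT monitor and your inventory.

```python
from typing import Iterable, Set


def unexpected_hosts(ct_names: Iterable[str], inventory: Iterable[str]) -> Set[str]:
    """Hostnames seen in CT data that are missing from the asset inventory."""

    def norm(name: str) -> str:
        name = name.strip().lower().rstrip(".")
        # Treat a wildcard entry as its parent name.
        return name[2:] if name.startswith("*.") else name

    known = {norm(h) for h in inventory}
    return {norm(n) for n in ct_names if n.strip()} - known
```

Anything the function returns is either an unauthorized subdomain or an inventory gap, and both are worth a ticket.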

13. Tools for Subdomain Enumeration Analysis

Tool	Description	Link
`dig` / `dnsrecon`	Record enumeration and AXFR testing	—
`crt.sh`	Certificate Transparency search and JSON/PostgreSQL API	`crt.sh`
`subfinder`	Passive multi-source subdomain aggregation	`github.com`
`amass`	Deep enumeration plus ASN mapping	`github.com`
`puredns` / `massdns`	Wildcard-aware high-throughput brute-force	`github.com`
`dnsx` / `httpx`	Resolution and live HTTP probing	`github.com`
`theHarvester`	OSINT email/host/IP gathering	`github.com`
`subjack`	Subdomain takeover fingerprinting	`github.com`
`Censys` / `Shodan`	Internet-wide scan and certificate databases	`search.censys.io`
`Certspotter`	Defensive CT certificate monitoring	`sslmate.com`

14. MITRE ATT&CK Mapping

Technique	MITRE ID	Detection
Active Scanning	`T1595`	High `NXDomain` rate; resolver and firewall logs
Active Scanning: Scanning IP Blocks	`T1595.001`	Reverse PTR sweeps across ASN ranges
Gather Victim Network Information	`T1590`	Umbrella — DNS/network infrastructure gathering
Gather Victim Network Information: DNS	`T1590.002`	AXFR attempts logged at authoritative NS
Search Open Technical Databases	`T1596`	No target-side signal; out-of-band collection
Open Technical Databases: DNS/Passive DNS	`T1596.001`	Third-party passive DNS — no local visibility
Open Technical Databases: WHOIS	`T1596.002`	Public registry query — no local visibility
Open Technical Databases: Scan Databases	`T1596.005`	Shodan / Censys mining (CT log mining maps to `T1596.003`, Digital Certificates); verify against live ATT&CK page

All map to Reconnaissance (TA0043). The defining split: T1595 is active and detectable, while the T1596 family is passive and invisible to the target at collection time.

Summary

External DNS attack surface is far larger than what an organization advertises, and missed subdomains are missed attack surface.
DNS records, AXFR misconfigurations, and Certificate Transparency CN/SAN fields each leak distinct, attack-relevant intelligence about hosts and infrastructure.
Passive sources (CT logs, WHOIS/RDAP, passive DNS) generate zero target-side signal; active brute-force and HTTP probing are detectable through high NXDomain rates and access logs.
Detect active recon via Sysmon Event ID 22 DNS query logging, resolver NXDomain rate thresholds, and RPZ/AXFR refusal logs.
Defend by restricting AXFR, removing dangling CNAMEs, rate-limiting resolvers, and monitoring your own domains in CT logs with Certspotter for near-real-time certificate alerts.

Passive OSINT: Mapping the Target Without Touching It

Objective: Understand how authorized red teamers and defenders build a complete external attack-surface picture of an organization using only public, third-party data sources — generating zero packets to target systems — and how defenders run the same exercise against themselves to shrink that exposure.

1. What “Passive” Actually Means

Passive reconnaissance never interacts with the target’s own infrastructure. Every byte you read comes from a third-party aggregator — a registrar’s WHOIS server, a certificate transparency log, Shodan’s index, a breach database, a search-engine cache. The target’s web servers, DNS resolvers, and firewalls log nothing, because you never send them anything. That property is the entire point: passive OSINT leaves no forensic trail on the defender’s systems.

This contrasts directly with Active Scanning (T1595), where you resolve hostnames against the target’s authoritative nameservers, fingerprint services, or port-scan a CIDR. Active scanning touches the target and is logged. T1595 is explicitly out of scope here — it is the technique that begins the moment passive recon ends.

Authorization first. Run these techniques only against organizations you are contractually authorized to assess, within a signed Rules of Engagement (RoE) and defined scope. Querying public databases is legal in most jurisdictions, but acting on harvested credentials or accessing exposed services is not — that is active intrusion, governed by your authorization.

All techniques below map to MITRE ATT&CK Tactic: Reconnaissance (TA0043).

2. The OSINT Intelligence Cycle

Unstructured “Googling the company” wastes time and produces noise. Disciplined OSINT follows a repeatable cycle driven by intelligence requirements defined before any tool runs.

Phase	Activity
Planning	Define intelligence requirements: what assets, people, or exposures matter to the engagement
Collection	Gather raw data from open sources (CT logs, Shodan, DNS, dorks)
Processing	Clean and normalize results, deduplicate, validate sources
Analysis	Link normalized data to the target to determine if an exposure is reachable
Dissemination	Route findings to stakeholders with remediation steps
Continuous Monitoring	Automate the cycle for ongoing exposure enrichment

Below is the full passive source landscape this tutorial works through.

Source Category	Tool / Service	What It Yields
DNS & WHOIS	`dig`, `host`, SecurityTrails	Registrar, nameservers, mail providers, subdomains
Certificate Transparency	`crt.sh`, CertSpotter	Every issued cert — forgotten dev/staging subdomains
Passive DNS	SecurityTrails, CIRCL pDNS	Historical domain-to-IP relationships over time
Scan Databases	Shodan, Censys, ZoomEye	Indexed service banners, open ports, product versions
Search Dorking	Google, Bing (GHDB)	Exposed panels, config files, directory listings
Code Repositories	GitHub, GitLab	Internal hostnames, tooling, leaked secrets
Social / HUMINT	LinkedIn, job boards	Org structure, tech stack, key personnel
Breach Databases	HIBP, DeHashed	Exposed employee credentials
Web Archives	Wayback Machine	Old endpoints and removed infrastructure
BGP / ASN	BGPView, RIPE, ARIN	ASN, owned prefixes, upstream providers
Cloud / Shadow IT	GrayhatWarfare	Exposed S3/Azure/GCP buckets

Circular flow diagram showing the six-stage passive OSINT intelligence cycle: Plan, Collect, Process, Analyze, Disseminate, Monitor, looping back to Plan — Disciplined OSINT follows a repeatable intelligence cycle — starting with defined requirements before any tool runs.

3. Domain & DNS Reconnaissance

Start with the apex domain. WHOIS/RDAP reveals registrar, registration dates, and (where not privacy-protected) ownership contacts. DNS record enumeration against public resolvers — not the target’s nameservers — exposes mail providers, CDN usage, and SPF/DMARC posture.

# Enumerate core DNS records via a public resolver (no packets from your host to the target)
for rr in A MX NS TXT SOA; do
  echo "== $rr =="
  dig +short @1.1.1.1 example.com $rr
done

# MX reveals the mail provider; TXT reveals email-auth posture
dig +short @1.1.1.1 example.com MX          # e.g. *.mail.protection.outlook.com -> M365
dig +short @1.1.1.1 _dmarc.example.com TXT  # p=none vs p=reject tells you spoofability

An MX pointing to mail.protection.outlook.com identifies Microsoft 365; an SPF record ending in ~all instead of -all, or a missing DMARC policy, flags mail-spoofing potential. This covers Domain Properties (T1590.001) and DNS (T1590.002).
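
The SPF and DMARC checks described above reduce to simple string parsing once the TXT records are in hand. This is a minimal sketch, not a full RFC 7208 / RFC 7489 parser: `spf_posture` and `dmarc_policy` are hypothetical helper names, and the input is assumed to be the raw TXT string returned by `dig`.

```python
from typing import Optional


def spf_posture(txt: Optional[str]) -> str:
    """Classify an SPF TXT record by its trailing 'all' mechanism."""
    if not txt or not txt.lower().startswith("v=spf1"):
        return "missing"
    for term in reversed(txt.lower().split()):
        qualifier = term[:-3]
        if term.endswith("all") and qualifier in ("", "+", "-", "~", "?"):
            labels = {"-": "hardfail", "~": "softfail", "?": "neutral"}
            return labels.get(qualifier, "pass-all")
    return "no-all"


def dmarc_policy(txt: Optional[str]) -> str:
    """Return the p= tag of a DMARC record, or 'missing' if absent."""
    if not txt or not txt.lower().startswith("v=dmarc1"):
        return "missing"
    for tag in txt.split(";"):
        key, _, value = tag.strip().partition("=")
        if key.strip().lower() == "p":
            return value.strip().lower()
    return "missing"
```

A result of `softfail` combined with a DMARC policy of `none` or `missing` is the combination the paragraph above flags as spoofable.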

4. Certificate Transparency & Subdomain Enumeration

Certificate Transparency is defined in RFC 6962, and browser policies (Chrome, Apple) effectively require every publicly trusted CA to log the certificates it issues to append-only, monitorable CT logs. That means nearly every publicly trusted TLS certificate issued for a domain in recent years is searchable, including certs for staging, dev, and legacy VPN subdomains defenders forgot existed.

# Pull every CT-logged cert for *.example.com and extract unique hostnames
curl -s 'https://crt.sh/?q=%25.example.com&output=json' \
  | jq -r '.[].name_value' \
  | sed 's/\*\.//g' \
  | sort -u

Aggregators wrap CT and dozens of other passive feeds. Run them in passive mode so they never resolve against the target:

# Passive mode: third-party data sources only, no resolution against the target
amass enum -passive -d example.com -o subs.txt

# Contrast: 'amass enum -active' resolves and brute-forces -> NOT passive, out of scope

Forgotten subdomains like legacy-vpn.example.com or jenkins-dev.example.com are gold: they often run unpatched software outside the patch-management lifecycle. This is Digital Certificates (T1596.003).
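
The `curl | jq` extraction earlier in this section can also be done in Python on a saved response, which makes scoping explicit. A small sketch, assuming the body has the crt.sh JSON shape used above (a list of objects with a `name_value` field); `hostnames_from_crtsh` is a hypothetical name.

```python
import json
from typing import List


def hostnames_from_crtsh(raw_json: str, apex: str) -> List[str]:
    """Extract unique in-scope hostnames from a crt.sh JSON response body."""
    apex = apex.lower().rstrip(".")
    found = set()
    for record in json.loads(raw_json):
        # name_value may hold several names separated by newlines.
        for name in record.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]
            # Keep only the apex and true subdomains of it.
            if name == apex or name.endswith("." + apex):
                found.add(name)
    return sorted(found)
```

The suffix check with a leading dot matters: it drops lookalikes such as `example.com.evil.net` that a plain substring match would keep.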

5. Internet-Wide Scan Databases: Shodan & Censys

Shodan and Censys continuously scan the public IPv4 space and index the banner metadata that devices return on open ports: web servers, routers, databases, ICS/OT, cloud instances. Querying their index is fully passive: their scanners touched the target earlier; you only read the stored result.

import shodan

api = shodan.Shodan("YOUR_API_KEY")          # querying Shodan's index, not the target
results = api.search('org:"Example Corp"')

print(f"Total results: {results['total']}")
for host in results["matches"][:25]:          # respect plan rate limits
    ip   = host["ip_str"]
    port = host["port"]
    prod = host.get("product", "")
    print(f"{ip}:{port}\t{prod}")

Pivot with filters: org:, asn:AS64500, port:3389, product:Elasticsearch. Exposed RDP (3389), unauthenticated Elasticsearch (9200), and VPN gateway banners directly enumerate Software (T1592.002), Hardware (T1592.001), and Network Security Appliances (T1590.006). Cross-reference Shodan IPs with your CT-derived subdomains to attach service data to named hosts. This is Scan Databases (T1596.005).
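
The cross-referencing step can be sketched as a simple join. This assumes two inputs you already hold: a hostname-to-IP mapping taken from passive DNS (not from live resolution) and a list of match dictionaries shaped like the `results["matches"]` entries above. `attach_services` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def attach_services(host_ips: Mapping[str, Iterable[str]],
                    matches: Iterable[Mapping]) -> Dict[str, List[str]]:
    """Map each hostname to 'ip:port product' strings from Shodan-style matches."""
    by_ip: Dict[str, List[str]] = defaultdict(list)
    for m in matches:
        label = f"{m['ip_str']}:{m['port']} {m.get('product', '')}".strip()
        by_ip[m["ip_str"]].append(label)
    return {
        host: sorted(s for ip in ips for s in by_ip.get(ip, []))
        for host, ips in host_ips.items()
    }
```

Hosts that come back with an empty list are still useful: they are named assets with no indexed services, which narrows where to look next.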

6. Search Engine Dorking (Google Hacking)

Search operators surface content the target inadvertently exposed and the engine indexed. The Google Hacking Database (GHDB) catalogs thousands of proven patterns. Use these manually in a browser — automated scraping violates ToS and risks blocking.

Dork	What It Finds
`site:example.com filetype:pdf`	Public documents (then mine metadata)
`site:example.com intitle:"index of"`	Open directory listings
`site:example.com inurl:admin`	Login / admin panels
`site:example.com filetype:env OR filetype:cfg`	Exposed config files
`site:example.com intext:"sql syntax near"`	Error messages leaking internals

Combining site: with intitle:, inurl:, and filetype: is remarkably effective. Bing serves as a secondary index that sometimes retains content Google dropped. This covers Search Engines (T1593.002) and Search Victim-Owned Websites (T1594).

7. Code Repository Mining

Public repositories leak internal hostnames, tooling, and — too often — live secrets. Search GitHub/GitLab for the org name, email domains, and internal hostnames discovered earlier. For your own repositories, run secret scanners in CI:

# Audit your OWN org's repos for committed secrets (defensive use)
trufflehog github --org=example-corp --only-verified
gitleaks detect --source . --report-format sarif --report-path leaks.sarif

Commit history and job postings reveal the technology stack. This is Code Repositories (T1593.003).

8. Organizational & Personnel Intelligence

LinkedIn is the most complete public database of an organization’s employees — org structure, reporting lines, and the technology stack advertised in job postings. Combine with theHarvester and Hunter.io to derive the email-address convention (first.last@), feeding social-engineering target lists.
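
Deriving the address convention from a few confirmed samples is a voting problem. The sketch below is illustrative only and is meant for auditing your own organization's exposure: `PATTERNS`, `infer_format`, and `apply_format` are hypothetical names, and the four patterns are an assumed subset of common corporate formats.

```python
from collections import Counter
from typing import Iterable, Optional, Tuple

# Assumed subset of common local-part conventions.
PATTERNS = {
    "first.last": lambda f, l: f"{f}.{l}",
    "flast": lambda f, l: f"{f[0]}{l}",
    "firstl": lambda f, l: f"{f}{l[0]}",
    "first": lambda f, l: f,
}


def infer_format(samples: Iterable[Tuple[str, str, str]]) -> Optional[str]:
    """samples: (first, last, confirmed_email). Return the most common pattern name."""
    votes: Counter = Counter()
    for first, last, email in samples:
        local = email.split("@", 1)[0].lower()
        f, l = first.lower(), last.lower()
        for name, build in PATTERNS.items():
            if f and l and build(f, l) == local:
                votes[name] += 1
                break
    return votes.most_common(1)[0][0] if votes else None


def apply_format(pattern: str, first: str, last: str, domain: str) -> str:
    """Build an address for a name using a previously inferred pattern."""
    return f"{PATTERNS[pattern](first.lower(), last.lower())}@{domain}"
```

Requiring several agreeing samples before trusting the result avoids extrapolating from a single alias or shared mailbox.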

Public documents carry metadata that maps directly to usernames and software versions:

import subprocess, json

out = subprocess.run(
    ["exiftool", "-j", "report.pdf"], capture_output=True, text=True
).stdout
meta = json.loads(out)[0]

for f in ("Creator", "Author", "LastModifiedBy", "Producer"):
    if f in meta:
        print(f"{f}: {meta[f]}")   # e.g. Author: jsmith / Producer: Acrobat 15.0

This covers Determine Physical Locations (T1591.001), Identify Roles (T1591.004), Employee Names (T1589.003), and Email Addresses (T1589.002).

9. Breach Data & Credential Exposure

Have I Been Pwned, DeHashed, and credential-log collections reveal when employee credentials have been exposed in third-party breaches — frequently before those credentials are weaponized in credential-stuffing.

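import requests

domain  = "example.com"   # a domain you control and have verified with HIBP
headers = {"hibp-api-key": "YOUR_API_KEY", "user-agent": "authorized-recon"}
url     = f"https://haveibeenpwned.com/api/v3/breacheddomain/{domain}"

resp = requests.get(url, headers=headers, timeout=30)
resp.raise_for_status()   # fail loudly on auth or rate-limit errors

# The response maps each mailbox alias (the part before @) to its breach names
for alias, breaches in resp.json().items():
    print(alias, "->", ", ".join(breaches))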

Report the breach name, date, and exposed data classes (passwords, hashes, MFA seeds). Never reuse harvested credentials outside RoE scope — possession is recon; authentication is intrusion. This is Credentials (T1589.001).
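
For reporting, the alias-to-breaches mapping is easier to read once aggregated. A short sketch, assuming a dictionary in the same shape the loop above iterates over; `breach_counts` and `most_exposed` are hypothetical helper names.

```python
from collections import Counter
from typing import Dict, List, Tuple


def breach_counts(domain_result: Dict[str, List[str]]) -> List[Tuple[str, int]]:
    """Number of affected aliases per breach, largest first."""
    counts = Counter(b for breaches in domain_result.values() for b in breaches)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


def most_exposed(domain_result: Dict[str, List[str]], top: int = 5) -> List[str]:
    """Aliases that appear in the most breaches, ties broken alphabetically."""
    ranked = sorted(domain_result.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [alias for alias, _ in ranked[:top]]
```

The first list prioritizes which breaches to describe in the report; the second identifies accounts that warrant a forced password reset first.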

10. BGP, ASN & IP Range Mapping

To bound the network footprint, resolve a known IP to its origin ASN, then enumerate every prefix that ASN announces. This delineates owned IP space, co-location, and cloud presence — without scanning a single host.

# 1. Resolve an IP to its origin ASN (Team Cymru WHOIS)
whois -h whois.cymru.com " -v 203.0.113.10"

# 2. Enumerate prefixes announced by that ASN
whois -h whois.radb.net -- '-i origin AS64500' | grep -E '^route:'

# 3. Or use the BGPView API for prefixes + upstreams
curl -s https://api.bgpview.io/asn/64500/prefixes | jq -r '.data.ipv4_prefixes[].prefix'

This maps Network Trust Dependencies (T1590.003), IP Addresses (T1590.005), and Network Topology (T1590.004).
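
Once the announced prefixes are known, checking which previously collected IPs fall inside owned space needs only the standard library. A minimal sketch; `in_scope` is a hypothetical name and the addresses in the usage note are documentation ranges.

```python
import ipaddress
from typing import Iterable, List


def in_scope(ips: Iterable[str], prefixes: Iterable[str]) -> List[str]:
    """Return the IPs that fall inside any announced prefix."""
    nets = [ipaddress.ip_network(p, strict=False) for p in prefixes]
    hits = []
    for ip in ips:
        addr = ipaddress.ip_address(ip)
        # Compare only within the same address family.
        if any(addr.version == n.version and addr in n for n in nets):
            hits.append(ip)
    return hits
```

For example, `in_scope(["203.0.113.10", "198.51.100.5"], ["203.0.113.0/24"])` keeps only the first address; the second is likely third-party hosting and may sit outside the authorized scope.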

11. Correlating the Picture: Building a Target Profile

The real power emerges when you correlate intelligence across platforms — joining a CT-derived subdomain to a Shodan banner to an ASN prefix to a breached employee account reveals patterns invisible in any single source. Document findings in a structured, repeatable report.

# OSINT Target Profile — <Engagement ID>
## 1. Scope & Authorization        # RoE ref, authorized domains/ASNs, date window
## 2. Domain & DNS                 # registrar/RDAP, NS, MX, SPF/DKIM/DMARC posture
## 3. Subdomains (CT + passive)    # host | source | state (live / parked / dev)
## 4. Exposed Services (Shodan)    # ip:port | product | version | notes
## 5. Network Footprint            # ASN | prefixes | hosting / cloud providers
## 6. Personnel & Org              # key roles | tech stack | SE surface
## 7. Credential Exposure          # breach | date | data classes | accounts
## 8. Risk Summary & Recommendations

Correlation across sources exposes attack paths invisible in any single dataset — a forgotten subdomain linked to a Shodan banner and a breached credential is a reachable exploit chain.
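
One way to make that correlation mechanical is to tag every finding with its asset and source, then surface assets backed by several independent sources. The `Finding` record and `correlated_assets` function below are hypothetical names for this sketch, and the source labels are arbitrary strings.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set


@dataclass(frozen=True)
class Finding:
    asset: str    # hostname the finding is attached to
    source: str   # e.g. "ct", "shodan", "breach"
    detail: str   # free-text evidence for the report


def correlated_assets(findings: Iterable[Finding],
                      min_sources: int = 2) -> Dict[str, List[str]]:
    """Assets backed by at least `min_sources` distinct source types."""
    sources: Dict[str, Set[str]] = defaultdict(set)
    for f in findings:
        sources[f.asset.lower()].add(f.source)
    return {
        asset: sorted(srcs)
        for asset, srcs in sorted(sources.items())
        if len(srcs) >= min_sources
    }
```

Assets that clear the threshold are the ones to lead with in the Risk Summary section of the profile.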

12. Common Attacker Techniques

Technique	Description
CT-log subdomain harvesting	Mine `crt.sh` for forgotten dev/staging/VPN hosts
Passive DNS pivoting	Use historical IP↔domain data to map shared infrastructure
Shodan/Censys banner mining	Identify exposed RDP, databases, and VPN gateways
Google dorking	Surface exposed configs, panels, and error leaks
Repo secret mining	Recover API keys and hostnames from public commits
LinkedIn org mapping	Build personnel and tech-stack intelligence for phishing
Breach-data correlation	Match exposed credentials to active employee accounts
Document metadata extraction	Derive usernames/software from public PDFs and DOCX

These feed downstream Initial Access — phishing the personnel map, password-spraying the breach list, or exploiting the exposed service — all of which occur after passive recon and are logged.

13. Defensive Strategies & Detection

Framing: True passive OSINT generates no logs on your systems — the adversary only queries third-party databases. Defense therefore shifts to attack-surface reduction and detecting downstream use of harvested intelligence, not the recon itself.

What Defenders Can Detect (Indirect Signals)

Signal	Mechanism	Notes
New certificate issuance	CT monitors: CertSpotter, `crt.sh` alerts	Subscribe to alerts for new certs on your domains — proactive
Shodan/Censys indexing	Not real-time; scanner IP ranges are published	Block known scanner ranges to reduce exposure
Downstream credential use	Windows Security `EventID 4625` / `4624` / `4648`	HIBP-known creds appearing in auth logs = stuffing/breach
Leaked secrets	GitHub secret scanning, `truffleHog`/`gitleaks` in CI	Detect before attackers do
Active DNS recon	Defender for Identity `DnsReconnaissanceSecurityAlert`	Catches active DNS recon — not passive external OSINT

Sigma Sketch — Downstream Credential Use

title: Possible Use of OSINT-Harvested Credentials
logsource:
  product: windows
  service: security
detection:
  selection:
    EventID:
      - 4625   # Failed logon (credential stuffing)
      - 4648   # Explicit credential logon (harvested creds / PtH)
      - 4768   # Kerberos AS-REQ with harvested identity
  timeframe: 10m
  condition: selection | count() by IpAddress > 20   # IpAddress is the source field in 4625/4648/4768
level: high

(Add environment-specific thresholds and allow-lists before deployment.)
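
The same idea can be narrowed to accounts already known to be exposed. The sketch below assumes authentication events exported as dictionaries with `EventID` and `TargetUserName` keys, plus a list of breached aliases from your own domain search; `failed_logons_for_exposed` is a hypothetical name.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def failed_logons_for_exposed(events: Iterable[Mapping],
                              exposed_aliases: Iterable[str],
                              min_failures: int = 5) -> Dict[str, int]:
    """Count EventID 4625 failures per exposed account; keep those at threshold."""
    exposed = {a.lower() for a in exposed_aliases}
    counts: Counter = Counter()
    for ev in events:
        user = str(ev.get("TargetUserName", "")).lower()
        if ev.get("EventID") == 4625 and user in exposed:
            counts[user] += 1
    return {u: n for u, n in counts.items() if n >= min_failures}
```

Filtering on the exposed list first keeps the alert volume low, since routine mistyped passwords from unexposed accounts never enter the count.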

Hardening / Attack Surface Reduction

Mitigation	Description
Prune DNS & avoid telling names	Purge stale records; don’t name hosts `staging-db.example.com`
Wildcard certificates	Reduce per-subdomain CT exposure
CT + brand monitoring	Alert on new subdomains, certs, and leaked references
Email auth hardening	Enforce SPF `-all`, DKIM, and DMARC `p=reject`
Repo secret scanning	Enable GitHub push protection; run `gitleaks` in CI
Monthly Shodan/Censys review	Audit your own ASN; remediate unexpected ports
HIBP Domain Search	Enroll in breach-notification API alerts
Metadata stripping	`exiftool -all= file.pdf` before publishing (ExifTool’s PDF edits are reversible, so rewrite the file afterwards, e.g. `qpdf --linearize`)
RDAP/registrar privacy	Reduce WHOIS exposure where legally permissible
Policy review	Curb LinkedIn oversharing; manage domain lifecycle

A defender running this exact exercise against their own organization has a structural advantage: OSINT needs no change-approval window because it touches no production systems, so perimeter assessment carries zero operational impact.

Hierarchy diagram splitting passive OSINT defense into two pillars: Attack Surface Reduction (CT monitoring, DMARC, secret scanning, metadata stripping) and Downstream Detection (Windows EventID 4625 and 4648) — Because passive recon leaves no logs on your systems, defense pivots to reducing exposed surface area and detecting when harvested intelligence is weaponized downstream.

14. Tools for OSINT Analysis

Tool	Description	Link
Amass	Passive subdomain enumeration & mapping	owasp.org
subfinder	Fast passive subdomain discovery	projectdiscovery.io
theHarvester	Emails, names, subdomains from public sources	github.com
Shodan / Censys	Internet-wide scan databases	shodan.io
Recon-ng / SpiderFoot	Modular OSINT automation frameworks	spiderfoot.net
crt.sh	Certificate transparency search	crt.sh
SecurityTrails	Passive DNS & historical records	securitytrails.com
Have I Been Pwned	Breach & credential exposure	haveibeenpwned.com
truffleHog / gitleaks	Repo secret scanning	github.com
exiftool	Document metadata extraction	exiftool.org
BGPView	ASN & prefix enumeration	bgpview.io
Wayback Machine	Historical web snapshots	archive.org

15. MITRE ATT&CK Mapping

All techniques fall under Reconnaissance (TA0043). T1595 Active Scanning is out of scope.

Technique	MITRE ID	Detection / Reduction
Gather Victim Network Information	`T1590` (.001–.006)	Prune DNS; reduce WHOIS/CT exposure
Gather Victim Org Information	`T1591` (.001–.004)	LinkedIn-oversharing policy
Gather Victim Host Information	`T1592` (.001–.004)	Monthly Shodan/Censys self-audit
Search Open Websites/Domains	`T1593` (.001–.003)	Repo secret scanning; dork your own site
Search Victim-Owned Websites	`T1594`	Remove exposed configs/listings
Search Open Technical Databases	`T1596` (.001–.005)	CT monitoring; passive-DNS hygiene
Gather Victim Identity Information	`T1589` (.001–.003)	HIBP alerts; auth monitoring (4625/4648)

Summary

Passive OSINT maps an organization’s entire external attack surface using only third-party data, generating zero packets — and therefore zero logs — on the target.
The disciplined intelligence cycle (plan → collect → process → analyze → disseminate → monitor) turns scattered searches into a correlated target profile across DNS, CT logs, scan databases, repos, personnel, and breach data.
Correlation is the multiplier: joining a forgotten subdomain to a Shodan banner to a breached credential reveals reachable exposure invisible in any single source.
Because passive recon is undetectable on your systems, defense means attack-surface reduction — CT monitoring, DMARC p=reject, secret scanning, metadata stripping — plus detecting downstream credential use via Windows EventID 4625/4648.
All techniques map to ATT&CK Reconnaissance (TA0043); the boundary is T1595 Active Scanning, which begins the moment you touch the target directly.
