Active OSINT: DNS, Certificate Transparency, and Subdomain Enumeration
Objective: Understand how an authorized red teamer methodically maps an organization’s external DNS attack surface — from zero-noise passive Certificate Transparency mining to active brute-force resolution — and how defenders detect each technique at the protocol, log, and SIEM level.
1. Why Subdomain Enumeration Matters: The Attack Surface Problem
An organization’s externally reachable footprint is rarely the handful of hostnames it advertises. Missed subdomains mean missed attack surface: forgotten admin panels, staging environments, internal APIs accidentally exposed, and legacy services that were never meant to be public. Each undiscovered host is a node the defender is not monitoring and the operator can pivot through.
Enumeration is a multi-source intelligence-gathering process, not a single tool run. A mature workflow combines passive aggregation, public technical databases, and active resolution to build the most complete asset inventory possible. The skill is sequencing those techniques from quietest to loudest so the operator controls exactly how much signal they generate.
All techniques below fall under MITRE’s Reconnaissance tactic (TA0043). Run them only inside an authorized scope.
2. DNS Primer for Red Teamers: Records, Zones, and Resolvers
DNS resolution flows through a chain: a recursive resolver queries the root, then the TLD nameservers, then the authoritative NS for the zone. The authoritative server holds the records that matter to recon. Each record type leaks distinct intelligence.
| Record | Function |
|---|---|
| A / AAAA | IPv4 / IPv6 address mapping for a hostname |
| CNAME | Canonical name alias; critical for subdomain takeover identification |
| MX | Mail exchange; reveals mail infrastructure and phishing pivot targets |
| NS | Authoritative nameserver; identifies zone ownership and AXFR targets |
| TXT | Freeform text: SPF (v=spf1), DKIM, DMARC (v=DMARC1); verification tokens often expose third-party services |
| SOA | Start of Authority: primary NS, contact email, serial, refresh, retry, expire, minimum TTL |
| PTR | Reverse DNS; maps IP → hostname, used in reverse-range sweeps |
| SRV | Service locator; reveals app-layer services (_ldap._tcp, _sip._tcp) |
Enumerate record types directly with dig:
dig A target.com +short
dig NS target.com +short
dig MX target.com +short
dig TXT target.com +short # SPF/DMARC reveal third-party SaaS
dig SOA @ns1.target.com target.com
TXT recon is high-value: SPF includes (include:_spf.salesforce.com) and verification tokens fingerprint exactly which cloud and SaaS providers an organization uses.
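To make the TXT pivot concrete, here is a minimal sketch that pulls the include: domains out of SPF records. The function name spf_includes and the sample records are illustrative assumptions; it expects TXT strings in the quoted form dig prints and does not handle long records that dig splits into several quoted chunks.

```python
def spf_includes(txt_records):
    """Return the domains named by include: mechanisms in any SPF record."""
    providers = []
    for record in txt_records:
        # dig prints TXT data wrapped in double quotes; strip them first.
        text = record.strip().strip('"')
        if not text.lower().startswith("v=spf1"):
            continue
        for term in text.split()[1:]:
            # Drop an optional qualifier (+, -, ~, ?) before the mechanism name.
            mechanism = term.lstrip("+-~?")
            if mechanism.lower().startswith("include:"):
                providers.append(mechanism.split(":", 1)[1].lower())
    return providers


# Hypothetical sample data, not taken from a real zone.
sample = [
    '"v=spf1 include:_spf.salesforce.com include:spf.protection.outlook.com ~all"',
    '"google-site-verification=abc123"',
]
print(spf_includes(sample))
```

Each returned domain is a lead on a third-party mail or SaaS provider; the mapping from include domain to vendor still has to be confirmed by hand.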
3. Zone Transfer Attacks (AXFR/IXFR): When DNS Gives It All Away
A zone transfer exists so a secondary nameserver can replicate a zone from the primary. A full transfer is DNS query type AXFR; an incremental transfer is IXFR. If an authoritative server answers an AXFR from an unauthorized client, it dumps the entire zone — every record, in one transaction.
dig axfr @ns1.target.com target.com
A correctly hardened server returns Transfer failed. or a refusal. A misconfigured one returns the full record set. dnsrecon automates the test across all discovered nameservers:
dnsrecon -d target.com -t axfr
Most modern configurations restrict AXFR to whitelisted secondary IPs, so success is rare, but the cost of the check is one query, and a hit collapses the entire enumeration phase into a single response.
4. Certificate Transparency: The Unintentional Subdomain Registry
Certificate Transparency (CT), defined in RFC 6962, is an open framework of public append-only logs recording certificates issued by publicly trusted CAs. Major browsers such as Chrome and Safari reject certificates that lack signed certificate timestamps (SCTs) from multiple CT logs, so CAs log essentially everything they issue. The side effect: a comprehensive, searchable record of the hostnames those certificates cover.
Two fields carry the intelligence: the Common Name (CN) and the Subject Alternative Names (SANs). SANs are the modern standard for declaring which domains a certificate covers, and a single certificate can list dozens of subdomains. crt.sh exposes both through its name_value field.
Query the JSON API with a % wildcard prefix and extract uniques:
import requests
def crtsh_subdomains(domain):
    # Pass the query via params so the % wildcard is URL-encoded as %25
    params = {"q": f"%.{domain}", "output": "json"}
    r = requests.get("https://crt.sh/", params=params, timeout=30)
    r.raise_for_status()
    subs = set()
    for entry in r.json():
        for name in entry["name_value"].splitlines():
            name = name.strip().lower()
            # strip only a leading wildcard label, not arbitrary '*' or '.' characters
            subs.add(name[2:] if name.startswith("*.") else name)
    return sorted(subs)
for s in crtsh_subdomains("target.com"):
    print(s)
For large zones, query the backing PostgreSQL database directly, which is often faster than the web frontend for big result sets:
import psycopg2
conn = psycopg2.connect(host="crt.sh", port=5432, dbname="certwatch", user="guest")
conn.autocommit = True  # the public endpoint is commonly reported to reject explicit transactions
cur = conn.cursor()
# Table and column names follow this tutorial's example; the crt.sh schema may have changed, so verify before relying on it
cur.execute("""
    SELECT ci.NAME_VALUE FROM certificate_identity ci
    WHERE ci.NAME_TYPE = 'dNSName'
    AND reverse(lower(ci.NAME_VALUE)) LIKE reverse(lower(%s));
""", ("%.target.com",))
subs = {row[0].lower().removeprefix("*.") for row in cur.fetchall()}  # Python 3.9+
print("\n".join(sorted(subs)))
NAME_TYPE = 'dNSName' filters to DNS SANs only. Other CT aggregators include Censys (search.censys.io), Facebook CT (developers.facebook.com/tools/ct/), and the Google Transparency Report. CT logs ingest within minutes of issuance; crt.sh and Certspotter typically surface new certificates within a few hours.
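CT results usually need cleaning before they are useful: wildcard entries, mixed case, and look-alike names that merely end in the same characters as the target. The sketch below shows one way to normalize name_value strings and keep only in-scope hosts. The function name normalize_ct_names is an assumption of this example, not part of any crt.sh client.

```python
def normalize_ct_names(name_values, domain):
    """Flatten crt.sh-style name_value strings into in-scope hostnames."""
    domain = domain.lower().rstrip(".")
    hosts = set()
    for value in name_values:
        for name in value.splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # drop only a leading wildcard label
            # Keep the apex and true subdomains; reject look-alikes such as
            # "nottarget.com" that merely end with the same characters.
            if name == domain or name.endswith("." + domain):
                hosts.add(name)
    return sorted(hosts)


# Hypothetical name_value strings in the newline-separated form crt.sh returns.
raw = ["*.dev.target.com\nwww.target.com", "nottarget.com", "API.Target.com"]
print(normalize_ct_names(raw, "target.com"))
```

The suffix check with a leading dot is the important design choice: a plain endswith(domain) would let unrelated domains into the scope list.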

5. WHOIS, RDAP, and ASN Enumeration: Mapping the IP Estate
WHOIS data for domain names is held by registries and registrars, while Regional Internet Registries (RIRs) hold the records for IP address and ASN allocations. RDAP (Registration Data Access Protocol, specified in the RFC 7480 series) is the modern JSON-based successor. Domain WHOIS and RDAP reveal registrar, creation/expiry dates, nameservers, and registrant organization.
whois target.com # registrar, NS, creation date, registrant org
curl -s https://rdap.verisign.com/com/v1/domain/target.com | jq '.nameservers, .entities'
The entities and nameservers arrays in RDAP output map cleanly to the org and infrastructure you correlate elsewhere. From the registrant org you pivot to ASN enumeration via RIPE/ARIN to discover owned IP blocks, then run reverse PTR sweeps across those ranges to recover hostnames not present in any forward record.
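A reverse PTR sweep starts by turning each address in an owned block into its in-addr.arpa (or ip6.arpa) query name. This is a small standard-library sketch of that step only; the function name ptr_names and the limit safety cap are assumptions of this example, and the actual PTR queries would be issued by a resolver tool such as dnsx.

```python
import ipaddress


def ptr_names(cidr, limit=256):
    """Yield (ip, reverse-lookup name) pairs for the hosts in a CIDR block."""
    network = ipaddress.ip_network(cidr, strict=False)
    for count, ip in enumerate(network.hosts()):
        if count >= limit:  # hypothetical safety cap to avoid huge sweeps
            break
        yield str(ip), ip.reverse_pointer


# 192.0.2.0/24 is a documentation range, used here as a stand-in.
for ip, name in ptr_names("192.0.2.0/30"):
    print(ip, name)
```

Keeping the cap explicit makes it harder to sweep a /8 by accident; in a real engagement the ranges fed in should come only from the authorized scope.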
6. Passive DNS Aggregation: Intelligence Without Touching the Target
Passive DNS datasets store historical resolution data harvested by third parties. Querying them yields subdomains without your operator ever touching the target’s infrastructure — zero target-side signal.
| Tool | Role |
|---|---|
| subfinder | Passive OSINT aggregator across CT logs, passive DNS, APIs |
| amass (enum) | Deep multi-source enumeration; passive mode plus ASN enumeration |
| theHarvester | OSINT gathering for emails, names, subdomains, IPs, URLs from public sources |
| bbot | Recon framework that correlates infrastructure relationships, not just names |
Primary data sources include PassiveTotal/RiskIQ, VirusTotal, SecurityTrails, Shodan, and Censys. Most require API keys configured in the tool’s provider file.
subfinder -d target.com -all -o subs_passive.txt
amass enum -passive -d target.com -o subs_amass.txt
theHarvester -d target.com -b crtsh,bing,duckduckgo
amass is often misunderstood but offers unmatched depth when configured correctly; its passive mode remains a valid quiet alternative to active collection.
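Because several passive tools report overlapping names, it helps to merge their outputs while remembering which source saw each host. The following is one possible sketch; merge_sources and the dictionary layout (tool name mapped to a list of hostnames) are assumptions of this example rather than any tool's output format.

```python
from collections import defaultdict


def merge_sources(results_by_tool):
    """Map each unique hostname to the sorted list of tools that reported it."""
    seen = defaultdict(set)
    for tool, hosts in results_by_tool.items():
        for host in hosts:
            host = host.strip().lower().rstrip(".")
            if host:  # skip blank lines left over from tool output
                seen[host].add(tool)
    return {host: sorted(tools) for host, tools in sorted(seen.items())}


# Hypothetical results standing in for the contents of the tools' output files.
merged = merge_sources({
    "subfinder": ["api.target.com", "WWW.target.com."],
    "amass": ["www.target.com", " "],
})
for host, tools in merged.items():
    print(host, ",".join(tools))
```

A host confirmed by multiple independent sources is generally a stronger lead than one seen by a single aggregator, which is why the source list is kept rather than discarded.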
7. Active DNS Brute-Force: Wordlists, Resolvers, and Wildcard DNS
Active techniques directly interact with the target’s DNS infrastructure. The core mechanic: iterate a wordlist, prepend each word as a label (dev.target.com), issue an A/AAAA query, and record responses.
| Tool | Primary Mechanic |
|---|---|
| massdns | High-throughput async resolver via custom resolver list |
| puredns | massdns wrapper with wildcard detection and deduplication |
| shuffledns | massdns brute-forcer with valid-resolver shuffling |
| dnsx | DNS probing and record-type enumeration |
| gobuster dns | Wordlist DNS brute force |
| dnsenum | Zone transfer attempts plus brute-force |
The critical hazard is wildcard DNS: if *.target.com resolves to a catch-all IP, every guess returns a positive. Tools must detect and filter this. puredns handles wildcard detection and deduplication natively:
puredns bruteforce wordlist.txt target.com \
    -r resolvers.txt -w resolved.txt
Resolver selection matters: use a curated list of validated public resolvers (e.g., trickest/resolvers) so queries distribute and stay accurate. Wordlists drive coverage: SecLists dns-Jhaddix.txt and Commonspeak2 are standard. Distributing queries across many resolvers also smears per-source detection thresholds.
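The wildcard hazard can be illustrated with a simplified check: resolve a few random labels that should not exist, treat any answers as the catch-all set, and drop candidates whose answers fall entirely inside it. This is a sketch of the general idea, not puredns's actual algorithm. The resolve argument is a hypothetical callable returning a list of IP strings (empty when the name does not exist), so any resolver library can be plugged in.

```python
import secrets


def wildcard_ips(domain, resolve, probes=3):
    """Resolve random labels that should not exist; any answers indicate a wildcard."""
    answers = set()
    for _ in range(probes):
        label = secrets.token_hex(8)  # 16 hex chars, very unlikely to be a real host
        answers.update(resolve(f"{label}.{domain}"))
    return answers


def filter_wildcards(candidates, domain, resolve):
    """Keep candidates whose answers are not fully explained by the wildcard."""
    catch_all = wildcard_ips(domain, resolve)
    kept = {}
    for host in candidates:
        ips = set(resolve(host))
        if ips and not ips <= catch_all:
            kept[host] = sorted(ips)
    return kept


def fake_resolve(name):
    """Stand-in resolver: one real host, everything else hits a catch-all IP."""
    real = {"dev.target.com": ["203.0.113.7"]}
    return real.get(name, ["198.51.100.1"])


print(filter_wildcards(["dev.target.com", "nope.target.com"], "target.com", fake_resolve))
```

A real host that happens to share the catch-all address would be discarded by this simple rule, which is one reason production tools use more elaborate per-level wildcard tests.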
8. Permutation and Mutation: Finding What Brute-Force Misses
Brute-force only finds words in your list. Permutation generates variants of already-discovered subdomains — taking api and producing api-dev, api-v2, api-staging, internal-api. altdns and dnsgen perform this mutation.
PATTERNS = ["dev", "staging", "prod", "v2", "internal", "test"]
def mutate(known_subs, base):
    out = set()
    for host in known_subs:
        if not host.endswith(f".{base}"):
            continue  # skip the apex and out-of-scope names
        label = host[: -len(base) - 1]  # strip only the trailing ".base"
        for p in PATTERNS:
            out.add(f"{label}-{p}.{base}")  # api -> api-dev.target.com
            out.add(f"{p}-{label}.{base}")  # api -> dev-api.target.com
    return out
# feed mutations back into dnsx for resolution
Pipe the generated candidates straight into dnsx to keep only the names that resolve. Permutation routinely surfaces staging hosts that follow internal naming conventions no public wordlist contains.
9. Chaining It Together: A Full Enumeration Workflow
The value is in the pipeline. Aggregate names, resolve them, probe live services, then validate. Each stage adds a column of intelligence:
subfinder -d target.com -o subs.txt # passive aggregation
dnsx -l subs.txt -a -resp -o resolved_records.txt # keep only names that resolve, with their A answers
cut -d' ' -f1 resolved_records.txt | sort -u > resolved.txt # bare hostnames for httpx (assumes dnsx prints the name first)
httpx -l resolved.txt -title -status-code -tech-detect \
    -o live.txt # live HTTP fingerprint
subfinder supplies the candidate set, dnsx discards dead names and records the answers, and httpx confirms which hosts serve HTTP, their titles, status codes, and detected technologies. Downstream, aquatone or gowitness screenshot each live host for triage at scale, and subjack checks for takeover. CT logs and passive DNS feed the top of the funnel; active brute-force and permutation widen it; HTTP probing and screenshotting prioritize what to investigate.

10. Subdomain Takeover: From Dangling CNAME to Claimed Asset
Enumeration frequently uncovers dangling CNAMEs — a subdomain whose CNAME points to a deprovisioned cloud service (GitHub Pages, Heroku, AWS S3, Azure, Fastly). If the operator can re-register that external resource, they serve content from the victim’s trusted subdomain. This is the primary takeover vector.
subjack fingerprints CNAME chains against known-vulnerable service responses:
subjack -w resolved.txt -t 100 -timeout 30 \
    -c fingerprints.json -v
A positive result means a subdomain’s CNAME chain terminates at an unclaimed external resource. In an authorized engagement, validate the finding against the can-i-take-over-xyz reference list and report it through responsible disclosure. Do not claim the resource unless the rules of engagement explicitly permit proof-of-concept takeover.
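Before running a fingerprinting tool, a first triage pass can simply list which subdomains have CNAMEs pointing at third-party hosting services. The sketch below does only that. The suffix table is a short illustrative list, not subjack's fingerprint set, and a match is a candidate for manual review rather than a confirmed takeover, since confirmation depends on each service's "unclaimed" response.

```python
# Illustrative suffixes only; real fingerprints also check the service's
# "unclaimed resource" response, which this sketch does not do.
SERVICE_SUFFIXES = {
    "github.io": "GitHub Pages",
    "herokuapp.com": "Heroku",
    "s3.amazonaws.com": "AWS S3",
    "azurewebsites.net": "Azure App Service",
}


def third_party_cnames(cname_map):
    """Return (host, cname_target, service) for CNAMEs into listed services."""
    findings = []
    for host, target in sorted(cname_map.items()):
        target = target.lower().rstrip(".")
        for suffix, service in SERVICE_SUFFIXES.items():
            if target == suffix or target.endswith("." + suffix):
                findings.append((host, target, service))
    return findings


# Hypothetical CNAME data, e.g. collected with a dnsx CNAME query.
cnames = {
    "shop.target.com": "target-shop.herokuapp.com.",
    "www.target.com": "www.target.com.cdn.example.net",
    "docs.target.com": "Target.GitHub.io",
}
for row in third_party_cnames(cnames):
    print(row)
```

The same list is useful defensively: every row is a record that should be removed as soon as the external resource is deprovisioned.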
11. Common Attacker Techniques
| Technique | Description |
|---|---|
| Zone transfer (AXFR) | Dump an entire zone from a misconfigured authoritative NS in one query |
| CT log mining | Harvest CN/SAN fields to recover the full historical subdomain namespace |
| Passive DNS query | Recover subdomains from third-party resolution history with zero target contact |
| DNS brute-force | Resolve a wordlist of guessed labels under the target’s zone; the queries ultimately reach the target’s authoritative nameservers |
| Permutation mutation | Generate naming variants of known hosts to find staging/internal services |
| Reverse PTR sweep | Map owned ASN/IP blocks back to hostnames |
| Subdomain takeover | Claim a deprovisioned cloud resource behind a dangling CNAME |
The progression matters operationally: CT logs, WHOIS/RDAP, and passive DNS generate zero target-side signal, while AXFR, brute-force, and HTTP probing are increasingly noisy and detectable.

12. Defensive Strategies & Detection
CT mining, WHOIS/RDAP, and passive DNS queries occur entirely outside the target’s infrastructure and generate no SIEM-visible events at collection time. Detection therefore concentrates on the active phases.
| Activity | Signal Generated |
|---|---|
| AXFR attempt | Single large TCP/53 transaction to authoritative NS; refusals still log |
| DNS brute-force | High-volume NXDomain responses from one source IP in a short window |
| CT / WHOIS / passive DNS | None — third-party or public registry |
| Active resolution (massdns) | High NXDomain rate; resolver-distributed queries may evade per-source detection |
| HTTP probing (httpx) | Web server access logs; WAF hits on rapid host sweeps |
Sysmon and ETW
Sysmon Event ID 22 (DNSEvent) logs DNS queries made through the Windows DnsQuery_* API calls in dnsapi.dll, supported on Windows 8.1 and above via ETW. This catches recon tooling run from a compromised Windows host, recording QueryName, QueryStatus, and QueryResults. The underlying provider is Microsoft-Windows-DNS-Client (GUID {1C95126E-7EEA-49A9-A3FE-A378B03DDB4D} — verify against current Windows documentation).
Network and Resolver-Side Detection
- Flag source IPs generating more than N NXDomain responses per minute; brute-force tools generate hundreds per second.
- DNS Response Policy Zones (RPZ) and authoritative server logs capture all inbound queries, including refused AXFR attempts.
- Restrict AXFR with allow-transfer (BIND) or transfer ACLs (Windows DNS Server) to whitelisted secondaries only.
- Enable Response Rate Limiting (RRL) to slow brute-force resolution.
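The NXDomain-rate rule in the first bullet amounts to a per-source sliding-window count. Here is a minimal sketch of that logic over parsed DNS log events. The event layout, a (timestamp_seconds, source_ip, rcode) tuple sorted by time, and the default threshold of 200 per 60 seconds are assumptions for illustration; real log formats and sensible thresholds vary by environment.

```python
from collections import defaultdict, deque


def nxdomain_bursts(events, threshold=200, window=60.0):
    """Return source IPs exceeding `threshold` NXDOMAIN answers within `window` seconds.

    `events` is an iterable of (timestamp_seconds, source_ip, rcode) tuples
    sorted by timestamp; that layout is an assumption of this sketch.
    """
    recent = defaultdict(deque)
    flagged = set()
    for ts, src, rcode in events:
        if rcode != "NXDOMAIN":
            continue
        q = recent[src]
        q.append(ts)
        # Drop timestamps that have aged out of the sliding window.
        while q and ts - q[0] > window:
            q.popleft()
        if len(q) > threshold:
            flagged.add(src)
    return flagged


# Synthetic traffic: one noisy brute-forcer, one slow client, one healthy client.
noisy = [(i * 0.1, "198.51.100.9", "NXDOMAIN") for i in range(300)]
quiet = [(i * 10.0, "203.0.113.5", "NXDOMAIN") for i in range(50)]
healthy = [(i * 0.1, "192.0.2.4", "NOERROR") for i in range(500)]
print(nxdomain_bursts(sorted(noisy + quiet + healthy)))
```

As the table above notes, a brute-force spread across many public resolvers shows up as many sources each below the threshold, so a per-zone aggregate count is a useful complement to this per-source view.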
Sigma Rule (DNS brute-force via Sysmon EID 22)
title: DNS Subdomain Brute-Force (High NXDomain Rate)
logsource:
    product: windows
    category: dns_query # maps to Sysmon EventID 22
detection:
    selection:
        QueryStatus: '9003' # DNS_ERROR_RCODE_NAME_ERROR (NXDOMAIN)
    timeframe: 1m
    condition: selection | count() by Computer > 200 # EID 22 is host-side and carries no source IP field
fields:
    - QueryName
    - QueryStatus
    - QueryResults
    - Image
level: medium
CT Log Monitoring (Defensive)
Defenders can flip CT against the attacker: subscribe to Certspotter (SSLMate), crt.sh alerts, or the Facebook CT monitoring API to receive near-real-time alerts on certificates newly issued for your domain tree. Combined with regular self-enumeration to detect unauthorized subdomain creation, dangling-CNAME audits, and accurate published SPF/DMARC/DKIM TXT records, this closes most of the gaps recon exploits.
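The self-enumeration step reduces to a set difference: hostnames seen in CT that are absent from the approved asset inventory. A minimal sketch follows, assuming both inputs are plain lists of hostnames; the function name unexpected_hosts is hypothetical and the sample data is invented.

```python
def unexpected_hosts(ct_names, inventory):
    """Return CT-observed hostnames that are missing from the approved inventory."""
    def norm(name):
        name = name.strip().lower().rstrip(".")
        return name[2:] if name.startswith("*.") else name

    approved = {norm(h) for h in inventory}
    return sorted({norm(n) for n in ct_names} - approved)


# Hypothetical inputs: names from a CT monitor versus the asset register.
ct_seen = ["www.example.com", "*.dev.example.com", "Jenkins-Dev.example.com"]
inventory = ["www.example.com", "dev.example.com"]
print(unexpected_hosts(ct_seen, inventory))
```

Anything this returns is either shadow IT, a forgotten host, or an inventory gap, and each outcome is worth a ticket.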
13. Tools for Subdomain Enumeration Analysis
| Tool | Description | Link |
|---|---|---|
| dig / dnsrecon | Record enumeration and AXFR testing | - |
| crt.sh | Certificate Transparency search and JSON/PostgreSQL API | crt.sh |
| subfinder | Passive multi-source subdomain aggregation | github.com |
| amass | Deep enumeration plus ASN mapping | github.com |
| puredns / massdns | Wildcard-aware high-throughput brute-force | github.com |
| dnsx / httpx | Resolution and live HTTP probing | github.com |
| theHarvester | OSINT email/host/IP gathering | github.com |
| subjack | Subdomain takeover fingerprinting | github.com |
| Censys / Shodan | Internet-wide scan and certificate databases | search.censys.io |
| Certspotter | Defensive CT certificate monitoring | sslmate.com |
14. MITRE ATT&CK Mapping
| Technique | MITRE ID | Detection |
|---|---|---|
| Active Scanning | T1595 | High NXDomain rate; resolver and firewall logs |
| Active Scanning: Scanning IP Blocks | T1595.001 | Reverse PTR sweeps across ASN ranges |
| Gather Victim Network Information | T1590 | Umbrella — DNS/network infrastructure gathering |
| Gather Victim Network Information: DNS | T1590.002 | AXFR attempts logged at authoritative NS |
| Search Open Technical Databases | T1596 | No target-side signal; out-of-band collection |
| Open Technical Databases: DNS/Passive DNS | T1596.001 | Third-party passive DNS — no local visibility |
| Open Technical Databases: WHOIS | T1596.002 | Public registry query — no local visibility |
| Open Technical Databases: Digital Certificates | T1596.003 | CT log mining; no local visibility |
| Open Technical Databases: Scan Databases | T1596.005 | Shodan / Censys mining; verify against live ATT&CK page |
All map to Reconnaissance (TA0043). The defining split: T1595 is active and detectable, while the T1596 family is passive and invisible to the target at collection time.
Summary
- External DNS attack surface is far larger than what an organization advertises, and missed subdomains are missed attack surface.
- DNS records, AXFR misconfigurations, and Certificate Transparency CN/SAN fields each leak distinct, attack-relevant intelligence about hosts and infrastructure.
- Passive sources (CT logs, WHOIS/RDAP, passive DNS) generate zero target-side signal; active brute-force and HTTP probing are detectable through high NXDomain rates and access logs.
- Detect active recon via Sysmon Event ID 22 DNS query logging, resolver NXDomain rate thresholds, and RPZ/AXFR refusal logs.
- Defend by restricting AXFR, removing dangling CNAMEs, rate-limiting resolvers, and monitoring your own domains in CT logs with Certspotter for near-real-time certificate alerts.
References
- MITRE ATT&CK: Gather Victim Network Information: DNS (T1590.002)
- MITRE ATT&CK: Search Open Technical Databases: DNS/Passive DNS (T1596.001)
- MITRE ATT&CK: Search Open Technical Databases: Digital Certificates (T1596.003)
- MITRE ATT&CK: Search Open Technical Databases: Scan Databases (T1596.005)
- RFC 9162: Certificate Transparency Version 2.0 (IETF)
- RFC 6962: Certificate Transparency (IETF RFC Editor)
Passive OSINT: Mapping the Target Without Touching It
Objective: Understand how authorized red teamers and defenders build a complete external attack-surface picture of an organization using only public, third-party data sources — generating zero packets to target systems — and how defenders run the same exercise against themselves to shrink that exposure.
1. What “Passive” Actually Means
Passive reconnaissance never interacts with the target’s own infrastructure. Every byte you read comes from a third-party aggregator — a registrar’s WHOIS server, a certificate transparency log, Shodan’s index, a breach database, a search-engine cache. The target’s web servers, DNS resolvers, and firewalls log nothing, because you never send them anything. That property is the entire point: passive OSINT leaves no forensic trail on the defender’s systems.
This contrasts directly with Active Scanning (T1595), where you resolve hostnames against the target’s authoritative nameservers, fingerprint services, or port-scan a CIDR. Active scanning touches the target and is logged. T1595 is explicitly out of scope here — it is the technique that begins the moment passive recon ends.
Authorization first. Run these techniques only against organizations you are contractually authorized to assess, within a signed Rules of Engagement (RoE) and defined scope. Querying public databases is legal in most jurisdictions, but acting on harvested credentials or accessing exposed services is not — that is active intrusion, governed by your authorization.
All techniques below map to MITRE ATT&CK Tactic: Reconnaissance (TA0043).
2. The OSINT Intelligence Cycle
Unstructured “Googling the company” wastes time and produces noise. Disciplined OSINT follows a repeatable cycle driven by intelligence requirements defined before any tool runs.
| Phase | Activity |
|---|---|
| Planning | Define intelligence requirements: what assets, people, or exposures matter to the engagement |
| Collection | Gather raw data from open sources (CT logs, Shodan, DNS, dorks) |
| Processing | Clean and normalize results, deduplicate, validate sources |
| Analysis | Link normalized data to the target to determine if an exposure is reachable |
| Dissemination | Route findings to stakeholders with remediation steps |
| Continuous Monitoring | Automate the cycle for ongoing exposure enrichment |
Below is the full passive source landscape this tutorial works through.
| Source Category | Tool / Service | What It Yields |
|---|---|---|
| DNS & WHOIS | dig, host, SecurityTrails | Registrar, nameservers, mail providers, subdomains |
| Certificate Transparency | crt.sh, CertSpotter | Every issued cert — forgotten dev/staging subdomains |
| Passive DNS | SecurityTrails, CIRCL pDNS | Historical domain-to-IP relationships over time |
| Scan Databases | Shodan, Censys, ZoomEye | Indexed service banners, open ports, product versions |
| Search Dorking | Google, Bing (GHDB) | Exposed panels, config files, directory listings |
| Code Repositories | GitHub, GitLab | Internal hostnames, tooling, leaked secrets |
| Social / HUMINT | LinkedIn, job boards | Org structure, tech stack, key personnel |
| Breach Databases | HIBP, DeHashed | Exposed employee credentials |
| Web Archives | Wayback Machine | Old endpoints and removed infrastructure |
| BGP / ASN | BGPView, RIPE, ARIN | ASN, owned prefixes, upstream providers |
| Cloud / Shadow IT | GrayhatWarfare | Exposed S3/Azure/GCP buckets |

3. Domain & DNS Reconnaissance
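Start with the apex domain. WHOIS/RDAP reveals registrar, registration dates, and (where not privacy-protected) ownership contacts. DNS record enumeration through public resolvers, rather than directly against the target’s nameservers, exposes mail providers, CDN usage, and SPF/DMARC posture. On a cache miss the resolver still queries the target’s authoritative servers, but that query arrives from the resolver’s address, not yours.
# Enumerate core DNS records via a public resolver (the target sees the resolver, not you)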
for rr in A MX NS TXT SOA; do
echo "== $rr =="
dig +short @1.1.1.1 example.com $rr
done
# MX reveals the mail provider; TXT reveals email-auth posture
dig +short @1.1.1.1 example.com MX # e.g. *.mail.protection.outlook.com -> M365
host -t TXT _dmarc.example.com 1.1.1.1 # p=none vs p=reject tells you spoofability
An MX pointing to mail.protection.outlook.com identifies Microsoft 365; an SPF record ending in ~all instead of -all, or a missing DMARC policy, flags mail-spoofing potential. This covers Domain Properties (T1590.001) and DNS (T1590.002).
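The SPF and DMARC reading described above can be expressed as a small parser. This is a coarse sketch under stated assumptions: mail_posture is a hypothetical helper, it takes the raw SPF and DMARC TXT strings (or None when a record is missing), and its spoofing_risk flag simply encodes the rule of thumb from the paragraph above rather than a full RFC 7208 / RFC 7489 evaluation.

```python
def mail_posture(spf_txt, dmarc_txt):
    """Summarise the SPF 'all' qualifier and DMARC policy from raw TXT strings."""
    spf_all = None
    for term in (spf_txt or "").strip('"').split():
        if term.lower().lstrip("+-~?") == "all":
            # A bare "all" carries the default "+" (pass) qualifier.
            spf_all = term[0] if term[0] in "+-~?" else "+"
    policy = None
    for tag in (dmarc_txt or "").strip('"').split(";"):
        key, _, value = tag.strip().partition("=")
        if key.strip().lower() == "p":
            policy = value.strip().lower()
    # Rule of thumb from the text: anything softer than -all, or a missing
    # or p=none DMARC policy, flags spoofing potential.
    weak = spf_all != "-" or policy in (None, "none")
    return {"spf_all": spf_all, "dmarc_policy": policy, "spoofing_risk": weak}


print(mail_posture('"v=spf1 include:spf.protection.outlook.com ~all"', None))
print(mail_posture("v=spf1 -all", "v=DMARC1; p=reject; rua=mailto:d@example.com"))
```

The same function works for defenders auditing their own domains; the flag is a prompt for review, not a verdict.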
4. Certificate Transparency & Subdomain Enumeration
Under the Certificate Transparency framework (RFC 6962), publicly trusted CAs submit the certificates they issue to append-only, monitorable CT logs, because major browsers reject certificates without log proofs. As a result, nearly every publicly trusted TLS certificate issued for a domain is searchable, including certs for staging, dev, and legacy VPN subdomains defenders forgot existed.
# Pull every CT-logged cert for *.example.com and extract unique hostnames
curl -s 'https://crt.sh/?q=%25.example.com&output=json' \
| jq -r '.[].name_value' \
| sed 's/\*\.//g' \
| sort -u
Aggregators wrap CT and dozens of other passive feeds. Run them in passive mode so they never resolve against the target:
# Passive mode: third-party data sources only, no resolution against the target
amass enum -passive -d example.com -o subs.txt
# Contrast: 'amass enum -active' resolves and brute-forces -> NOT passive, out of scope
Forgotten subdomains like legacy-vpn.example.com or jenkins-dev.example.com are gold: they often run unpatched software outside the patch-management lifecycle. This is Digital Certificates (T1596.003).
5. Internet-Wide Scan Databases: Shodan & Censys
Shodan and Censys continuously crawl the entire IPv4 space and index the banner metadata that devices return on open ports — web servers, routers, databases, ICS/OT, cloud instances. Querying their index is fully passive: they touched the target months ago; you only read the cache.
import shodan
api = shodan.Shodan("YOUR_API_KEY") # querying Shodan's index, not the target
results = api.search('org:"Example Corp"')
print(f"Total results: {results['total']}")
for host in results["matches"][:25]: # respect plan rate limits
    ip = host["ip_str"]
    port = host["port"]
    prod = host.get("product", "")
    print(f"{ip}:{port}\t{prod}")
Pivot with filters: org:, asn:AS64500, port:3389, product:Elasticsearch. Exposed RDP (3389), unauthenticated Elasticsearch (9200), and VPN gateway banners directly enumerate Software (T1592.002), Hardware (T1592.001), and Network Security Appliances (T1590.006). Cross-reference Shodan IPs with your CT-derived subdomains to attach service data to named hosts. This is Scan Databases (T1596.005).
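The cross-referencing step is a join on IP address. The sketch below attaches scan-database rows to hostnames that share an address. It assumes matches use the same ip_str, port, and product keys as the Shodan example above, and that host_ips is a hypothetical hostname-to-IPs mapping built from passive DNS; attach_services is likewise an illustrative name.

```python
from collections import defaultdict


def attach_services(host_ips, matches):
    """Attach scan-database service rows to hostnames that share an IP."""
    by_ip = defaultdict(list)
    for m in matches:
        by_ip[m["ip_str"]].append((m["port"], m.get("product", "")))
    profile = {}
    for host, ips in host_ips.items():
        services = sorted({svc for ip in ips for svc in by_ip.get(ip, [])})
        if services:
            profile[host] = services
    return profile


# Hypothetical data using documentation IP ranges.
matches = [
    {"ip_str": "203.0.113.7", "port": 3389},
    {"ip_str": "203.0.113.7", "port": 443, "product": "nginx"},
    {"ip_str": "198.51.100.2", "port": 22, "product": "OpenSSH"},
]
host_ips = {"rdp.example.com": ["203.0.113.7"], "idle.example.com": ["192.0.2.9"]}
print(attach_services(host_ips, matches))
```

On shared hosting or CDN addresses the join can attribute another tenant's service to the target, so each row should be treated as a lead to verify within scope.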
6. Search Engine Dorking (Google Hacking)
Search operators surface content the target inadvertently exposed and the engine indexed. The Google Hacking Database (GHDB) catalogs thousands of proven patterns. Use these manually in a browser — automated scraping violates ToS and risks blocking.
| Dork | What It Finds |
|---|---|
| site:example.com filetype:pdf | Public documents (then mine metadata) |
| site:example.com intitle:"index of" | Open directory listings |
| site:example.com inurl:admin | Login / admin panels |
| site:example.com filetype:env OR filetype:cfg | Exposed config files |
| site:example.com intext:"sql syntax near" | Error messages leaking internals |
Combining site: with intitle:, inurl:, and filetype: is remarkably effective. Bing serves as a secondary index that sometimes retains content Google dropped. This covers Search Engines (T1593.002) and Search Victim-Owned Websites (T1594).
7. Code Repository Mining
Public repositories leak internal hostnames, tooling, and — too often — live secrets. Search GitHub/GitLab for the org name, email domains, and internal hostnames discovered earlier. For your own repositories, run secret scanners in CI:
# Audit your OWN org's repos for committed secrets (defensive use)
trufflehog github --org=example-corp --only-verified
gitleaks detect --source . --report-format sarif --report-path leaks.sarif
Commit history and job postings reveal the technology stack. This is Code Repositories (T1593.003).
8. Organizational & Personnel Intelligence
LinkedIn is the most complete public database of an organization’s employees — org structure, reporting lines, and the technology stack advertised in job postings. Combine with theHarvester and Hunter.io to derive the email-address convention (first.last@), feeding social-engineering target lists.
Public documents carry metadata that maps directly to usernames and software versions:
import subprocess, json
out = subprocess.run(
    ["exiftool", "-j", "report.pdf"], capture_output=True, text=True
).stdout
meta = json.loads(out)[0]
for f in ("Creator", "Author", "LastModifiedBy", "Producer"):
    if f in meta:
        print(f"{f}: {meta[f]}") # e.g. Author: jsmith / Producer: Acrobat 15.0
This covers Determine Physical Locations (T1591.001), Identify Roles (T1591.004), Employee Names (T1589.003), and Email Addresses (T1589.002).
9. Breach Data & Credential Exposure
Have I Been Pwned, DeHashed, and credential-log collections reveal when employee credentials have been exposed in third-party breaches — frequently before those credentials are weaponized in credential-stuffing.
import requests
domain = "example.com"
headers = {"hibp-api-key": "YOUR_API_KEY", "user-agent": "authorized-recon"}
url = f"https://haveibeenpwned.com/api/v3/breacheddomain/{domain}"
resp = requests.get(url, headers=headers, timeout=30)
resp.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
for alias, breaches in resp.json().items():
    print(alias, "->", ", ".join(breaches))
Report the breach name, date, and exposed data classes (passwords, hashes, MFA seeds). Never reuse harvested credentials outside RoE scope: possession is recon; authentication is intrusion. This is Credentials (T1589.001).
10. BGP, ASN & IP Range Mapping
To bound the network footprint, resolve a known IP to its origin ASN, then enumerate every prefix that ASN announces. This delineates owned IP space, co-location, and cloud presence — without scanning a single host.
# 1. Resolve an IP to its origin ASN (Team Cymru WHOIS)
whois -h whois.cymru.com " -v 203.0.113.10"
# 2. Enumerate prefixes announced by that ASN
whois -h whois.radb.net -- '-i origin AS64500' | grep -E '^route:'
# 3. Or use the BGPView API for prefixes + upstreams
curl -s https://api.bgpview.io/asn/64500/prefixes | jq -r '.data.ipv4_prefixes[].prefix'
This maps Network Trust Dependencies (T1590.003), IP Addresses (T1590.005), and Network Topology (T1590.004).
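Once the announced prefixes are known, each IP discovered elsewhere can be checked against them to decide whether it sits in owned space. The following standard-library sketch returns the most specific matching prefix; owning_prefix is a hypothetical helper and the prefixes shown are documentation ranges standing in for real announcements.

```python
import ipaddress


def owning_prefix(ip, prefixes):
    """Return the most specific announced prefix containing `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    best = None
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix, strict=False)
        # Compare only like with like (IPv4 vs IPv6) before the membership test.
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best.prefixlen:
                best = net
    return str(best) if best else None


# Hypothetical announcements for the placeholder ASN used above.
announced = ["203.0.113.0/24", "203.0.112.0/22", "2001:db8::/32"]
print(owning_prefix("203.0.113.10", announced))
print(owning_prefix("198.51.100.1", announced))
```

IPs that fall outside every announced prefix usually indicate cloud or third-party hosting, which matters for scoping because that space may need separate authorization.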
11. Correlating the Picture: Building a Target Profile
The real power emerges when you correlate intelligence across platforms — joining a CT-derived subdomain to a Shodan banner to an ASN prefix to a breached employee account reveals patterns invisible in any single source. Document findings in a structured, repeatable report.
# OSINT Target Profile — <Engagement ID>
## 1. Scope & Authorization # RoE ref, authorized domains/ASNs, date window
## 2. Domain & DNS # registrar/RDAP, NS, MX, SPF/DKIM/DMARC posture
## 3. Subdomains (CT + passive) # host | source | state (live / parked / dev)
## 4. Exposed Services (Shodan) # ip:port | product | version | notes
## 5. Network Footprint # ASN | prefixes | hosting / cloud providers
## 6. Personnel & Org # key roles | tech stack | SE surface
## 7. Credential Exposure # breach | date | data classes | accounts
## 8. Risk Summary & Recommendations
12. Common Attacker Techniques
| Technique | Description |
|---|---|
| CT-log subdomain harvesting | Mine crt.sh for forgotten dev/staging/VPN hosts |
| Passive DNS pivoting | Use historical IP↔domain data to map shared infrastructure |
| Shodan/Censys banner mining | Identify exposed RDP, databases, and VPN gateways |
| Google dorking | Surface exposed configs, panels, and error leaks |
| Repo secret mining | Recover API keys and hostnames from public commits |
| LinkedIn org mapping | Build personnel and tech-stack intelligence for phishing |
| Breach-data correlation | Match exposed credentials to active employee accounts |
| Document metadata extraction | Derive usernames/software from public PDFs and DOCX |
These feed downstream Initial Access — phishing the personnel map, password-spraying the breach list, or exploiting the exposed service — all of which occur after passive recon and are logged.
13. Defensive Strategies & Detection
Framing: True passive OSINT generates no logs on your systems — the adversary only queries third-party databases. Defense therefore shifts to attack-surface reduction and detecting downstream use of harvested intelligence, not the recon itself.
What Defenders Can Detect (Indirect Signals)
| Signal | Mechanism | Notes |
|---|---|---|
| New certificate issuance | CT monitors: CertSpotter, crt.sh alerts | Subscribe to alerts for new certs on your domains — proactive |
| Shodan/Censys indexing | Not real-time; scanner IP ranges are published | Block known scanner ranges to reduce exposure |
| Downstream credential use | Windows Security EventID 4625 / 4624 / 4648 | HIBP-known creds appearing in auth logs = stuffing/breach |
| Leaked secrets | GitHub secret scanning, truffleHog/gitleaks in CI | Detect before attackers do |
| Active DNS recon | Defender for Identity DnsReconnaissanceSecurityAlert | Catches active DNS recon — not passive external OSINT |
Sigma Sketch — Downstream Credential Use
title: Possible Use of OSINT-Harvested Credentials
logsource:
    product: windows
    service: security
detection:
    selection:
        EventID:
            - 4625 # Failed logon (credential stuffing)
            - 4648 # Explicit credential logon (harvested creds / PtH)
            - 4768 # Kerberos AS-REQ with harvested identity
    timeframe: 10m
    condition: selection | count() by IpAddress > 20 # Security events record the source address as IpAddress
level: high
(Add environment-specific thresholds and allow-lists before deployment.)
Hardening / Attack Surface Reduction
| Mitigation | Description |
|---|---|
| Prune DNS & avoid telling names | Purge stale records; don’t name hosts staging-db.example.com |
| Wildcard certificates | Reduce per-subdomain CT exposure |
| CT + brand monitoring | Alert on new subdomains, certs, and leaked references |
| Email auth hardening | Enforce SPF -all, DKIM, and DMARC p=reject |
| Repo secret scanning | Enable GitHub push protection; run gitleaks in CI |
| Monthly Shodan/Censys review | Audit your own ASN; remediate unexpected ports |
| HIBP Domain Search | Enroll in breach-notification API alerts |
| Metadata stripping | exiftool -all= file.pdf before publishing |
| RDAP/registrar privacy | Reduce WHOIS exposure where legally permissible |
| Policy review | Curb LinkedIn oversharing; manage domain lifecycle |
A defender running this exact exercise against their own organization has a structural advantage: OSINT needs no change-approval window because it touches no production systems, so perimeter assessment carries zero operational impact.
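The email-auth row in the hardening table is easy to self-audit once you have the TXT record strings in hand. The sketch below only parses strings; fetching the records is left to whatever resolver tooling you already use, and the function names and sample records are hypothetical.

```python
from typing import List, Optional


def spf_is_hardfail(txt: str) -> bool:
    """True if the record is SPF and contains a hard-fail `-all` term."""
    terms = txt.strip().lower().split()
    return bool(terms) and terms[0] == "v=spf1" and "-all" in terms[1:]


def dmarc_policy(txt: str) -> Optional[str]:
    """Return the `p=` policy of a DMARC record, or None if it is not DMARC."""
    tags = {}
    for part in txt.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip().lower()
    if tags.get("v") != "dmarc1":
        return None
    return tags.get("p")


def audit_email_auth(spf_txt: str, dmarc_txt: str) -> List[str]:
    """List the gaps relative to the `-all` / `p=reject` target."""
    findings = []
    if not spf_is_hardfail(spf_txt):
        findings.append("SPF is missing a hard-fail -all")
    if dmarc_policy(dmarc_txt) != "reject":
        findings.append("DMARC policy is not p=reject")
    return findings


# Hypothetical records for illustration.
print(audit_email_auth(
    "v=spf1 include:_spf.example.com ~all",
    "v=DMARC1; p=none; rua=mailto:dmarc@example.com",
))
```

An empty list means both records meet the target in the table; it says nothing about DKIM, which needs a selector lookup and is out of scope for this sketch.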

14. Tools for OSINT Analysis
| Tool | Description | Link |
|---|---|---|
| Amass | Passive subdomain enumeration & mapping | owasp.org |
| subfinder | Fast passive subdomain discovery | projectdiscovery.io |
| theHarvester | Emails, names, subdomains from public sources | github.com |
| Shodan / Censys | Internet-wide scan databases | shodan.io |
| Recon-ng / SpiderFoot | Modular OSINT automation frameworks | spiderfoot.net |
| crt.sh | Certificate transparency search | crt.sh |
| SecurityTrails | Passive DNS & historical records | securitytrails.com |
| Have I Been Pwned | Breach & credential exposure | haveibeenpwned.com |
| truffleHog / gitleaks | Repo secret scanning | github.com |
| exiftool | Document metadata extraction | exiftool.org |
| BGPView | ASN & prefix enumeration | bgpview.io |
| Wayback Machine | Historical web snapshots | archive.org |
15. MITRE ATT&CK Mapping
All techniques fall under Reconnaissance (TA0043). T1595 Active Scanning is out of scope.
| Technique | MITRE ID | Detection / Reduction |
|---|---|---|
| Gather Victim Network Information | T1590 (.001–.006) | Prune DNS; reduce WHOIS/CT exposure |
| Gather Victim Org Information | T1591 (.001–.004) | LinkedIn-oversharing policy |
| Gather Victim Host Information | T1592 (.001–.004) | Monthly Shodan/Censys self-audit |
| Search Open Websites/Domains | T1593 (.001–.003) | Repo secret scanning; dork your own site |
| Search Victim-Owned Websites | T1594 | Remove exposed configs/listings |
| Search Open Technical Databases | T1596 (.001–.005) | CT monitoring; passive-DNS hygiene |
| Gather Victim Identity Information | T1589 (.001–.003) | HIBP alerts; auth monitoring (4625/4648) |
Summary
- Passive OSINT maps an organization’s entire external attack surface using only third-party data, generating zero packets — and therefore zero logs — on the target.
- The disciplined intelligence cycle (plan → collect → process → analyze → disseminate → monitor) turns scattered searches into a correlated target profile across DNS, CT logs, scan databases, repos, personnel, and breach data.
- Correlation is the multiplier: joining a forgotten subdomain to a Shodan banner to a breached credential reveals reachable exposure invisible in any single source.
- Because passive recon is undetectable on your systems, defense means attack-surface reduction (CT monitoring, DMARC p=reject, secret scanning, metadata stripping) plus detecting downstream credential use via Windows Event IDs 4625/4648.
- All techniques map to ATT&CK Reconnaissance (TA0043); the boundary is T1595 Active Scanning, which begins the moment you touch the target directly.
References
- Reconnaissance, Tactic TA0043 – Enterprise | MITRE ATT&CK®
- Gather Victim Identity Information, Technique T1589 – Enterprise | MITRE ATT&CK®
- Search Open Technical Databases, Technique T1596 – Enterprise | MITRE ATT&CK®
- Search Open Websites/Domains: Search Engines, Sub-technique T1593.002 – Enterprise | MITRE ATT&CK®
- Search Open Technical Databases: DNS/Passive DNS, Sub-technique T1596.001 – Enterprise | MITRE ATT&CK®
- NIST SP 800-115: Technical Guide to Information Security Testing and Assessment