feat(wildcard): add auto-wildcard flag (#924 2/2)#966
feat(wildcard): add auto-wildcard flag (#924 2/2)#966Mzack9999 merged 3 commits intoprojectdiscovery:devfrom
auto-wildcard flag (#924 2/2)#966Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds an optional automatic wildcard-domain detection flag ( Changes
Sequence DiagramsequenceDiagram
participant Client
participant Runner
participant Storage
participant WildcardWorker
participant Resolver
Client->>Runner: start with -auto-wildcard
Runner->>Runner: init A/AAAA wildcard resolvers & root tracking
Runner->>WildcardWorker: start worker pool
loop for each resolved host
Runner->>Storage: load stored DNS data for host
Storage-->>Runner: DNSAnswers (A/AAAA/CNAME)
Runner->>Runner: shouldAutoFilterHost(host, DNSAnswers)
alt cached answers indicate wildcard match
Runner->>Runner: mark host as wildcard (skip output)
else
Runner->>WildcardWorker: enqueue job (host + root)
end
end
loop worker processing
WildcardWorker->>Storage: load host cached DNS data
WildcardWorker->>Resolver: probe wildcard root (A or AAAA)
Resolver-->>WildcardWorker: probe answers
WildcardWorker->>WildcardWorker: compare answers => isWildcard?
alt matches
WildcardWorker->>Runner: add to wildcard set
else
WildcardWorker->>Runner: allow host
end
end
Runner->>Client: output filtered results (exclude marked wildcards)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Neo - PR Security ReviewNo security issues found Highlights
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (1)
internal/runner/runner.go (1)
169-172: Avoid building the wildcard probe stack for runs that never use it.This unconditionally allocates two extra
dnsx.New(...)clients and resolver stacks even when neither-wdnor-auto-wildcardis enabled. Guard this behindoptions.hasWildcardFiltering()so the default path stays cheap.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runner/runner.go` around lines 169 - 172, The code always calls newWildcardResolvers(...) which allocates dnsx.New clients and resolver stacks (producing wildcardAResolver, wildcardAAAAResolver, wildcardAAdapter, wildcardAAAAAdapter, wildcardDomains, wildcardDomainSet) even when wildcard filtering is not used; wrap the call to newWildcardResolvers/dnsx.New behind a conditional check using options.hasWildcardFiltering() so the wildcard probe stack is only created when options.hasWildcardFiltering() returns true, and ensure the variables remain nil/zero-valued otherwise and the error handling/return path unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/runner/options.go`:
- Around line 206-209: Replace the current post-normalization emptiness check
with a real domain validation: after calling
normalizeWildcardDomain(options.WildcardDomain) validate options.WildcardDomain
(and the identical block around the second occurrence) using a
domain/wildcard-domain validator that rejects spaces, consecutive dots, invalid
wildcard placements (e.g., mid-label '*' like "sub.*.example.com"), and other
illegal characters; if validation fails, call gologger.Fatal().Msgf("invalid
wildcard-domain value") (or include a clear message) so invalid inputs like "foo
bar", "example..com" or "sub.*.example.com" are rejected up-front. Ensure you
reference and reuse normalizeWildcardDomain, options.WildcardDomain and
rawWildcardDomain in the check to keep behavior consistent.
In `@internal/runner/runner.go`:
- Around line 63-68: Add invariant checks in New(options *Options) to reject
unsupported option combinations rather than relying on CLI parsing: validate
that if options.WildcardDomain != "" it cannot be combined with
options.AutoWildcard or options.Stream (or any mode where runStream() is used),
and return a descriptive error if violated. Update New to call
normalizeWildcardDomain and then enforce these invariants (referencing New,
Options, AutoWildcard, WildcardDomain, Stream) so callers get immediate errors;
note worker() and runStream() differences and ensure filterWildcardHosts()
expectations are enforced by rejecting combinations that would leave runStream()
without filtering.
In `@internal/runner/wildcard.go`:
- Around line 130-145: startWildcardWorkers is currently called with
len(listIPs) which limits concurrency to IP buckets instead of the number of
host jobs; compute the actual number of jobs (count unique hosts that pass the
WildcardThreshold from ipDomain and would be pushed to wildcardworkerchan) and
call startWildcardWorkers(totalJobs) before enqueuing, e.g. iterate ipDomain to
count qualifying unique hosts (use the same seen logic) and then push
wildcardJob entries to wildcardworkerchan; apply the same fix at the other
occurrence referenced (lines 179-183) so worker count matches queued host jobs
rather than IP count.
In `@README.md`:
- Around line 131-132: Update the README wildcard documentation so it is
consistent: clarify that the -auto-wildcard flag automatically detects wildcard
domains and is mutually exclusive with -wd / -wildcard-domain, and remove or
correct any wording that claims -wd is mandatory for wildcard elimination;
update the configuration list and the earlier section that currently states "wd
is mandatory" to instead describe the mutual-exclusion behavior and that -wd
(aka -wildcard-domain) is only required when not using -auto-wildcard and that
-wd causes other flags to be ignored and forces json output.
---
Nitpick comments:
In `@internal/runner/runner.go`:
- Around line 169-172: The code always calls newWildcardResolvers(...) which
allocates dnsx.New clients and resolver stacks (producing wildcardAResolver,
wildcardAAAAResolver, wildcardAAdapter, wildcardAAAAAdapter, wildcardDomains,
wildcardDomainSet) even when wildcard filtering is not used; wrap the call to
newWildcardResolvers/dnsx.New behind a conditional check using
options.hasWildcardFiltering() so the wildcard probe stack is only created when
options.hasWildcardFiltering() returns true, and ensure the variables remain
nil/zero-valued otherwise and the error handling/return path unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 129fb998-00bb-4fec-8d05-299e4a4021d7
📒 Files selected for processing (5)
README.mdinternal/runner/options.gointernal/runner/runner.gointernal/runner/wildcard.gointernal/runner/wildcard_test.go
52e823f to
901e644
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
internal/runner/runner.go (1)
891-903: Use the library’sUnmarshal()path for stored DNS data.
storeDNSData()writes theretryabledns.DNSData.JSON()form, andlookupAndOutput()already reads that payload back viadnsdata.Unmarshal(data). Decoding the same bytes withjson.Unmarshalintroduces a second contract that can drift ifretryablednsadds custom unmarshal logic.Suggested change
import ( "bufio" "context" - "encoding/json" "fmt" "io" "os" "strings" @@ func (r *Runner) loadStoredDNSData(host string) (*retryabledns.DNSData, error) { data, ok := r.hm.Get(host) if !ok { return nil, errors.New("dns data not found") } var dnsdata retryabledns.DNSData - if err := json.Unmarshal(data, &dnsdata); err != nil { + if err := dnsdata.Unmarshal(data); err != nil { return nil, err } return &dnsdata, nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runner/runner.go` around lines 891 - 903, The loadStoredDNSData function currently uses json.Unmarshal to decode stored bytes into retryabledns.DNSData, but storeDNSData() writes the DNSData.JSON() form and lookupAndOutput() uses dnsdata.Unmarshal(data); change loadStoredDNSData to create a retryabledns.DNSData value and call its Unmarshal(data) method (e.g., var dnsdata retryabledns.DNSData; if err := dnsdata.Unmarshal(data); err != nil { return nil, err }) so decoding follows the library's custom unmarshal logic.internal/runner/runner_test.go (1)
165-183: Add the invalid wildcard-domain constructor case here too.The new validation in
New()also rejects inputs that normalize to empty, such as"*.", but the added assertions only cover the mutual-exclusion and stream-mode branches. A smallNew(&Options{WildcardDomain: "*."})case would pin that public behavior at the constructor boundary instead of only exercising the helper.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/runner/runner_test.go` around lines 165 - 183, Add a test case that calls New(&Options{WildcardDomain: "*."}) and asserts the returned runner is nil and that an error is returned; place it alongside the other wildcard-related tests (e.g., in TestNewRejectsWildcardFilteringInStreamMode or as its own TestNewRejectsInvalidWildcardDomain) and assert the error message exactly matches the public constructor error string produced by New (use the same literal error message the helper returns) so the constructor behavior for inputs that normalize to empty (like "*.") is pinned at the New() boundary.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/runner/runner.go`:
- Around line 691-693: The stats undercount happens because prepareInput()
assumes total queries = numHosts * len(QuestionTypes) while
shouldAutoFilterHost() (invoked when r.options.AutoWildcard is true) may issue
extra A/AAAA (or other) probes; update the implementation so counters reflect
actual probes rather than the static assumption: either move/increment the
request counters into the wildcard filtering path (e.g., inside
shouldAutoFilterHost() or the wildcard adapter functions) whenever they issue
internal probes, or change prepareInput() to compute totals by calling a small
helper that accounts for additional probes triggered by shouldAutoFilterHost()
(or add a clear comment documenting stats are approximate when AutoWildcard is
enabled). Ensure you update the same stats variables that prepareInput()
currently increments so reported totals and RPS match real resolver traffic.
In `@README.md`:
- Around line 131-132: The README claim that the manual -wd/--wildcard-domain
flag only supports JSON output is inaccurate; update the README line for "-wd,
-wildcard-domain" to remove "only json output is supported" and instead note
that manual wildcard filtering (e.g., Options{WildcardDomain: \"example.com\"}
as exercised in internal/runner/wildcard_test.go) works with plain host output
as well, and that JSON output is recommended but not required; keep the
mutual-exclusion note with -auto-wildcard.
---
Nitpick comments:
In `@internal/runner/runner_test.go`:
- Around line 165-183: Add a test case that calls New(&Options{WildcardDomain:
"*."}) and asserts the returned runner is nil and that an error is returned;
place it alongside the other wildcard-related tests (e.g., in
TestNewRejectsWildcardFilteringInStreamMode or as its own
TestNewRejectsInvalidWildcardDomain) and assert the error message exactly
matches the public constructor error string produced by New (use the same
literal error message the helper returns) so the constructor behavior for inputs
that normalize to empty (like "*.") is pinned at the New() boundary.
In `@internal/runner/runner.go`:
- Around line 891-903: The loadStoredDNSData function currently uses
json.Unmarshal to decode stored bytes into retryabledns.DNSData, but
storeDNSData() writes the DNSData.JSON() form and lookupAndOutput() uses
dnsdata.Unmarshal(data); change loadStoredDNSData to create a
retryabledns.DNSData value and call its Unmarshal(data) method (e.g., var
dnsdata retryabledns.DNSData; if err := dnsdata.Unmarshal(data); err != nil {
return nil, err }) so decoding follows the library's custom unmarshal logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0dd5c5cd-39d0-459d-8286-64831025b932
📒 Files selected for processing (6)
README.mdinternal/runner/options.gointernal/runner/runner.gointernal/runner/runner_test.gointernal/runner/wildcard.gointernal/runner/wildcard_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/runner/options.go
- internal/runner/wildcard.go
901e644 to
d0b8784
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
README.md (1)
131-132:⚠️ Potential issue | 🟡 MinorManual
-wdis still documented as JSON-only.
lookupAndOutput()falls back to plain host output when-jis off, so this help text is still too strong. It also conflicts with the Notes section below, which already treats JSON as recommended rather than required.📝 Suggested wording
- -wd, -wildcard-domain string domain name for manual wildcard filtering (mutually exclusive with -auto-wildcard; other flags will be ignored - only json output is supported) + -wd, -wildcard-domain string domain name for manual wildcard filtering (mutually exclusive with -auto-wildcard; plain host output still works, but JSON is recommended when you want record details preserved)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 131 - 132, Update the README help text for the manual wildcard flag to remove the claim that "-wd, -wildcard-domain" is JSON-only and instead indicate it is mutually exclusive with "-auto-wildcard" but supports normal output mode as well; reference the behavior in lookupAndOutput() that falls back to plain host output when "-j" is off and ensure the line for "-wd, -wildcard-domain" no longer states "(only json output is supported)" while keeping the mutual-exclusion note with "-auto-wildcard".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/runner/wildcard.go`:
- Around line 207-235: shouldAutoFilterHost() currently always performs both A
and AAAA wildcard fallbacks which causes results to be suppressed for runs
limited to a single address family; update the logic to only probe the other
family when that family is actually active in the current query mode. Constrain
the aAnswers fallback (wildcardAAdapter.lookup / wildcardAResolver.LookupHost)
to run only if A-family queries are enabled, and likewise run the aaaaAnswers
fallback (wildcardAAAAAdapter.lookup / wildcardAAAAResolver.LookupHost) only
when AAAA-family queries are enabled; keep the existing behavior and checks for
when answers are already present, but skip the lookup+resolver steps for the
inactive family so an active-only run (e.g., -a or -aaaa) cannot be vetoed by
the inactive-family wildcard match.
---
Duplicate comments:
In `@README.md`:
- Around line 131-132: Update the README help text for the manual wildcard flag
to remove the claim that "-wd, -wildcard-domain" is JSON-only and instead
indicate it is mutually exclusive with "-auto-wildcard" but supports normal
output mode as well; reference the behavior in lookupAndOutput() that falls back
to plain host output when "-j" is off and ensure the line for "-wd,
-wildcard-domain" no longer states "(only json output is supported)" while
keeping the mutual-exclusion note with "-auto-wildcard".
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 11054591-0219-440b-84ee-16a32bbe6ebe
📒 Files selected for processing (6)
README.mdinternal/runner/options.gointernal/runner/runner.gointernal/runner/runner_test.gointernal/runner/wildcard.gointernal/runner/wildcard_test.go
✅ Files skipped from review due to trivial changes (1)
- internal/runner/wildcard_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/runner/options.go
| aAnswers := dnsDataAAnswers(dnsdata) | ||
| if len(aAnswers) == 0 { | ||
| resolved, err := r.wildcardAAdapter.lookup(host) | ||
| if err != nil { | ||
| gologger.Debug().Msgf("failed wildcard A lookup for %s: %v\n", host, err) | ||
| } else { | ||
| aAnswers = resolved | ||
| } | ||
| } | ||
| if len(aAnswers) > 0 { | ||
| if matched, _ := r.wildcardAResolver.LookupHost(host, aAnswers); matched { | ||
| return true | ||
| } | ||
| } | ||
|
|
||
| // check if original ip are among wildcards | ||
| for a := range orig { | ||
| if _, ok := wildcards[a]; ok { | ||
| aaaaAnswers := dnsDataAAAAAnswers(dnsdata) | ||
| if len(aaaaAnswers) == 0 { | ||
| resolved, err := r.wildcardAAAAAdapter.lookup(host) | ||
| if err != nil { | ||
| gologger.Debug().Msgf("failed wildcard AAAA lookup for %s: %v\n", host, err) | ||
| } else { | ||
| aaaaAnswers = resolved | ||
| } | ||
| } | ||
| if len(aaaaAnswers) > 0 { | ||
| if matched, _ := r.wildcardAAAAResolver.LookupHost(host, aaaaAnswers); matched { | ||
| return true | ||
| } | ||
| } |
There was a problem hiding this comment.
Don't let an unselected address family suppress visible results.
shouldAutoFilterHost() always falls back to both A and AAAA wildcard checks. In an -a run, a host with an explicit A record but wildcard-only AAAA will be dropped once the fallback AAAA probe matches, even though AAAA would never be emitted; the inverse applies to -aaaa. Gate the second family on the active query mode, and only probe both families for modes that actually need backing-address checks. A mixed-family regression for -a + wildcard AAAA and -aaaa + wildcard A would lock this down.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/runner/wildcard.go` around lines 207 - 235, shouldAutoFilterHost()
currently always performs both A and AAAA wildcard fallbacks which causes
results to be suppressed for runs limited to a single address family; update the
logic to only probe the other family when that family is actually active in the
current query mode. Constrain the aAnswers fallback (wildcardAAdapter.lookup /
wildcardAResolver.LookupHost) to run only if A-family queries are enabled, and
likewise run the aaaaAnswers fallback (wildcardAAAAAdapter.lookup /
wildcardAAAAResolver.LookupHost) only when AAAA-family queries are enabled; keep
the existing behavior and checks for when answers are already present, but skip
the lookup+resolver steps for the inactive family so an active-only run (e.g.,
-a or -aaaa) cannot be vetoed by the inactive-family wildcard match.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
README.md (1)
474-477: Minor style fix: use "cannot" instead of "can not".Per standard English convention and static analysis, "cannot" is preferred over "can not" unless emphasizing "not".
📝 Suggested fix
- `-auto-wildcard` automatically detects wildcard domains across multiple registrable roots in a single run. -- Domain name (`-wd`) is required only for manual wildcard filtering and can not be used together with `-auto-wildcard`. +- Domain name (`-wd`) is required only for manual wildcard filtering and cannot be used together with `-auto-wildcard`. - When using manual wildcard filtering with `-wd`, other DNS record flags are ignored and JSON output is recommended.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` around lines 474 - 477, Replace the nonstandard "can not" with "cannot" in the README bullet that describes the Domain name (`-wd`) flag so the line reads: "Domain name (`-wd`) is required only for manual wildcard filtering and cannot be used together with `-auto-wildcard`." Update the text that references flags `-auto-wildcard`, `-wd`, `l`, and `w` accordingly to keep spelling consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/runner/runner.go`:
- Around line 704-706: shouldAutoFilterHost currently probes both A and AAAA
unconditionally; change it to respect the runner options by only performing the
A-family probe when r.options.A is true and only performing the AAAA-family
probe when r.options.AAAA is true (i.e., gate the calls to the A/AAAA lookup
helpers inside shouldAutoFilterHost on r.options.A and r.options.AAAA
respectively) so wildcard filtering does not suppress results from the family
the user explicitly requested.
---
Nitpick comments:
In `@README.md`:
- Around line 474-477: Replace the nonstandard "can not" with "cannot" in the
README bullet that describes the Domain name (`-wd`) flag so the line reads:
"Domain name (`-wd`) is required only for manual wildcard filtering and cannot
be used together with `-auto-wildcard`." Update the text that references flags
`-auto-wildcard`, `-wd`, `l`, and `w` accordingly to keep spelling consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bf6a0ac3-1ebe-43c0-8346-58288535a75f
📒 Files selected for processing (6)
README.mdinternal/runner/options.gointernal/runner/runner.gointernal/runner/runner_test.gointernal/runner/wildcard.gointernal/runner/wildcard_test.go
✅ Files skipped from review due to trivial changes (2)
- internal/runner/runner_test.go
- internal/runner/wildcard_test.go
| if r.options.AutoWildcard && r.shouldAutoFilterHost(domain, dnsData.DNSData) { | ||
| continue | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check if shouldAutoFilterHost gates A/AAAA probes based on options.A / options.AAAA
rg -n -A 20 'func.*shouldAutoFilterHost' internal/runner/wildcard.go | head -60Repository: projectdiscovery/dnsx
Length of output: 711
🏁 Script executed:
#!/bin/bash
# Get the complete shouldAutoFilterHost function
rg -n -A 50 'func.*shouldAutoFilterHost' internal/runner/wildcard.goRepository: projectdiscovery/dnsx
Length of output: 1201
🏁 Script executed:
#!/bin/bash
# Check the call site at lines 704-706
sed -n '700,710p' internal/runner/runner.go
# Check if options.A and options.AAAA exist
rg -n 'options\.(A|AAAA)\b' internal/runner/runner.go | head -20Repository: projectdiscovery/dnsx
Length of output: 506
Verify that shouldAutoFilterHost() respects the active query mode before probing both A and AAAA families.
The function currently probes both A and AAAA records unconditionally (lines 213-241 in wildcard.go), regardless of whether the user requested only A records (-a) or only AAAA records (-aaaa). This causes legitimate results to be suppressed when a host has valid records in the queried family but wildcard records in the other family. For example, if a user queries for A records only and a host has both a valid A record and a wildcard AAAA record, the function will filter out the host based on the AAAA wildcard match alone.
Gate the A and AAAA probes by checking r.options.A and r.options.AAAA before performing the respective lookups to ensure filtering respects the user's query intent.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/runner/runner.go` around lines 704 - 706, shouldAutoFilterHost
currently probes both A and AAAA unconditionally; change it to respect the
runner options by only performing the A-family probe when r.options.A is true
and only performing the AAAA-family probe when r.options.AAAA is true (i.e.,
gate the calls to the A/AAAA lookup helpers inside shouldAutoFilterHost on
r.options.A and r.options.AAAA respectively) so wildcard filtering does not
suppress results from the family the user explicitly requested.
|
Closes #924 |
Proposed changes
Fixes #924
/claim #924
--auto-wildcardmode that automatically detects and filters wildcard-backed results across multiple registrable domains in a single run-wdflow unchanged, while making--auto-wildcardmutually exclusive with itshufflednsintogithub.com/projectdiscovery/utils/dns/wildcardinstead of introducing a dnsx-only implementation (feat: add shared wildcard detector package (dnsx#924 1/2) utils#732)dnsxoutput behavior for the selected query mode, while using internalA/AAAAprobes for wildcard decisions when needed-wdinput and add regression coverage for multi-root filtering, threshold independence, AAAA-only filtering, CNAME-visible output, and wildcard-domain normalizationProof
go test ./...Manual check:
--auto-wildcardmanual run returns 4 hosts:Checklist
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests