Skip to content

fix: split comma-separated entries from list input#950

Open
hriszc wants to merge 3 commits intoprojectdiscovery:mainfrom
hriszc:codex/fix-list-comma-input
Open

fix: split comma-separated entries from list input#950
hriszc wants to merge 3 commits intoprojectdiscovery:mainfrom
hriszc:codex/fix-list-comma-input

Conversation

@hriszc
Copy link

@hriszc hriszc commented Mar 9, 2026

/claim #859

Summary

Normalize comma-separated entries from -l file input and stdin so they behave like -u, which already supports comma-separated values.

  • splits comma-separated entries for file-backed input lines
  • applies the same normalization to stdin
  • trims whitespace and ignores empty entries
  • adds regression coverage for single, comma-separated, and sparse input lines

Validation

Ran:

go test ./internal/runner -run 'TestSplitInputEntries'

Result: passed.

Summary by CodeRabbit

  • New Features

    • Input processing now supports comma-separated entries across all input sources, automatically splitting and normalizing each item for processing.
  • Bug Fixes / Improvements

    • Improved input-reading robustness by capping scanner buffers and improving error handling when reading large or streamed inputs.
  • Tests

    • Added tests covering comma-separated splitting and normalization for file and stdin input scenarios.

@coderabbitai
Copy link

coderabbitai bot commented Mar 9, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c5915459-622c-46ae-a750-d9431f2e815d

📥 Commits

Reviewing files that changed from the base of the PR and between 615c0b0 and a3284eb.

📒 Files selected for processing (2)
  • internal/runner/runner.go
  • internal/runner/runner_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/runner/runner_test.go

Walkthrough

Refactors input handling to split comma-separated entries before processing. Adds an unexported helper splitInputEntries(text string) []string, updates scanner usage to handle multiple entries per line (including stdin and file reads), and propagates scanner errors. Tests for parsing and queuing were added.

Changes

Cohort / File(s) Summary
Input parsing & processing
internal/runner/runner.go
Added splitInputEntries(text string) []string; set scanner.Buffer with maxInputScanTokenSize; read lines from files/STDIN and iterate over split entries, calling processInputItem() for each; propagate scanner.Err() when present.
Tests for input behavior
internal/runner/runner_test.go
Added TestSplitInputEntries, TestNormalizeAndQueueInputsFromFile, and TestNormalizeAndQueueInputsFromStdin to verify splitting, trimming, empty-entry filtering, and queuing/normalization from file and stdin.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Commas hop and tumble, split apart with glee,
Each tiny entry bounces, nudged into the sea,
Trimmed and tidy, lined in queue, they march to meet the run,
Quiet helper slices, till the job is done—
A rabbit cheers the code, beneath the morning sun.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix: split comma-separated entries from list input' directly and specifically describes the main change: adding support for splitting comma-separated entries in list input files and stdin to match the behavior of the URL input parameter.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
internal/runner/runner_test.go (1)

41-70: Add an end-to-end regression test for -l and stdin paths.

These cases only validate splitInputEntries(). They won't catch regressions in normalizeAndQueueInputs, which is where the new file/stdin behavior actually lives. A temp-file case for options.InputList plus a stdin-backed case would give coverage on the paths this PR changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/runner/runner_test.go` around lines 41 - 70, Add an end-to-end
regression test that exercises the actual input-queueing paths (not just
splitInputEntries): create a test (e.g., TestNormalizeAndQueueInputs or
TestInputPaths) that writes a temp file and sets options.InputList to point at
it to validate the "-l" file path, and add a stdin-backed case (use an io.Reader
or os.Stdin replacement) to validate the stdin path; call
normalizeAndQueueInputs with those options and assert the queued/normalized
results match expectations, ensuring the new file/stdin behavior from
normalizeAndQueueInputs is covered instead of only testing splitInputEntries.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/runner/runner.go`:
- Around line 441-456: The scanner usage in runner.go (the loops that call
bufio.NewScanner for files and os.Stdin while iterating splitInputEntries and
calling r.processInputItem) must be made robust: after creating each scanner,
call scanner.Buffer with a larger max token size (e.g., several megabytes) to
allow long comma-separated lines, and after each scanning loop check
scanner.Err() and return or propagate the error instead of dropping input
silently; apply these changes to both the file-reading loop and the hasStdin
loop so long lines from splitInputEntries are not truncated and errors are
surfaced.

---

Nitpick comments:
In `@internal/runner/runner_test.go`:
- Around line 41-70: Add an end-to-end regression test that exercises the actual
input-queueing paths (not just splitInputEntries): create a test (e.g.,
TestNormalizeAndQueueInputs or TestInputPaths) that writes a temp file and sets
options.InputList to point at it to validate the "-l" file path, and add a
stdin-backed case (use an io.Reader or os.Stdin replacement) to validate the
stdin path; call normalizeAndQueueInputs with those options and assert the
queued/normalized results match expectations, ensuring the new file/stdin
behavior from normalizeAndQueueInputs is covered instead of only testing
splitInputEntries.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c2a1e855-b811-456b-bba3-f92e1074301e

📥 Commits

Reviewing files that changed from the base of the PR and between d13b67f and 615c0b0.

📒 Files selected for processing (2)
  • internal/runner/runner.go
  • internal/runner/runner_test.go

@neo-by-projectdiscovery-dev
Copy link

neo-by-projectdiscovery-dev bot commented Mar 9, 2026

Neo - PR Security Review

No security issues found

Highlights

  • Added TestSplitInputEntries to verify single, comma-separated, and whitespace handling in input parsing
  • Added TestNormalizeAndQueueInputsFromFile to validate file-based input with comma separation
  • Added TestNormalizeAndQueueInputsFromStdin to validate stdin input with comma separation
  • Test additions use proper cleanup patterns with defer and temporary file handling
Hardening Notes
  • The previous hardening suggestions remain valid: consider adding a maximum entry limit per line (e.g., 1000 comma-separated values) in splitInputEntries() at line 469 to prevent resource exhaustion
  • Add input length validation before processing in processInputItem() to reject entries exceeding reasonable hostname/URL length limits (e.g., 2048 characters)

Comment @pdneo help for available commands. · Open in Neo

@hriszc
Copy link
Author

hriszc commented Mar 9, 2026

Addressed the review feedback:

  • added scanner.Buffer(...) and explicit scanner.Err() handling for both file and stdin paths
  • added end-to-end regression coverage for normalizeAndQueueInputs with both -l file input and stdin input
  • reran go test ./internal/runner -run TestSplitInputEntries|TestNormalizeAndQueueInputsFromFile|TestNormalizeAndQueueInputsFromStdin

The PR head has been updated accordingly.

@hriszc
Copy link
Author

hriszc commented Mar 9, 2026

Follow-up: scanner buffering/error handling feedback is already incorporated, checks are green, and the file/stdin regression tests are in place. Ready for maintainer review when convenient.

@hriszc
Copy link
Author

hriszc commented Mar 10, 2026

Checks are green and the scanner/test feedback is already incorporated. This is ready for maintainer review when convenient.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant