Feat/packet inspector query tools by dipeshwalia · Pull Request #30 · good-tools/wiregasm

dipeshwalia · 2026-05-12T16:45:25Z

Summary by CodeRabbit

New Features
- Extract specific header-field values with optional filtering and row limits.
- List present fields across frames with counts and metadata.
- Retrieve a recursive protocol hierarchy with frame/byte aggregates.
Improvements
- Stronger return-code checks and stricter argument handling in APIs.
- JS/TS wrappers now normalize responses into native arrays and typed shapes.
Tests
- Added tests covering extraction, presence listing, protocol hierarchy, and preference behavior.

Avoid duplicate decode-as default reapplication across module traversal and keep PREF_DECODE_AS_RANGE updates merged with protocol defaults to prevent runaway memory growth on Node 20.

…-llm-query-tools

coderabbitai · 2026-05-12T16:45:40Z

Warning

Rate limit exceeded

@dipeshwalia has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 26 minutes and 8 seconds before requesting another review.

You’ve run out of usage credits. Purchase more in the billing tab.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9641629f-2f67-4fed-9b53-ff454e95d3a9

📥 Commits

Reviewing files that changed from the base of the PR and between 4e718c5 and 97e1c91.

📒 Files selected for processing (1)

lib/Makefile

Walkthrough

This PR adds three new data-extraction APIs to Wiregasm: extractFields to retrieve specific field values from dissected packets per frame, listPresentFields to enumerate available fields with occurrence counts, and protocolHierarchy to aggregate protocol-level statistics. The implementation spans C++, Emscripten bindings, and TypeScript, with fixes to preference-range merging and comprehensive test coverage.

Changes

Extract Fields, List Present Fields & Protocol Hierarchy

Layer / File(s)	Summary
Data structures & type contracts `src/types.ts`, `lib/wiregasm/wiregasm.h`	`FieldValue`, `ExtractedRow`, `PresentField`, `ProtocolNode`, and response wrapper types are defined across TypeScript and C++. `DissectSession` gains three new method signatures, and `WiregasmLib` adds `VectorString` constructor for building string vectors.
C++ core processing logic `lib/wiregasm/lib.cpp`, `lib/wiregasm/lib.h`	`wg_session_process_extract_fields`, `wg_session_process_present_fields`, and `wg_session_process_protocol_hierarchy` walk proto_tree structures, accumulate results with optional filtering and truncation, reset dissector state per frame, and return structured responses. Recursive helpers extract field values, enumerate present fields by abbreviation, and aggregate protocol nodes.
Bindings, session wrappers & preference fixes `lib/wiregasm/bindings.cpp`, `lib/wiregasm/wiregasm.cpp`	Emscripten bindings register value_object types and typed vectors for JavaScript serialization. `DissectSession` wrapper methods forward calls to C++ processing functions. Decode-as preference handling is fixed to merge new values against the default range instead of the current range, and module callback logic is simplified to skip submodule traversal.
JavaScript wrapper API `src/index.ts`	Wiregasm class gains `extract_fields`, `list_present_fields`, and `protocol_hierarchy` methods that normalize C++ responses into JavaScript arrays and objects. Type imports are expanded, preference code comparisons are strengthened to strict checks, and `tap`/`iograph` argument handling is improved.
Test coverage `src/index.test.ts`	New tests validate `extract_fields` per-frame output, invalid-filter error handling, and truncation behavior. `list_present_fields` and `protocol_hierarchy` results are verified, including recursive protocol node presence. Preference tests are extended to confirm re-setting and default value preservation.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title refers to 'packet inspector query tools', which aligns with the PR's addition of three new query/extraction APIs (extractFields, listPresentFields, protocolHierarchy) that enable querying packet/frame data from dissected captures.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/wiregasm/lib.cpp`:
- Around line 1097-1099: When wtap_seek_read(cfile->provider.wth,
fdata->file_off, &rec, &rec_buf, &err, &err_info) returns false, the loop
currently breaks without setting response.error causing callers to get
incomplete data; modify the failure path in the block around wtap_seek_read (and
the analogous blocks that call wtap_seek_read at the other two locations) to
populate response.error with the available error information (use err and/or
err_info contents or formatted message) and return/exit early instead of
silently breaking so the caller receives the read failure details.
- Around line 1281-1289: The protocol node counts are being incremented multiple
times per single frame because every FT_PROTOCOL hfinfo match increments
protocol->frames and protocol->bytes; fix this by preventing duplicate
increments within the same frame — e.g., add a per-protocol last-seen frame
marker (a field like ProtocolNode::last_seen_frame_id or last_frame_index) and
update it when you first account for that protocol in the current frame; in the
block where you get ProtocolNode* protocol via wg_find_or_create_protocol_node,
only increment protocol->frames and protocol->bytes if
protocol->last_seen_frame_id != current_frame_id (then set last_seen_frame_id =
current_frame_id), or alternatively maintain a per-frame set of counted protocol
pointers and skip duplicates, ensuring the logic uses the same current frame
identifier used elsewhere in this code path.
- Around line 1025-1044: FieldValue.raw is never set, so populate it with the
original string representation: when finfo->value is present call
fvalue_to_string_repr as you do, capture the returned char* into a temporary
std::string (e.g. std::string repr(display)), assign field_value.value = repr
(or move), and also assign field_value.raw = repr before calling wmem_free(NULL,
display); and in the label fallback branch set field_value.raw =
field_value.value (or the label string) so raw is filled for every pushed
FieldValue; update references to field_value, fvalue_to_string_repr, and
values->push_back accordingly.

In `@src/index.ts`:
- Around line 271-274: The code currently constructs a RegExp from untrusted
query (const regex = new RegExp(query)) and tests field.field/field.name, which
fails for inputs like (?i)http and is vulnerable to ReDoS; replace the regex
approach with a case-insensitive substring match: normalize the query to a
string and lower-case it (e.g., const q = String(query).toLowerCase()) and
filter using field.field.toLowerCase().includes(q) ||
field.name.toLowerCase().includes(q) so no RegExp is compiled and matching is
safe and case-insensitive.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fb7fba74-67c0-4e9c-9342-508a2a30bfec

📥 Commits

Reviewing files that changed from the base of the PR and between fa9a0ed and 82e8070.

📒 Files selected for processing (8)

lib/wiregasm/bindings.cpp
lib/wiregasm/lib.cpp
lib/wiregasm/lib.h
lib/wiregasm/wiregasm.cpp
lib/wiregasm/wiregasm.h
src/index.test.ts
src/index.ts
src/types.ts

coderabbitai · 2026-05-12T16:50:33Z

+        FieldValue field_value;
+        field_value.field = field_it->second;
+
+        if (finfo->value) {
+          char *display = fvalue_to_string_repr(NULL, finfo->value, FTREPR_DISPLAY, finfo->hfinfo->display);
+          if (display) {
+            field_value.value = display;
+            wmem_free(NULL, display);
+          }
+        }
+
+        if (field_value.value.empty()) {
+          char label_str[ITEM_LABEL_LENGTH];
+          label_str[0] = '\0';
+          proto_item_fill_label(finfo, label_str);
+          field_value.value = string(label_str);
+        }
+
+        values->push_back(field_value);
+      }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

FieldValue.raw is never populated.

Line 1025–1044 fills field/value only. raw remains empty for all results, which makes the exposed response shape misleading.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/wiregasm/lib.cpp` around lines 1025 - 1044, FieldValue.raw is never set, so populate it with the original string representation: when finfo->value is present call fvalue_to_string_repr as you do, capture the returned char* into a temporary std::string (e.g. std::string repr(display)), assign field_value.value = repr (or move), and also assign field_value.raw = repr before calling wmem_free(NULL, display); and in the label fallback branch set field_value.raw = field_value.value (or the label string) so raw is filled for every pushed FieldValue; update references to field_value, fvalue_to_string_repr, and values->push_back accordingly.

coderabbitai · 2026-05-12T16:50:33Z

+    if (!wtap_seek_read(cfile->provider.wth, fdata->file_off, &rec, &rec_buf, &err, &err_info)) {
+      break;
+    }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Surface read failures instead of returning partial success silently.

On Line 1097, Line 1213, and Line 1319, read failures break the loop but never populate response.error, so callers can receive incomplete data with an empty error string.

Suggested fix

- if (!wtap_seek_read(cfile->provider.wth, fdata->file_off, &rec, &rec_buf, &err, &err_info)) { - break; - } + if (!wtap_seek_read(cfile->provider.wth, fdata->file_off, &rec, &rec_buf, &err, &err_info)) { + response.error = err_info ? string(err_info) : "Failed to read packet data"; + break; + }

Also applies to: 1213-1215, 1319-1321

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/wiregasm/lib.cpp` around lines 1097 - 1099, When wtap_seek_read(cfile->provider.wth, fdata->file_off, &rec, &rec_buf, &err, &err_info) returns false, the loop currently breaks without setting response.error causing callers to get incomplete data; modify the failure path in the block around wtap_seek_read (and the analogous blocks that call wtap_seek_read at the other two locations) to populate response.error with the available error information (use err and/or err_info contents or formatted message) and return/exit early instead of silently breaking so the caller receives the read failure details.

Avoid compiling untrusted present-field queries as regex by using case-insensitive substring matching with (?i) compatibility, and prevent protocol frames/bytes over-counting when a protocol appears multiple times in a single frame.

coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (1)

lib/wiregasm/lib.cpp (1)
1272-1300: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Per-frame protocol de-duplication still resets across recursion branches.

counted_protocols is recreated on every recursive call. If the same protocol is reached twice under the same parent via different non-protocol subtrees in one frame, frames/bytes can still be incremented twice. Carry the same set through same-level recursion and only start a fresh one when descending into a child protocol's children.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/wiregasm/lib.cpp` around lines 1272 - 1300, The counted_protocols set is
recreated per call causing duplicate counting across different branches at the
same hierarchy level; modify wg_collect_protocol_hierarchy to accept and
propagate a reference (or pointer) to a counted_protocols set for the current
sibling-level so that when iterating siblings you reuse the same set, and only
create/pass a fresh set when recursing into a protocol node's children (i.e.,
when calling wg_collect_protocol_hierarchy for protocol->children); update all
call sites accordingly so counting (protocol->frames and protocol->bytes) is
guarded by insertion into this shared per-level set.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/wiregasm/lib.cpp`:
- Around line 1127-1142: The current loop breaks when response.rows reaches
limit which leaves metadata (matched and response.total_rows) reflecting only
the scanned prefix; instead, continue scanning all packets after hitting limit
but stop appending rows: when response.rows.size() >= limit set
response.truncated = true and continue processing (calling wtap_rec_reset(&rec)
and epan_dissect_reset(&edt) each iteration) so matched and any counters are
still updated for every record, and only suppress pushing additional rows into
response.rows; adjust the logic around the block that currently breaks to skip
push but not break so response.total_rows and matched reflect the full scan
while preserving limit behavior.
- Around line 1172-1183: The occurrences counter is only incremented when the
field wasn't already seen in the current frame because the increment is guarded
by frame_fields->insert(...). Change the logic in the block handling field
sightings (around frame_fields, present_fields, field_name and finfo->hfinfo in
lib.cpp) so that you always increment PresentField.occurrences whenever the
field is encountered: if present_fields does not contain field_name, create a
PresentField and set occurrences = 1; otherwise increment it
(it->second.occurrences++). Keep the existing frame_fields insert to suppress
per-frame duplicates for whatever purpose it serves, but do not gate the
occurrences increment on that insert.

---

Duplicate comments:
In `@lib/wiregasm/lib.cpp`:
- Around line 1272-1300: The counted_protocols set is recreated per call causing
duplicate counting across different branches at the same hierarchy level; modify
wg_collect_protocol_hierarchy to accept and propagate a reference (or pointer)
to a counted_protocols set for the current sibling-level so that when iterating
siblings you reuse the same set, and only create/pass a fresh set when recursing
into a protocol node's children (i.e., when calling
wg_collect_protocol_hierarchy for protocol->children); update all call sites
accordingly so counting (protocol->frames and protocol->bytes) is guarded by
insertion into this shared per-level set.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b16539c8-05d7-4fc9-b992-c31de6ef0625

📥 Commits

Reviewing files that changed from the base of the PR and between 82e8070 and 4e718c5.

📒 Files selected for processing (2)

lib/wiregasm/lib.cpp
src/index.ts

coderabbitai · 2026-05-14T16:10:21Z

+      if (!row.values.empty()) {
+        response.rows.push_back(row);
+        if (limit > 0 && response.rows.size() >= limit) {
+          response.truncated = true;
+          wtap_rec_reset(&rec);
+          epan_dissect_reset(&edt);
+          break;
+        }
+      }
+    }
+
+    wtap_rec_reset(&rec);
+    epan_dissect_reset(&edt);
+  }
+
+  response.total_rows = response.rows.size();


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep scanning after hitting limit so the metadata stays accurate.

Breaking here makes matched and total_rows describe only the scanned prefix, not the full result set. With truncated = true, callers will still expect the untruncated counts.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/wiregasm/lib.cpp` around lines 1127 - 1142, The current loop breaks when response.rows reaches limit which leaves metadata (matched and response.total_rows) reflecting only the scanned prefix; instead, continue scanning all packets after hitting limit but stop appending rows: when response.rows.size() >= limit set response.truncated = true and continue processing (calling wtap_rec_reset(&rec) and epan_dissect_reset(&edt) each iteration) so matched and any counters are still updated for every record, and only suppress pushing additional rows into response.rows; adjust the logic around the block that currently breaks to skip push but not break so response.total_rows and matched reflect the full scan while preserving limit behavior.

coderabbitai · 2026-05-14T16:10:22Z

+    if (frame_fields->insert(field_name).second) {
+      auto it = present_fields->find(field_name);
+      if (it == present_fields->end()) {
+        PresentField present;
+        present.field = field_name;
+        present.name = finfo->hfinfo->name ? finfo->hfinfo->name : "";
+        present.type = static_cast<int>(finfo->hfinfo->type);
+        present.occurrences = 1;
+        (*present_fields)[field_name] = present;
+      } else {
+        it->second.occurrences++;
+      }


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

occurrences is currently counting frames, not occurrences.

frame_fields->insert(...) suppresses repeated instances inside the same packet, so fields that appear multiple times in one frame still add only 1. That makes the exported occurrences value inaccurate.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@lib/wiregasm/lib.cpp` around lines 1172 - 1183, The occurrences counter is only incremented when the field wasn't already seen in the current frame because the increment is guarded by frame_fields->insert(...). Change the logic in the block handling field sightings (around frame_fields, present_fields, field_name and finfo->hfinfo in lib.cpp) so that you always increment PresentField.occurrences whenever the field is encountered: if present_fields does not contain field_name, create a PresentField and set occurrences = 1; otherwise increment it (it->second.occurrences++). Keep the existing frame_fields insert to suppress per-frame duplicates for whatever purpose it serves, but do not gate the occurrences increment on that insert.

Upstream zlib.net re-published the 1.2.11 tarball with a different hash, breaking the glib subproject download. Update the expected source_hash and add a source_fallback_url pointing to the GitHub release mirror.

dipeshwalia added 3 commits February 19, 2026 16:50

fix: prevent decode-as registration growth while preserving defaults

6732d8b

Avoid duplicate decode-as default reapplication across module traversal and keep PREF_DECODE_AS_RANGE updates merged with protocol defaults to prevent runaway memory growth on Node 20.

feat: add field extraction, present fields, and protocol hierarchy APIs

e4dcfa3

Merge branch 'fix/decode-as-memory-growth' into feat/packet-inspector…

82e8070

…-llm-query-tools

coderabbitai Bot requested changes May 12, 2026

View reviewed changes

coderabbitai Bot requested changes May 14, 2026

View reviewed changes

fix: update zlib wrap hash and add fallback URL for glib build

97e1c91

Upstream zlib.net re-published the 1.2.11 tarball with a different hash, breaking the glib subproject download. Update the expected source_hash and add a source_fallback_url pointing to the GitHub release mirror.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat/packet inspector query tools#30

Feat/packet inspector query tools#30
dipeshwalia wants to merge 5 commits into
good-tools:masterfrom
dipeshwalia:feat/packet-inspector-llm-query-tools

dipeshwalia commented May 12, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 12, 2026 •

edited

Loading

Rate limit exceeded

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 12, 2026

Uh oh!

coderabbitai Bot May 12, 2026

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 14, 2026

Uh oh!

coderabbitai Bot May 14, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dipeshwalia commented May 12, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 12, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Rate limit exceeded

Walkthrough

Changes

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 12, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 14, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

dipeshwalia commented May 12, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 12, 2026 •

edited

Loading