Add minimum required permissions documentation #18

Merged
maxcold merged 22 commits into main from docs/minimum-required-permissions on May 13, 2026
@maxcold maxcold commented Apr 23, 2026

Collaborator

Summary

Closes #13

  • Adds docs/permissions.md with least-privilege role definitions for the MCP app
  • Two pre-built roles: full-featured (all tools, read+write) and read-only (view only)
  • Copy-pasteable Dev Tools commands for PUT /_security/role and POST /_security/api_key
  • Version-specific Kibana feature privilege tables (9.0-9.2, 9.3, 9.4+)
  • Per-tool privilege appendix for custom role builders
  • Troubleshooting section (401, 403, space ID mismatch)
  • Links from all setup guides to the new permissions doc

Test plan

  • Verify Dev Tools PUT /_security/role/mcp_app_full command works on a 9.4+ cluster
  • Verify POST /_security/api_key with full-featured role descriptors creates a working key
  • Verify all MCP tools work with the full-featured API key
  • Verify read-only role blocks write operations (acknowledge, create case, etc.)
  • Check all internal doc links resolve correctly

maxcold added 17 commits April 23, 2026 15:28
Closes #13. Adds docs/permissions.md with least-privilege role
definitions (full-featured and read-only), copy-pasteable Dev Tools
commands, per-tool privilege appendix, troubleshooting guide, and
version-specific Kibana feature privilege tables (9.0-9.2, 9.3, 9.4+).
Links from all setup guides.

The new `npm run test:permissions` runner provisions both documented
roles (full and read-only), creates scoped API keys, and exercises every
documented operation through `src/elastic/*` to verify the role
definitions in `docs/permissions.md` actually work end-to-end. Layer A
calls `_has_privileges`; Layer B exercises real APIs and asserts 200 vs
expected 403.
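The two layers can be illustrated with a minimal sketch. The function names (`buildHasPrivilegesBody`, `classifyStatus`, `meetsExpectation`) are hypothetical and the runner's real implementation may differ; treating any 2xx status as success is also an assumption.

```typescript
// Hypothetical sketch of the two verification layers; names are illustrative.
type Outcome = "ok" | "403" | "other";

interface IndexPrivilegeCheck {
  names: string[];
  privileges: string[];
}

// Layer A: request body for POST /_security/user/_has_privileges.
function buildHasPrivilegesBody(
  cluster: string[],
  index: IndexPrivilegeCheck[],
): { cluster: string[]; index: IndexPrivilegeCheck[] } {
  return { cluster, index };
}

// Layer B: reduce the HTTP status of a real API call to an outcome.
// Assumption: any 2xx counts as success.
function classifyStatus(status: number): Outcome {
  if (status >= 200 && status < 300) return "ok";
  if (status === 403) return "403";
  return "other";
}

// A check passes only when the observed outcome matches the expectation
// recorded for that role (for example full: "ok", readonly: "403").
function meetsExpectation(status: number, expected: Outcome): boolean {
  return classifyStatus(status) === expected;
}
```

Under this shape a write operation would carry an expectation of `"ok"` for the full role and `"403"` for the read-only role, so a read-only key that unexpectedly succeeds fails the check just as a full key that is denied does.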

Running the tooling surfaced a real gap: `_cat/indices/<pattern>` (used
by Threat Hunt's listIndices) requires the index-level `monitor`
privilege, not just cluster-level `monitor`. Both roles in
`docs/permissions.md` and the source-of-truth `roles.ts` now grant
index-level `monitor` on data and system indices.

acknowledgeAlert called `_update/{id}` against the `.alerts-security.alerts-*`
wildcard, which Elasticsearch rejects with a 400. The UI's empty catch
block silently swallowed the error, so every "Benign" verdict click
since the feature shipped was a no-op — even under superuser.
acknowledgeAlert now delegates to acknowledgeAlerts (which uses
`_update_by_query` and supports wildcards).

Running test:permissions with the unified path then surfaced a deeper
gap: `_update_by_query` and `_delete_by_query` dispatch writes to
backing indices directly (e.g. `.internal.alerts-security.alerts-default-000001`),
bypassing the data-stream alias. The documented role grants privileges
on the alias only, so acknowledge/cleanup operations 403 under the
scoped role even though they worked under superuser.

Adds the three `.internal.*-*` backing-index patterns to both
roles.ts and docs/permissions.md so the documented role actually works
end-to-end. Test runner now reports 38 passed, 0 failed.
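As a rough illustration of the delegation described above, the single-alert path can build the same `_update_by_query` request as the bulk path. This is a sketch only: the builder names, the Painless script, and the `kibana.alert.workflow_status` field with the value `acknowledged` are assumptions, not code taken from `src/elastic/alerts.ts`.

```typescript
// Hypothetical sketch; the script body and status field are assumptions.
interface UpdateByQueryRequest {
  method: "POST";
  path: string;
  body: {
    query: { ids: { values: string[] } };
    script: { lang: "painless"; source: string; params: { status: string } };
  };
}

// Bulk form: _update_by_query accepts wildcard index patterns.
function buildAcknowledgeAlertsRequest(
  indexPattern: string,
  alertIds: string[],
): UpdateByQueryRequest {
  return {
    method: "POST",
    path: `/${indexPattern}/_update_by_query`,
    body: {
      query: { ids: { values: alertIds } },
      script: {
        lang: "painless",
        source: "ctx._source['kibana.alert.workflow_status'] = params.status",
        params: { status: "acknowledged" },
      },
    },
  };
}

// Single-alert form delegates to the bulk form instead of calling _update/{id}.
function buildAcknowledgeAlertRequest(
  indexPattern: string,
  alertId: string,
): UpdateByQueryRequest {
  return buildAcknowledgeAlertsRequest(indexPattern, [alertId]);
}
```

Because the write is dispatched to the backing indices, a request built this way still needs the backing-index patterns in the role to succeed under a scoped key.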
Extends the runner to provision and verify the built-in + companion-role
Quickstart path alongside the existing custom-role assertions:

- New asserted roles: quickstart_full (editor + companion) and
  quickstart_readonly (viewer + companion). Both run the same op suite
  with the same expect matrix as their custom-role counterparts.
- Companion role descriptors carry only cluster/index privileges; Kibana
  feature grants come from the built-in. Cluster `monitor` is required
  alongside index-level `monitor` for `_cat/indices` (Threat Hunt).
- API key minting via grant_api_key (admin auth, target-user creds) so
  the suite works with built-ins that lack `manage_own_api_key` (e.g.
  viewer).
- Up-front roleExists() check before user creation: ES silently accepts
  users with non-existent roles, so we have to detect missing reserved
  roles ourselves.
- --cleanup-stale also sweeps mcp-app-test-* users.
- Helpers: createUser, deleteUser, listUsersByPrefix, roleExists,
  grantApiKeyForUser.
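The key-minting and role-existence steps above might look roughly like the following. The helper names here are hypothetical stand-ins for `grantApiKeyForUser` and `roleExists`, whose real signatures are not shown in this PR; the request body follows the documented `POST /_security/api_key/grant` password grant.

```typescript
// Hypothetical sketch; helper names are illustrative, not the runner's code.
interface GrantApiKeyBody {
  grant_type: "password";
  username: string;
  password: string;
  api_key: { name: string };
}

// Body for POST /_security/api_key/grant, sent with admin credentials.
// The username/password belong to the target user the key is minted for.
function buildGrantApiKeyBody(
  username: string,
  password: string,
  keyName: string,
): GrantApiKeyBody {
  return { grant_type: "password", username, password, api_key: { name: keyName } };
}

// Interpret the status of GET /_security/role/<name> before creating a user:
// 200 means the role exists, 404 means it does not.
function roleExistsFromStatus(status: number): boolean {
  if (status === 200) return true;
  if (status === 404) return false;
  throw new Error(`unexpected status ${status} from role lookup`);
}
```

Checking the role first matters because user creation itself does not fail when a referenced role is missing, so the lookup is the only early signal.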

Custom readonly role widened to feature_actions.read so that listing AI
connectors (a read op the MCP app uses) succeeds for read-only users
and matches viewer's behavior, keeping the two readonly paths aligned.

Restructures docs/permissions.md to make the built-in path the default
recommendation:

- Quickstart leads: assign editor (or viewer) + a small companion role
  carrying cluster `monitor` and index privileges on the alert/log
  patterns. No Kibana feature toggles needed — editor/viewer cover the
  whole MCP-app surface (SIEM, Cases, Timeline, Notes, Rules, Alerts,
  AI Assistant, Attack Discovery, Actions/Connectors). Eliminates the
  feature_*V<N> version churn that plagued the custom-role path.
- Advanced section retained for scripted/IaC users — full custom-role
  JSON unchanged (versioned feature tables for 9.0-9.4+), with a
  simplified two-step API key recipe (create user, mint key — no
  role_descriptors needed since the key inherits user privileges).
- Read-only role widened to grant Actions and Connectors (Read) so it
  aligns with viewer and lets the MCP app list AI connectors.
- Stateful (self-managed / ESS) only; serverless variant deferred.

Verifies the .internal.alerts-security.attack.discovery.alerts-default-*
backing-index pattern PR #18 added to the role definitions. Until now
the pattern was unverified — acknowledgeAlerts covered the alerts data
stream's backing index, but the discovery data stream is separate.

- ensureDiscoveryFixture() in preflight: reuses an existing discovery
  if present, otherwise seeds a minimal one via _bulk. Captures
  discoveryId in SeedFixtures.
- New operationCheck calls _update_by_query directly via esRequest,
  bypassing the production acknowledgeDiscoveries() helper which
  silently catches per-index errors (a 403 there returns updated:0
  instead of throwing, masking the privilege gap).
- expect: { full: ok, readonly: 403 } for asserted roles; quickstart
  variants inherit the same profile and pass on stateful 9.5.

Tamper test (drop the discovery backing-index pattern, re-run --role
full) flips the check to 403 with the error message naming
.internal.alerts-security.attack.discovery.alerts-default-000001
exactly. Privilege confirmed load-bearing; appendix already lists it.
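The reason the check bypasses the production helper can be shown with a small, self-contained sketch. The functions below are hypothetical and only model the behavior described above: a helper that swallows per-index errors reports zero updates, while a direct call surfaces the denial.

```typescript
// Hypothetical sketch; models the masking behavior, not the real helpers.
// An UpdateFn returns the number of updated documents or throws on denial.
type UpdateFn = (index: string) => number;

// Lenient helper: per-index errors are swallowed, so a 403 looks like 0 updates.
function lenientAcknowledge(indices: string[], update: UpdateFn): number {
  let updated = 0;
  for (const index of indices) {
    try {
      updated += update(index);
    } catch {
      // Swallowed: the caller cannot tell "nothing matched" from "denied".
    }
  }
  return updated;
}

// Strict check: call the operation directly so a denial propagates.
function strictAcknowledge(index: string, update: UpdateFn): number {
  return update(index);
}
```

A permissions test built on the lenient form would pass even when the role lacks the privilege, which is why the strict form is the one worth asserting on.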
Reuses H1's discoveryId fixture to drive two adjacent attack-discovery
operations that until now had no test coverage:

- assessConfidence: runs ES|QL across .alerts-security.alerts-* and
  risk-score.risk-score-latest-* to build a confidence score. The
  risk-score read pattern is in DATA_INDICES but was never exercised.
- getDiscoveryDetail: pulls discovery alert details via ES|QL on
  .alerts-security.alerts-*.

Both expected ok/ok for full and readonly. Tests synthesize a minimal
AttackDiscovery from the seeded discoveryId + a real alertId so the
inner queries actually run (the helpers return early when alertIds is
empty).

Smoke: --role all → 86 passed, 0 failed (8 new check executions,
4 roles x 2 ops).

Bypasses cleanupSampleData() (silently catches per-index errors) by
calling _delete_by_query directly on two representative data streams:

- cleanupSampleDataLogs: logs-endpoint.events.process-default — would
  surface a backing-index gap if `.ds-logs-*` privileges were needed,
  same shape as the issue PR #18 fixed for alerts.
- cleanupSampleDataAlerts: .alerts-security.alerts-default — companion
  to acknowledgeAlert (which uses _update_by_query); exercises the
  delete arm of the same backing-index privilege.

Finding: the logs-data-stream hypothesis is refuted on 9.5 stateful.
_delete_by_query works at the data-stream-alias level — no
.ds-logs-* pattern needed in DATA_INDICES. logs-* umbrella suffices.
No doc updates required.

Smoke: --role all → 94 passed, 0 failed (8 new check executions,
4 roles x 2 ops).

Quick Start and Troubleshooting both referenced ELASTICSEARCH_API_KEY,
but loadAdminBasics() actually requires ELASTIC_PASSWORD (with optional
ELASTIC_USERNAME defaulting to "elastic"). Following the README
literally produced the "must be set" die() message immediately.

Reworded the manage_security aside to talk about the user instead of
the API key, since the runner uses basic auth to bootstrap.
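A minimal sketch of the credential loading described above, assuming a hypothetical `loadAdminBasicsSketch` that takes the environment as a parameter. The real `loadAdminBasics()` is not shown here and calls `die()` rather than throwing; the error text below is illustrative.

```typescript
// Hypothetical sketch; the real loadAdminBasics() may differ in shape.
interface AdminBasics {
  username: string;
  password: string;
}

function loadAdminBasicsSketch(env: Record<string, string | undefined>): AdminBasics {
  const password = env["ELASTIC_PASSWORD"];
  if (!password) {
    // The runner uses basic auth to bootstrap, so a password is mandatory.
    throw new Error("ELASTIC_PASSWORD must be set");
  }
  // ELASTIC_USERNAME is optional and defaults to "elastic".
  const username = env["ELASTIC_USERNAME"] || "elastic";
  return { username, password };
}
```

Passing the environment in explicitly keeps the function easy to exercise without touching the process environment.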
The per-tool row listed only logs-*, but generateSampleData() also
indexes alert docs to .alerts-security.alerts-<space-id> via /_bulk
(see sample-data.ts:350). Custom roles built from this row alone 403
on the alert writes.

Mirrored the Cleanup sample data row's index list and feature
privileges so both rows agree.

Replace 'sign in as the user' instructions with POST /_security/api_key/grant
calls run by an admin, since most setups won't have an interactive session as
the freshly-created user.

Adds two coordinated entries to the operationChecks matrix in
scripts/test-permissions/roles.ts, both bucketed under the `alerts`
group:

- `getAlertContext` exercises the function end-to-end. Picks an alert
  with host.name set so the parallel reads against
  logs-endpoint.events.process-*, logs-endpoint.events.network-*, and
  .alerts-security.alerts-* all dispatch (the production code
  short-circuits the endpoint queries when host.name is missing).

- `endpointEventsReadable` is a companion _has_privileges probe that
  verifies the role grants read on the endpoint-event indices. Needed
  because ES _search against a wildcard pattern is lenient: when the
  role has zero concrete indices matching the pattern, ES returns 0
  hits silently instead of 403, so the function-call check above
  cannot detect a privilege gap on those indices. The probe closes
  that gap and is a reusable pattern for any future op that reads via
  wildcard _search (e.g. M3 getMapping, M4 getEntityDetail).
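The probe pattern can be sketched as follows. `missingIndexPrivileges` is a hypothetical helper; the response type covers only the fields of the `_has_privileges` response that the sketch reads.

```typescript
// Hypothetical sketch; only the response fields used here are modeled.
interface HasPrivilegesResponse {
  has_all_requested: boolean;
  // index name or pattern -> privilege name -> granted?
  index: Record<string, Record<string, boolean>>;
}

// List every "<privilege> on <index>" pair the role does not hold, so a
// failing probe can name the exact patterns involved.
function missingIndexPrivileges(res: HasPrivilegesResponse): string[] {
  const missing: string[] = [];
  for (const [name, privileges] of Object.entries(res.index)) {
    for (const [privilege, granted] of Object.entries(privileges)) {
      if (!granted) missing.push(`${privilege} on ${name}`);
    }
  }
  return missing;
}
```

Unlike a wildcard `_search`, which can return zero hits without an error, this probe fails explicitly whenever the returned list is non-empty.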

Tamper-test outcome: replacing "logs-*" with "logs-foo-*" in
DATA_INDICES correctly fails `endpointEventsReadable` for both `full`
and `readonly` (and their quickstart variants) with an error naming
both `logs-endpoint.events.process-*` and
`logs-endpoint.events.network-*`. The `getAlertContext` check itself stayed green during
the tamper run, confirming the wildcard _search leniency motivating
the probe. Restoring DATA_INDICES to "logs-*" returns both checks to
green for all four asserted role identities (full, readonly,
quickstart_full, quickstart_readonly): 102 passed, 0 failed.

The "View alert context" appendix row in docs/permissions.md already
correctly lists read on .alerts-security.alerts-<space-id>,
logs-endpoint.events.process-*, and logs-endpoint.events.network-*
plus Security (Read) + Alerts (Read), so no doc edit was needed.

Adds a `getMapping` operation to the permissions test matrix and fixes
the load-bearing privilege gap the new check immediately surfaced.

The check calls `getMapping(f.alertIndex)`, exercising
`indices:admin/mappings/get`. Both `full` and `readonly` are expected
to succeed.

Discovery via TDD: the new check correctly failed for `full`,
`readonly`, AND `quickstart_readonly` in the first untampered run.
Investigation showed the failing privilege is `view_index_metadata`,
NOT `monitor` — the two are independent privileges in Elasticsearch
(see https://www.elastic.co/guide/en/elasticsearch/reference/current/security-privileges.html).
`monitor` covers `indices:monitor/*` (used by `_cat/indices`), but
`_mapping` dispatches `indices:admin/mappings/get`, which is gated by
`view_index_metadata`. `quickstart_full` previously passed only
because the built-in `editor` Kibana role's auto-generated descriptors
already grant `view_index_metadata` on `.internal.alerts-*`; `viewer`
does not, so `quickstart_readonly` failed too.

Fix: add `view_index_metadata` to the index privileges list of all
four asserted role descriptors (`fullRole`, `readonlyRole`, and both
`QUICKSTART_COMPANION_DESCRIPTORS`). With the fix in place all 106
checks pass; without it, `getMapping` returns 403 cleanly for the
roles that lack `view_index_metadata`.
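The privilege split described above can be restated as a small lookup. This is an illustrative sketch: the table holds only the two APIs discussed in this commit, and `uncoveredApis` is a hypothetical helper rather than part of `roles.ts`.

```typescript
// Hypothetical sketch; the table restates the findings above and is not exhaustive.
type IndexPrivilege = "read" | "monitor" | "view_index_metadata";

interface ApiRequirement {
  api: string;
  action: string;
  privilege: IndexPrivilege;
}

const API_REQUIREMENTS: ApiRequirement[] = [
  { api: "_cat/indices", action: "indices:monitor/*", privilege: "monitor" },
  { api: "_mapping", action: "indices:admin/mappings/get", privilege: "view_index_metadata" },
];

// Return the APIs that a given set of index privileges would not cover.
function uncoveredApis(granted: IndexPrivilege[]): string[] {
  return API_REQUIREMENTS
    .filter((req) => !granted.includes(req.privilege))
    .map((req) => req.api);
}
```

For example, a role granting only `read` and `monitor` leaves `_mapping` uncovered, which matches the failure the new `getMapping` check reported before the fix.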

Doc updates (`docs/permissions.md`):
- Quickstart companion role tables (full + readonly) — add
  `view_index_metadata` to every row, and rewrite the explanatory
  note to call out that `monitor` and `view_index_metadata` are
  separate privileges.
- Advanced custom-role privilege bundles ("System / alert indices",
  "Data indices", "All index patterns") and the corresponding Dev
  Tools `PUT /_security/role/...` JSON snippets — add
  `view_index_metadata` to every privileges array.
- Per-tool appendix Threat Hunt "Field mappings" row — change
  `monitor` to `view_index_metadata` for the index-privileges column.

TDD-loop outcomes:
- Tamper 1 (drop `monitor` per the handover): getMapping AND
  listIndices fail for `full`/`readonly`; `quickstart_full` passes,
  `quickstart_readonly` fails on getMapping. Surfaced the real
  privilege model (see Discovery above).
- Restore + add `view_index_metadata`: all 106 checks pass for all
  four asserted role identities.
- Tamper 2 (drop `view_index_metadata` from `fullRole`/`readonlyRole`
  only, leave companions intact): getMapping fails for `full`/
  `readonly` with 403 (listIndices unaffected); both quickstart
  variants pass — clean isolation, privilege-driven failure confirmed.
- Final restore + re-run: 106 passed, 0 failed, 0 skipped across
  `full`, `readonly`, `quickstart_full`, `quickstart_readonly`.

Kibana's bulk endpoints (e.g. /api/detection_engine/rules/_bulk_action)
return HTTP 500 at the top level when an individual operation is denied,
with the per-item 403 buried inside the response body as
`"status_code":403` under `attributes.errors[]`. The existing regex set
in isPermissionDenied() only matched `"status":403` (the bulk-API
inline-error shape), so any op going through one of these endpoints was
classified `other` instead of `403` and broke its expectation.

Adds one regex to cover the embedded form. Surfaced while wiring the
upcoming bulkAction check; benefits any future op hitting a Kibana
bulk endpoint.
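A sketch of the classification idea, assuming a hypothetical `looksPermissionDenied` function. The actual regex set in `isPermissionDenied()` is not shown in this PR, so the patterns below are illustrative reconstructions of the two response shapes described above.

```typescript
// Hypothetical sketch; the real isPermissionDenied() patterns may differ.
const PERMISSION_DENIED_PATTERNS: RegExp[] = [
  /"status"\s*:\s*403\b/, // bulk-API inline-error shape
  /"status_code"\s*:\s*403\b/, // per-item error embedded in a Kibana 500 body
];

// Treat a response as a permission denial when either the top-level HTTP
// status is 403 or the body carries an embedded 403 in a known shape.
function looksPermissionDenied(httpStatus: number, body: string): boolean {
  if (httpStatus === 403) return true;
  return PERMISSION_DENIED_PATTERNS.some((pattern) => pattern.test(body));
}
```

With only the first pattern, a top-level 500 whose body contains `"status_code":403` would not be recognized as a denial, which is the misclassification this commit addresses.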
Adds a `bulkAction` entry to the test-permissions matrix to assert that
`feature_securitySolutionRulesV4.all` actually grants the `_bulk_action`
action on Kibana's detection-engine API. Uses `action: "duplicate"` —
the only non-destructive option that exercises the write privilege —
and inline-deletes the created `[Duplicate]` rule(s) so successful runs
are idempotent. countLeftoverTaggedResources() filters on
tags:"mcp-app-test", which the duplicate inherits only if the source
rule had it (it usually doesn't), so the inline cleanup is the
self-healing path.

Tamper test: changing fullRole's `feature_securitySolutionRulesV4.all`
to `.read` causes bulkAction to fail with 403 alongside createRule and
patchRule, confirming the privilege requirement matches the doc. The
quickstart_full variant (built-in `editor`) was unaffected by the
tamper, isolating the privilege correctly. Restored run: 110 passed,
0 failed across full, readonly, quickstart_full, quickstart_readonly.

Doc: added an explicit "Bulk action on rules (`_bulk_action`)" row to
the Detection Rules section of the per-tool appendix in
docs/permissions.md. Same privilege requirements as the existing
single-rule write ops, but the API endpoint deserves discoverability.

Adds the last two ship-blocking checks for PR #18, covering the
exception-list sub-feature of the Rules feature:

- listExceptions: GET /api/exception_lists/items/_find — read on the
  exception-list-management namespace.
- addException: POST /api/detection_engine/rules/{rule_id}/exceptions —
  write under the detection-engine namespace.

Tamper test (feature_securitySolutionRulesV4.all → .read on fullRole):
addException flips to 403 alongside createRule/patchRule/bulkAction,
confirming it requires .all. listExceptions continues to pass under
.read, confirming it's truly read-only and shares the Rules.read scope
— no separate sub-feature privilege gates exception listing today.
Quickstart variants (built-in editor/viewer) pass through the tamper
unchanged as a positive control.

Fixture: introduces ensureExceptionListFixture() in runner preflight,
mirroring ensureDiscoveryFixture from H1. Creates a transient
detection-type exception list (list_id: mcp-app-test-list-<hex>) so
the checks have a deterministic fixture across environments — the
plan's original endpoint_list assumption fails on vanilla clusters
where the Endpoint Security integration isn't enabled (404 from _find
classifies the check as "no fixture", privilege never exercised).
Cleanup tears the list down at end of run alongside the role/key
teardowns, including the SIGINT path. Per-item cleanup of created
exception items happens inline in the addException check.

Doc: added "List exceptions" row to the Detection Rules per-tool
appendix in docs/permissions.md (Security Read, Rules and Exceptions
Read). Existing "Add exceptions" row at line 427 verified accurate.

Two doc additions surfaced by the PR #18 review thread (Leandro Pereira
+ Ameer Mukadam exercising docs/permissions.md on real self-managed
clusters):

- README.md: add a "permissions" paragraph to the Quick Start tip
  block. Links to docs/permissions.md, recommends the Quickstart path,
  names the built-in `editor` / `viewer` roles directly. Closes the
  loop on James's request from 3 weeks ago to surface the permissions
  doc from the README.

- docs/permissions.md: add a hedged Troubleshooting entry for the
  deprecated SIEM feature gap. On some Kibana versions, the privilege
  actions for `api:rules-read` haven't fully migrated from
  `feature_siemV4` to its replacements; granting the documented
  `feature_securitySolutionRulesV4.read` alone returns 403. Workaround:
  also grant `feature_siemV4.read` (and `.all` for the full-featured
  role). The note is deliberately version-neutral until we can
  reproduce — full investigation tracked separately.
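The workaround can be expressed as a small transformation over a role's Kibana feature privileges. This is a sketch under stated assumptions: `withLegacySiemFallback` is a hypothetical helper, and the fallback should only be applied where the 403 actually reproduces.

```typescript
// Hypothetical sketch of the workaround; apply only if the 403 reproduces.
const RULES_FEATURE = "feature_securitySolutionRulesV4";
const LEGACY_SIEM_FEATURE = "feature_siemV4";

// For each Rules feature grant, also add the matching legacy SIEM grant
// (.read for the read-only role, .all for the full-featured role).
function withLegacySiemFallback(privileges: string[]): string[] {
  const result = [...privileges];
  for (const privilege of privileges) {
    const [feature, level] = privilege.split(".");
    if (feature === RULES_FEATURE && (level === "read" || level === "all")) {
      const fallback = `${LEGACY_SIEM_FEATURE}.${level}`;
      if (!result.includes(fallback)) result.push(fallback);
    }
  }
  return result;
}
```

Privileges for other features pass through unchanged, so the helper can be run over a whole privilege list without side effects on unrelated grants.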
@maxcold maxcold marked this pull request as ready for review May 8, 2026 11:00
…-permissions

# Conflicts:
#	src/elastic/alerts.ts
@maxcold maxcold requested a review from KDKHD May 11, 2026 13:46
Comment thread docs/permissions.md

</details>

#### Dev Tools: Create the role

Member

Maybe these dev-tool commands should go at the top of this file. Creating a new role in the UI, with permissions for these indices, is much more tedious than running this command in dev tools.

Collaborator Author

Makes sense. I will add the section to the very top with the Dev Console request, thanks!


```bash
# Make sure .env has ELASTICSEARCH_URL, KIBANA_URL, ELASTIC_PASSWORD
# (and ELASTIC_USERNAME if not the default "elastic").
```

@KDKHD KDKHD May 12, 2026

Member

The MCP app uses an API key for auth. Could we keep it consistent and use the same env variables? Then users don't need to maintain two sets of credentials.

Collaborator Author

The script is used for development and needs admin-level credentials that are distinct from the runtime MCP key:

  • It calls PUT /_security/role/... (needs manage_security), POST /_security/api_key/grant with grant_type: password (acts on behalf of a user), and seeds sample data into logs-* and .alerts-security.alerts-*.
  • The MCP app's ELASTICSEARCH_API_KEY is least-privilege by design: it covers those alert/case indices and does not carry manage_security. So even if we accepted the same env var name, users would have to maintain a second, more privileged key under it to run the test runner.
  • Password-based api_key/grant is also the cleanest local-dev path because it works out of the box with elastic:changeme and produces user-scoped keys, which is what we want for testing.

KDKHD previously approved these changes May 12, 2026
maxcold added 2 commits May 13, 2026 09:43
…-permissions

# Conflicts:
#	README.md
#	docs/setup-claude-code.md
#	docs/setup-cursor.md
#	docs/setup-vscode.md
@maxcold maxcold requested a review from KDKHD May 13, 2026 09:14
@maxcold maxcold merged commit 53a38c6 into main May 13, 2026
2 checks passed
@maxcold maxcold deleted the docs/minimum-required-permissions branch May 13, 2026 09:19
Development

Successfully merging this pull request may close these issues.

[Enhancement Request] Provide better documentation on role setup

2 participants