Add minimum required permissions documentation #18
Merged
Closes #13. Adds docs/permissions.md with least-privilege role definitions (full-featured and read-only), copy-pasteable Dev Tools commands, per-tool privilege appendix, troubleshooting guide, and version-specific Kibana feature privilege tables (9.0-9.2, 9.3, 9.4+). Links from all setup guides.
The new `npm run test:permissions` runner provisions both documented roles (full and read-only), creates scoped API keys, and exercises every documented operation through `src/elastic/*` to verify the role definitions in `docs/permissions.md` actually work end-to-end. Layer A calls `_has_privileges`; Layer B exercises real APIs and asserts 200 vs expected 403. Running the tooling surfaced a real gap: `_cat/indices/<pattern>` (used by Threat Hunt's listIndices) requires the index-level `monitor` privilege, not just cluster-level `monitor`. Both roles in `docs/permissions.md` and the source-of-truth `roles.ts` now grant index-level `monitor` on data and system indices.
acknowledgeAlert called `_update/{id}` against the `.alerts-security.alerts-*`
wildcard, which Elasticsearch rejects with a 400. The UI's empty catch
block silently swallowed the error, so every "Benign" verdict click
since the feature shipped was a no-op — even under superuser.
acknowledgeAlert now delegates to acknowledgeAlerts (which uses
`_update_by_query` and supports wildcards).
Running test:permissions with the unified path then surfaced a deeper
gap: `_update_by_query` and `_delete_by_query` dispatch writes to
backing indices directly (e.g. `.internal.alerts-security.alerts-default-000001`),
bypassing the data-stream alias. The documented role grants privileges
on the alias only, so acknowledge/cleanup operations 403 under the
scoped role even though they worked under superuser.
Adds the three `.internal.*-*` backing-index patterns to both
roles.ts and docs/permissions.md so the documented role actually works
end-to-end. Test runner now reports 38 passed, 0 failed.
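The alias-versus-backing-index gap described above can be illustrated with a small pattern check. This is a minimal sketch, not code from the PR: `patternToRegExp` and `isCovered` are hypothetical helpers, only the `*` wildcard is handled, and the `.internal.alerts-security.alerts-default-*` pattern is an assumed example of one of the added backing-index patterns.

```typescript
// Hypothetical helper: turn an index pattern with `*` wildcards into a RegExp.
// This simplifies Elasticsearch's real pattern matching (no exclusions, no regex syntax).
function patternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

// Hypothetical helper: true when at least one granted pattern matches the concrete index.
function isCovered(index: string, grantedPatterns: string[]): boolean {
  return grantedPatterns.some((p) => patternToRegExp(p).test(index));
}

// Concrete backing index named in the commit message above.
const backingIndex = ".internal.alerts-security.alerts-default-000001";

// Role that grants privileges on the alias pattern only.
const aliasOnly = [".alerts-security.alerts-*"];

// Same role plus an assumed backing-index pattern.
const withBacking = [...aliasOnly, ".internal.alerts-security.alerts-default-*"];

const coveredByAlias = isCovered(backingIndex, aliasOnly);
const coveredWithBacking = isCovered(backingIndex, withBacking);
```

Under these assumptions the alias-only role does not cover the backing index, which is consistent with the 403 seen for `_update_by_query` under the scoped role.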
Extends the runner to provision and verify the built-in + companion-role Quickstart path alongside the existing custom-role assertions:
- New asserted roles: quickstart_full (editor + companion) and quickstart_readonly (viewer + companion). Both run the same op suite with the same expect matrix as their custom-role counterparts.
- Companion role descriptors carry only cluster/index privileges; Kibana feature grants come from the built-in. Cluster `monitor` is required alongside index-level `monitor` for `_cat/indices` (Threat Hunt).
- API key minting via grant_api_key (admin auth, target-user creds) so the suite works with built-ins that lack `manage_own_api_key` (e.g. viewer).
- Up-front roleExists() check before user creation: ES silently accepts users with non-existent roles, so we have to detect missing reserved roles ourselves.
- --cleanup-stale also sweeps mcp-app-test-* users.
- Helpers: createUser, deleteUser, listUsersByPrefix, roleExists, grantApiKeyForUser.

Custom readonly role widened to feature_actions.read so that listing AI connectors (a read op the MCP app uses) succeeds for read-only users and matches viewer's behavior, keeping the two readonly paths aligned.
Restructures docs/permissions.md to make the built-in path the default recommendation:
- Quickstart leads: assign editor (or viewer) + a small companion role carrying cluster `monitor` and index privileges on the alert/log patterns. No Kibana feature toggles needed: editor/viewer cover the whole MCP-app surface (SIEM, Cases, Timeline, Notes, Rules, Alerts, AI Assistant, Attack Discovery, Actions/Connectors). Eliminates the feature_*V<N> version churn that plagued the custom-role path.
- Advanced section retained for scripted/IaC users: full custom-role JSON unchanged (versioned feature tables for 9.0-9.4+), with a simplified two-step API key recipe (create user, mint key; no role_descriptors needed since the key inherits user privileges).
- Read-only role widened to grant Actions and Connectors (Read) so it aligns with viewer and lets the MCP app list AI connectors.
- Stateful (self-managed / ESS) only; serverless variant deferred.
Verifies the .internal.alerts-security.attack.discovery.alerts-default-* backing-index pattern PR #18 added to the role definitions. Until now the pattern was unverified: acknowledgeAlerts covered the alerts data stream's backing index, but the discovery data stream is separate.
- ensureDiscoveryFixture() in preflight: reuses an existing discovery if present, otherwise seeds a minimal one via _bulk. Captures discoveryId in SeedFixtures.
- New operationCheck calls _update_by_query directly via esRequest, bypassing the production acknowledgeDiscoveries() helper which silently catches per-index errors (a 403 there returns updated:0 instead of throwing, masking the privilege gap).
- expect: { full: ok, readonly: 403 } for asserted roles; quickstart variants inherit the same profile and pass on stateful 9.5.

Tamper test (drop the discovery backing-index pattern, re-run --role full) flips the check to 403 with the error message naming .internal.alerts-security.attack.discovery.alerts-default-000001 exactly. Privilege confirmed load-bearing; appendix already lists it.
Reuses H1's discoveryId fixture to drive two adjacent attack-discovery operations that until now had no test coverage:
- assessConfidence: runs ES|QL across .alerts-security.alerts-* and risk-score.risk-score-latest-* to build a confidence score. The risk-score read pattern is in DATA_INDICES but was never exercised.
- getDiscoveryDetail: pulls discovery alert details via ES|QL on .alerts-security.alerts-*.

Both expected ok/ok for full and readonly. Tests synthesize a minimal AttackDiscovery from the seeded discoveryId + a real alertId so the inner queries actually run (the helpers return early when alertIds is empty).

Smoke: --role all → 86 passed, 0 failed (8 new check executions, 4 roles x 2 ops).
Bypasses cleanupSampleData() (silently catches per-index errors) by calling _delete_by_query directly on two representative data streams:
- cleanupSampleDataLogs: logs-endpoint.events.process-default. Would surface a backing-index gap if `.ds-logs-*` privileges were needed, same shape as the issue PR #18 fixed for alerts.
- cleanupSampleDataAlerts: .alerts-security.alerts-default. Companion to acknowledgeAlert (which uses _update_by_query); exercises the delete arm of the same backing-index privilege.

Finding: the logs-data-stream hypothesis is refuted on 9.5 stateful. _delete_by_query works at the data-stream-alias level, so no .ds-logs-* pattern is needed in DATA_INDICES; the logs-* umbrella suffices. No doc updates required.

Smoke: --role all → 94 passed, 0 failed (8 new check executions, 4 roles x 2 ops).
Quick Start and Troubleshooting both referenced ELASTICSEARCH_API_KEY, but loadAdminBasics() actually requires ELASTIC_PASSWORD (with optional ELASTIC_USERNAME defaulting to "elastic"). Following the README literally produced the "must be set" die() message immediately. Reworded the manage_security aside to talk about the user instead of the API key, since the runner uses basic auth to bootstrap.
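The environment contract described above can be sketched as follows. This is an illustration only: `loadAdminBasicsSketch` and the `AdminBasics` shape are hypothetical stand-ins, not the real `loadAdminBasics()` from the runner, and the error text is paraphrased from the "must be set" message mentioned in the commit.

```typescript
// Hypothetical shape for the admin credentials the runner bootstraps with.
interface AdminBasics {
  username: string;
  password: string;
}

// Sketch of the documented behavior: ELASTIC_PASSWORD is required,
// ELASTIC_USERNAME is optional and defaults to "elastic".
// The env map is passed in so the function is easy to exercise in isolation.
function loadAdminBasicsSketch(env: Record<string, string | undefined>): AdminBasics {
  const password = env.ELASTIC_PASSWORD;
  if (!password) {
    // Assumed wording; the real die() message may differ.
    throw new Error("ELASTIC_PASSWORD must be set");
  }
  return { username: env.ELASTIC_USERNAME ?? "elastic", password };
}

const defaults = loadAdminBasicsSketch({ ELASTIC_PASSWORD: "changeme" });
const custom = loadAdminBasicsSketch({ ELASTIC_USERNAME: "admin", ELASTIC_PASSWORD: "s3cret" });
```

In a Node script the same function would presumably be called as `loadAdminBasicsSketch(process.env)`.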
The per-tool row listed only logs-*, but generateSampleData() also indexes alert docs to .alerts-security.alerts-<space-id> via /_bulk (see sample-data.ts:350). Custom roles built from this row alone 403 on the alert writes. Mirrored the Cleanup sample data row's index list and feature privileges so both rows agree.
Replace 'sign in as the user' instructions with POST /_security/api_key/grant calls run by an admin, since most setups won't have an interactive session as the freshly-created user.
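A request body for the admin-run grant call might look like the sketch below. The `grant_type`, `username`, `password`, and `api_key.name` fields follow the Elasticsearch Grant API key request format; the helper name `buildGrantApiKeyBody` and the example user and key names are assumptions made for illustration.

```typescript
// Body for POST /_security/api_key/grant using the password grant type.
// The admin sends this request; the resulting key belongs to `username`.
interface GrantApiKeyBody {
  grant_type: "password";
  username: string;
  password: string;
  api_key: { name: string };
}

// Hypothetical helper: no role_descriptors are set, so the key
// inherits the target user's privileges.
function buildGrantApiKeyBody(username: string, password: string, keyName: string): GrantApiKeyBody {
  return {
    grant_type: "password",
    username,
    password,
    api_key: { name: keyName },
  };
}

// Example values are placeholders, not names from the docs.
const grantBody = buildGrantApiKeyBody("mcp_app_user", "user-password", "mcp-app-key");
const grantJson = JSON.stringify(grantBody);
```

The admin then sends `grantJson` with its own credentials, so no interactive session as the freshly created user is needed.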
Adds two coordinated entries to the operationChecks matrix in scripts/test-permissions/roles.ts, both bucketed under the `alerts` group:
- `getAlertContext` exercises the function end-to-end. Picks an alert with host.name set so the parallel reads against logs-endpoint.events.process-*, logs-endpoint.events.network-*, and .alerts-security.alerts-* all dispatch (the production code short-circuits the endpoint queries when host.name is missing).
- `endpointEventsReadable` is a companion _has_privileges probe that verifies the role grants read on the endpoint-event indices. Needed because ES _search against a wildcard pattern is lenient: when the role has zero concrete indices matching the pattern, ES returns 0 hits silently instead of 403, so the function-call check above cannot detect a privilege gap on those indices. The probe closes that gap and is a reusable pattern for any future op that reads via wildcard _search (e.g. M3 getMapping, M4 getEntityDetail).

Tamper-test outcome: replacing "logs-*" with "logs-foo-*" in DATA_INDICES correctly fails `endpointEventsReadable` for both `full` and `readonly` (and their quickstart variants) with an error naming both `logs-endpoint.events.process-*` and `logs-endpoint.events.network-*`. The `getAlertContext` check itself stayed green during the tamper run, confirming the wildcard _search leniency motivating the probe. Restoring DATA_INDICES to "logs-*" returns both checks to green for all four asserted role identities (full, readonly, quickstart_full, quickstart_readonly): 102 passed, 0 failed.

The "View alert context" appendix row in docs/permissions.md already correctly lists read on .alerts-security.alerts-<space-id>, logs-endpoint.events.process-*, and logs-endpoint.events.network-* plus Security (Read) + Alerts (Read), so no doc edit was needed.
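The `_has_privileges` probe pattern can be sketched roughly as below. The request and response shapes follow the Elasticsearch Has privileges API (`POST /_security/user/_has_privileges`); the helper names `buildEndpointEventsProbe` and `deniedIndices` are hypothetical, the sample response is hand-written for illustration, and the runner's actual check may be structured differently.

```typescript
// Subset of the Has privileges request body used by this sketch.
interface HasPrivilegesRequest {
  index: { names: string[]; privileges: string[] }[];
}

// Subset of the Has privileges response used by this sketch.
interface HasPrivilegesResponse {
  has_all_requested: boolean;
  index: Record<string, Record<string, boolean>>;
}

// Ask whether the calling key can read both endpoint-event patterns.
function buildEndpointEventsProbe(): HasPrivilegesRequest {
  return {
    index: [
      {
        names: ["logs-endpoint.events.process-*", "logs-endpoint.events.network-*"],
        privileges: ["read"],
      },
    ],
  };
}

// Return the index patterns for which at least one requested privilege was denied.
function deniedIndices(res: HasPrivilegesResponse): string[] {
  return Object.keys(res.index).filter((name) => {
    const privs = res.index[name];
    return Object.keys(privs).some((p) => !privs[p]);
  });
}

// Hand-written sample response: read granted on process events, denied on network events.
const sampleResponse: HasPrivilegesResponse = {
  has_all_requested: false,
  index: {
    "logs-endpoint.events.process-*": { read: true },
    "logs-endpoint.events.network-*": { read: false },
  },
};

const denied = deniedIndices(sampleResponse);
```

Unlike a wildcard `_search`, which returns zero hits when nothing matches, this probe reports the denial explicitly, so the check can fail on a privilege gap.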
Adds a `getMapping` operation to the permissions test matrix and fixes the load-bearing privilege gap the new check immediately surfaced. The check calls `getMapping(f.alertIndex)`, exercising `indices:admin/mappings/get`. Both `full` and `readonly` are expected to succeed.

Discovery via TDD: the new check correctly failed for `full`, `readonly`, AND `quickstart_readonly` in the first untampered run. Investigation showed the failing privilege is `view_index_metadata`, NOT `monitor`; the two are independent privileges in Elasticsearch (see https://www.elastic.co/guide/en/elasticsearch/reference/current/security-privileges.html). `monitor` covers `indices:monitor/*` (used by `_cat/indices`), but `_mapping` dispatches `indices:admin/mappings/get`, which is gated by `view_index_metadata`. `quickstart_full` previously passed only because the built-in `editor` Kibana role's auto-generated descriptors already grant `view_index_metadata` on `.internal.alerts-*`; `viewer` does not, so `quickstart_readonly` failed too.

Fix: add `view_index_metadata` to the index privileges list of all four asserted role descriptors (`fullRole`, `readonlyRole`, and both `QUICKSTART_COMPANION_DESCRIPTORS`). With the fix in place all 106 checks pass; without it, `getMapping` returns 403 cleanly for the roles that lack `view_index_metadata`.

Doc updates (`docs/permissions.md`):
- Quickstart companion role tables (full + readonly): add `view_index_metadata` to every row, and rewrite the explanatory note to call out that `monitor` and `view_index_metadata` are separate privileges.
- Advanced custom-role privilege bundles ("System / alert indices", "Data indices", "All index patterns") and the corresponding Dev Tools `PUT /_security/role/...` JSON snippets: add `view_index_metadata` to every privileges array.
- Per-tool appendix Threat Hunt "Field mappings" row: change `monitor` to `view_index_metadata` for the index-privileges column.
TDD-loop outcomes:
- Tamper 1 (drop `monitor` per the handover): getMapping AND listIndices fail for `full`/`readonly`; `quickstart_full` passes, `quickstart_readonly` fails on getMapping. Surfaced the real privilege model (see Discovery above).
- Restore + add `view_index_metadata`: all 106 checks pass for all four asserted role identities.
- Tamper 2 (drop `view_index_metadata` from `fullRole`/`readonlyRole` only, leave companions intact): getMapping fails for `full`/`readonly` with 403 (listIndices unaffected); both quickstart variants pass. Clean isolation, privilege-driven failure confirmed.
- Final restore + re-run: 106 passed, 0 failed, 0 skipped across `full`, `readonly`, `quickstart_full`, `quickstart_readonly`.
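Because `monitor` and `view_index_metadata` are independent, a role descriptor can be linted for both before it is provisioned. The sketch below is one way to do that, under assumptions: the `RoleDescriptor` shape mirrors the `cluster`/`indices` fields of an Elasticsearch role body, `missingIndexPrivileges` is a hypothetical helper, the example descriptor is invented for illustration, and the check is a literal string comparison that does not account for broader privileges such as `all` implying narrower ones.

```typescript
// Minimal subset of an Elasticsearch role body.
interface IndexGrant {
  names: string[];
  privileges: string[];
}

interface RoleDescriptor {
  cluster: string[];
  indices: IndexGrant[];
}

// Hypothetical lint: list every index grant that lacks one of the required privileges.
function missingIndexPrivileges(role: RoleDescriptor, required: string[]): string[] {
  const gaps: string[] = [];
  for (const grant of role.indices) {
    for (const priv of required) {
      if (grant.privileges.indexOf(priv) === -1) {
        gaps.push(`${grant.names.join(",")} lacks ${priv}`);
      }
    }
  }
  return gaps;
}

// Invented example: one grant carries both privileges, the other only `monitor`.
const exampleRole: RoleDescriptor = {
  cluster: ["monitor"],
  indices: [
    { names: [".alerts-security.alerts-*"], privileges: ["read", "monitor", "view_index_metadata"] },
    { names: ["logs-*"], privileges: ["read", "monitor"] },
  ],
};

const gaps = missingIndexPrivileges(exampleRole, ["monitor", "view_index_metadata"]);
```

A lint like this would flag the `logs-*` grant in the example, the kind of gap that made `getMapping` return 403 while `listIndices` kept passing.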
Kibana's bulk endpoints (e.g. /api/detection_engine/rules/_bulk_action) return HTTP 500 at the top level when an individual operation is denied, with the per-item 403 buried inside the response body as `"status_code":403` under `attributes.errors[]`. The existing regex set in isPermissionDenied() only matched `"status":403` (the bulk-API inline-error shape), so any op going through one of these endpoints was classified `other` instead of `403` and broke its expectation. Adds one regex to cover the embedded form. Surfaced while wiring the upcoming bulkAction check; benefits any future op hitting a Kibana bulk endpoint.
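The classification change can be sketched as below. This is an assumption-laden illustration rather than the actual `isPermissionDenied()` source: the function name carries a `Sketch` suffix, the real regex set may contain more patterns, and the sample bodies are hand-written to match the two shapes described above.

```typescript
// Patterns that mark a response body as carrying a permission denial.
const PERMISSION_DENIED_PATTERNS: RegExp[] = [
  /"status"\s*:\s*403/, // bulk-API inline-error shape
  /"status_code"\s*:\s*403/, // per-item error embedded under attributes.errors[]
];

// Treat a top-level 403 or any embedded 403 marker as a permission denial.
function isPermissionDeniedSketch(httpStatus: number, body: string): boolean {
  if (httpStatus === 403) return true;
  return PERMISSION_DENIED_PATTERNS.some((re) => re.test(body));
}

// Hand-written sample: top-level 500 with the per-item 403 buried in the body.
const kibanaBulkBody = JSON.stringify({
  status_code: 500,
  attributes: { errors: [{ status_code: 403, message: "denied" }] },
});

// Hand-written sample: a genuine server error with no 403 anywhere.
const genuineFailureBody = JSON.stringify({ status_code: 500, message: "boom" });

const embeddedDenied = isPermissionDeniedSketch(500, kibanaBulkBody);
const genuineFailure = isPermissionDeniedSketch(500, genuineFailureBody);
```

With only the first pattern, the Kibana bulk sample would not match and the op would be classified `other`; the second pattern is what moves it to `403`.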
Adds a `bulkAction` entry to the test-permissions matrix to assert that `feature_securitySolutionRulesV4.all` actually grants the `_bulk_action` action on Kibana's detection-engine API. Uses `action: "duplicate"`, the only non-destructive option that exercises the write privilege, and inline-deletes the created `[Duplicate]` rule(s) so successful runs are idempotent. countLeftoverTaggedResources() filters on tags:"mcp-app-test", which the duplicate inherits only if the source rule had it (it usually doesn't), so the inline cleanup is the self-healing path.

Tamper test: changing fullRole's `feature_securitySolutionRulesV4.all` to `.read` causes bulkAction to fail with 403 alongside createRule and patchRule, confirming the privilege requirement matches the doc. The quickstart_full variant (built-in `editor`) was unaffected by the tamper, isolating the privilege correctly. Restored run: 110 passed, 0 failed across full, readonly, quickstart_full, quickstart_readonly.

Doc: added an explicit "Bulk action on rules (`_bulk_action`)" row to the Detection Rules section of the per-tool appendix in docs/permissions.md. Same privilege requirements as the existing single-rule write ops, but the API endpoint deserves discoverability.
Adds the last two ship-blocking checks for PR #18, covering the exception-list sub-feature of the Rules feature:
- listExceptions: GET /api/exception_lists/items/_find. Read on the exception-list-management namespace.
- addException: POST /api/detection_engine/rules/{rule_id}/exceptions. Write under the detection-engine namespace.

Tamper test (feature_securitySolutionRulesV4.all → .read on fullRole): addException flips to 403 alongside createRule/patchRule/bulkAction, confirming it requires .all. listExceptions continues to pass under .read, confirming it's truly read-only and shares the Rules.read scope; no separate sub-feature privilege gates exception listing today. Quickstart variants (built-in editor/viewer) pass through the tamper unchanged as a positive control.

Fixture: introduces ensureExceptionListFixture() in runner preflight, mirroring ensureDiscoveryFixture from H1. Creates a transient detection-type exception list (list_id: mcp-app-test-list-<hex>) so the checks have a deterministic fixture across environments. The plan's original endpoint_list assumption fails on vanilla clusters where the Endpoint Security integration isn't enabled (404 from _find classifies the check as "no fixture", privilege never exercised). Cleanup tears the list down at end of run alongside the role/key teardowns, including the SIGINT path. Per-item cleanup of created exception items happens inline in the addException check.

Doc: added "List exceptions" row to the Detection Rules per-tool appendix in docs/permissions.md (Security Read, Rules and Exceptions Read). Existing "Add exceptions" row at line 427 verified accurate.
Two doc additions surfaced by the PR #18 review thread (Leandro Pereira + Ameer Mukadam exercising docs/permissions.md on real self-managed clusters):
- README.md: add a "permissions" paragraph to the Quick Start tip block. Links to docs/permissions.md, recommends the Quickstart path, names the built-in `editor` / `viewer` roles directly. Closes the loop on James's request from 3 weeks ago to surface the permissions doc from the README.
- docs/permissions.md: add a hedged Troubleshooting entry for the deprecated SIEM feature gap. On some Kibana versions, the privilege actions for `api:rules-read` haven't fully migrated from `feature_siemV4` to its replacements; granting the documented `feature_securitySolutionRulesV4.read` alone returns 403. Workaround: also grant `feature_siemV4.read` (and `.all` for the full-featured role). The note is deliberately version-neutral until we can reproduce; full investigation tracked separately.
…-permissions # Conflicts: # src/elastic/alerts.ts
KDKHD
reviewed
May 11, 2026
| #### Dev Tools: Create the role |
Member
Maybe these dev-tool commands should go at the top of this file. Creating a new role in the UI, with permissions for these indices, is much more tedious than running this command in dev tools.
Collaborator
Author
makes sense, will add the section to the very top with the Dev Console request, thanks!
KDKHD
reviewed
May 12, 2026
```bash
# Make sure .env has ELASTICSEARCH_URL, KIBANA_URL, ELASTIC_PASSWORD
# (and ELASTIC_USERNAME if not the default "elastic").
```
Member
The MCP app uses an API key for auth. Could we keep it consistent and use the same env variables? Then users don't need to maintain two sets of credentials.
Collaborator
Author
The script is used for development and needs admin-level credentials that are distinct from the runtime MCP key:
- It calls `PUT /_security/role/...` (needs `manage_security`), `POST /_security/api_key/grant` with `grant_type: password` (acts on behalf of a user), and seeds sample data into `logs-*` and `.alerts-security.alerts-*`.
- The MCP `ELASTICSEARCH_API_KEY` is by design least-privilege over those alert/case indices and does not carry `manage_security`. So even if we accepted the same env var name, users would have to maintain a second, more privileged key under it to run the test runner.
- Password-based `api_key/grant` is also the cleanest local-dev path because it works out of the box with `elastic:changeme` and produces user-scoped keys, which is what we want for testing.
KDKHD
previously approved these changes
May 12, 2026
…-permissions # Conflicts: # README.md # docs/setup-claude-code.md # docs/setup-cursor.md # docs/setup-vscode.md
KDKHD
approved these changes
May 13, 2026
Summary
Closes #13
- `docs/permissions.md` with least-privilege role definitions for the MCP app
- `PUT /_security/role` and `POST /_security/api_key`

Test plan
- `PUT /_security/role/mcp_app_full` command works on a 9.4+ cluster
- `POST /_security/api_key` with full-featured role descriptors creates a working key