
fix: set final statusMessage before storeTaskResult (#638)#677

Closed
jirispilka wants to merge 9 commits into master from claude/investigate-issue-638-eFPfG

Conversation

Collaborator

@jirispilka jirispilka commented Apr 15, 2026

Summary

  • storeTaskResult() does not accept a statusMessage parameter, so it leaves the previous intermediate progress message on the task after completion
  • The fix calls updateTaskStatus(taskId, 'working', '<tool>: completed|failed|payment required') immediately before each storeTaskResult() so the final message is in place before the terminal transition
  • Verified end-to-end via mcpc: tasks/list now shows "apify/rag-web-browser: completed" instead of the stale "Starting the crawler."
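
The ordering the fix enforces can be sketched with a minimal stand-in for the task store (the interfaces and function bodies below are illustrative assumptions, not the project's real implementations):

```typescript
// Minimal in-memory stand-ins for the real task store (illustrative only).
interface Task {
    status: string;
    statusMessage?: string;
    result?: unknown;
}

const tasks = new Map<string, Task>();

function updateTaskStatus(taskId: string, status: string, statusMessage: string): void {
    const task = tasks.get(taskId) ?? { status };
    tasks.set(taskId, { ...task, status, statusMessage });
}

// Note: no statusMessage parameter — whatever message is already on the
// task survives the terminal transition. This is the root cause of #638.
function storeTaskResult(taskId: string, result: unknown): void {
    const task = tasks.get(taskId);
    if (!task) throw new Error(`unknown task ${taskId}`);
    tasks.set(taskId, { ...task, status: 'completed', result });
}

// The fix: set the final message right before storing the result.
updateTaskStatus('t1', 'working', 'apify/rag-web-browser: completed');
storeTaskResult('t1', { ok: true });

console.log(tasks.get('t1')?.statusMessage); // prints "apify/rag-web-browser: completed"
```

Without the intermediate updateTaskStatus() call, the completed task would still carry whatever progress message was set last (e.g. "Starting the crawler.").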

Test plan

  • Unit tests: npm run test:unit (2 regression cases in task.statusMessage.test.ts)
  • mcpc smoke test: call apify/rag-web-browser and check that the tasks/list statusMessage shows "<tool>: completed"

🤖 Generated with Claude Code

claude and others added 9 commits April 14, 2026 07:25
storeTaskResult() has no statusMessage parameter, so the last
intermediate progress message (e.g. "Starting the crawler.") persisted
as the final statusMessage on completed/failed tasks. Call
updateTaskStatus() with a descriptive final message right before each
storeTaskResult() call.

https://claude.ai/code/session_01ABYjPsZxtCdib54jfGcx3f

* fix: strip all invalid HTTP header characters in sanitizeEnvValue

The previous regex only removed \r\n but Node.js rejects all control
characters outside [\t\x20-\x7e\x80-\xff] with ERR_INVALID_CHAR.
CI secrets can contain null bytes or other control chars that slipped
through.

https://claude.ai/code/session_01Kj7rEHNuhTG5U7NrM9RWaG
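
A minimal sketch of such a filter, assuming the allowed set is exactly the header character range named above (this is illustrative, not necessarily the project's exact regex):

```typescript
// Illustrative sketch: drop every character Node's HTTP header validation
// rejects, i.e. anything outside [\t\x20-\x7e\x80-\xff].
function sanitizeEnvValue(value: string): string {
    // eslint-disable-next-line no-control-regex
    return value.replace(/[^\t\x20-\x7e\x80-\xff]/g, '');
}

sanitizeEnvValue('abc\u0000def\r\n'); // → 'abcdef' (null byte and CRLF stripped)
sanitizeEnvValue('tab\tok');          // → 'tab\tok' (tab is allowed)
```

A blocklist of control characters (as the later commit adopts) expresses the same set more readably than this allowlist form.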

* fix: sanitize env vars in-place so third-party libs get clean values

The OTel OTLP exporter (used by @arizeai/phoenix-otel) reads
PHOENIX_API_KEY directly from process.env via getEnvApiKey(), bypassing
our sanitizeEnvValue() wrapper. It then passes the raw value to
node:http which throws ERR_INVALID_CHAR.

Add sanitizeProcessEnv() that rewrites sensitive env vars on
process.env itself, called right after dotenv.config(). This ensures
every reader — including third-party libraries — gets clean values.

https://claude.ai/code/session_01Kj7rEHNuhTG5U7NrM9RWaG

* fix: simplify regex, improve comments, add redacted env logging

- Replace cryptic allowlist regex with readable control-char blocklist
- Explain why in-place sanitization is needed (phoenix-otel reads env directly)
- Log redacted env var values during sanitizeProcessEnv() for CI debugging

https://claude.ai/code/session_01Kj7rEHNuhTG5U7NrM9RWaG

* fix: simplify tests for sanitizeEnvValue and sanitizeProcessEnv

* docs: update DEVELOPMENT.md to include LLM evals

* fix: improve value redaction logic for safer logging

---------

Co-authored-by: Claude <noreply@anthropic.com>

* ci: add post-publish smoke test for npm package

Installs the published package from npm in an isolated directory and
verifies it works using mcpc: library imports, server startup, ping,
tools-list, and tool calls (search-apify-docs, fetch-apify-docs).

Uses --tools docs mode (no APIFY_TOKEN needed). Runs after both beta
(on_master) and stable (manual_release_stable) publishes.

https://claude.ai/code/session_01W3QpQDWu44fBWmKsd7rA5Q

* ci: add temporary debug workflow for smoke test (remove before merge)

https://claude.ai/code/session_01W3QpQDWu44fBWmKsd7rA5Q

* ci: clarify file deletion reminder

https://claude.ai/code/session_01W3QpQDWu44fBWmKsd7rA5Q

* ci: simplify smoke test workflow

https://claude.ai/code/session_01W3QpQDWu44fBWmKsd7rA5Q

* ci: enhance smoke test workflow to support exact version installation

* ci: update smoke test to use version 0.9.17

* ci: update retry action to version 4

* ci: reduce timeout and max attempts for package installation; add additional fetch commands

---------

Co-authored-by: Claude <noreply@anthropic.com>
#671)

* fix: Use bin entry point in smoke test instead of direct JS path

The smoke test was running `node node_modules/.../dist/stdio.js` directly,
bypassing the package's bin entry point. Switch to `npx actors-mcp-server`
so we actually test the published package as users would use it.

Includes a temporary workflow_dispatch copy for testing in PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: trigger test workflow on push (delete later)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: Use environment variable for package version in smoke test workflow

* chore: remove temp smoke test

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The third test asserted that a stale intermediate statusMessage is preserved
through storeTaskResult — which tests InMemoryTaskStore internals, not the
bug. The two remaining cases are sufficient regression coverage.
Comment on lines 129 to 131
bump_dependency_in_internal_repo:
    name: Bump dependency in apify-mcp-server-internal
    needs: [ release_metadata, publish_to_npm ]

🔴 The new smoke_test job was added to gate releases, but bump_dependency_in_internal_repo still has needs: [release_metadata, publish_to_npm], so it runs in parallel with the smoke test rather than after it. If the smoke test fails (broken npm package), the internal-repo dependency bump has already started and will create a PR pinning the broken version. Adding smoke_test to the needs list of bump_dependency_in_internal_repo fixes this.

Extended reasoning...

What the bug is and how it manifests

The PR introduces a new smoke_test job (lines 122-127 in manual_release_stable.yaml) that installs the just-published npm package and runs a series of mcpc commands to verify it is functional. The intent is that this acts as a quality gate before the release is propagated to the internal repository. However, bump_dependency_in_internal_repo (line 129) has the same needs: [release_metadata, publish_to_npm] as smoke_test, so GitHub Actions schedules both jobs to run concurrently as soon as publish_to_npm finishes.

The specific code path that triggers it

In .github/workflows/manual_release_stable.yaml, after publish_to_npm completes, GitHub Actions fans out to any job whose needs list is fully satisfied. Both smoke_test and bump_dependency_in_internal_repo only require [release_metadata, publish_to_npm], so they both start simultaneously. If the smoke test catches a broken release and fails, the bump job is already mid-execution: checking out the internal repo, running npm install, and opening a PR.

Why existing code does not prevent it

There is no needs: smoke_test dependency on bump_dependency_in_internal_repo. GitHub Actions job ordering is determined solely by the needs graph; there is no implicit sequencing. The smoke_test job cannot cancel a sibling job that is already running.

What the impact would be

If a broken version is caught by the smoke test, bump_dependency_in_internal_repo will have already opened a pull request in apify/apify-mcp-server-internal pinning the broken version. Although this is only a PR (not an auto-merge), it creates noise, risks accidental merges by reviewers who trust CI to gate quality, and defeats the entire purpose of adding the smoke test quality gate.

Step-by-step proof

  1. Developer triggers manual_release_stable.yaml with release type patch.
  2. release_metadata runs and outputs version_number = 0.9.19.
  3. publish_to_npm runs and publishes a broken 0.9.19 to npm (e.g., startup crash).
  4. GitHub Actions marks publish_to_npm as succeeded (the publish itself worked).
  5. Both smoke_test AND bump_dependency_in_internal_repo become runnable and start in parallel.
  6. bump_dependency_in_internal_repo checks out apify-mcp-server-internal, installs @apify/actors-mcp-server@0.9.19, and opens a PR titled "chore: bump @apify/actors-mcp-server to 0.9.19".
  7. smoke_test installs 0.9.19, runs npx mcpc against it, gets a crash, and fails — but the PR in the internal repo is already open.

How to fix it

Change the needs list of bump_dependency_in_internal_repo to include smoke_test:

bump_dependency_in_internal_repo:
    name: Bump dependency in apify-mcp-server-internal
    needs: [ release_metadata, publish_to_npm, smoke_test ]

This ensures the bump only runs after the smoke test has confirmed the published package is functional. Note that on_master.yaml correctly adds smoke_test as a leaf job with no downstream dependents, so that workflow does not have this issue.

Comment on lines +31 to +35
with:
    timeout_minutes: 5
    max_attempts: 3
    retry_wait_seconds: 30
    command: |

🟡 The smoke test installs @apify/mcpc without a version pin (npm install "@apify/actors-mcp-server@${PKG_VERSION}" @apify/mcpc), so it always installs the latest published version. If @apify/mcpc releases a breaking change (renamed CLI flags, changed output format), the smoke test would fail even when actors-mcp-server itself is perfectly healthy, creating false CI failures that could block releases. Pinning to a known-good version (e.g. @apify/mcpc@0.2.0 or @apify/mcpc@^0.2.0) would make the test deterministic.

Extended reasoning...

What the bug is and how it manifests

In .github/workflows/_smoke_test_npm_package.yaml (line 35), the install command is:

npm install "@apify/actors-mcp-server@${PKG_VERSION}" @apify/mcpc

@apify/mcpc has no version constraint, so npm always resolves and installs the latest published version. This means the smoke-test environment is non-deterministic: different CI runs can silently pull in different versions of the test harness.

The specific code path that triggers it

The smoke test creates a fresh smoke-test/ directory, runs npm init -y, and installs both packages from scratch. Because @apify/mcpc is not pinned (unlike the main package under test, which is pinned to ${PKG_VERSION}), any new release of @apify/mcpc published between two CI runs will be picked up automatically. The subsequent smoke commands—npx mcpc connect, npx mcpc @dev ping, npx mcpc --json @dev tools-list, etc.—all depend on the CLI API of whichever version was installed.

Why existing code does not prevent it

The project's own package.json pins @apify/mcpc at ^0.2.0 in devDependencies, demonstrating that the team already values version stability for this tool. However, the smoke-test workflow bypasses package.json entirely: it sets up a standalone directory with its own package.json (via npm init -y) and installs packages directly. The retry mechanism (max_attempts: 3) mitigates transient npm registry failures but does nothing about determinism across versions.

What the impact would be

If @apify/mcpc 0.3.0 (hypothetically) renames the connect subcommand, changes the --json flag semantics, or alters the tools-list output format, the smoke test would fail at step 2 or 3—even though @apify/actors-mcp-server itself is perfectly healthy. This would block the beta or stable release pipeline. Because the failure comes from the test harness rather than the package under test, it could be confusing to diagnose, especially in time-sensitive release windows.

How to fix it

Pin @apify/mcpc to the same range already used in devDependencies:

npm install "@apify/actors-mcp-server@${PKG_VERSION}" "@apify/mcpc@^0.2.0"

Or use an exact version for maximum reproducibility:

npm install "@apify/actors-mcp-server@${PKG_VERSION}" "@apify/mcpc@0.2.0"

The refutation argument—that @apify/mcpc is an Apify-owned package and won't introduce uncoordinated breaking changes—is reasonable and lowers the practical risk. However, the same logic would argue against pinning anything, which contradicts the team's existing practice. Even for internal tools, a pinned test harness version is a best-practice that prevents CI surprise failures and makes version-upgrade decisions explicit and intentional.

Comment thread: evals/shared/config.ts
Comment on lines +50 to +60
/**
 * Redact a value for safe logging: shows first 4 and last 4 chars, masks the rest.
 * Fully masks short values (≤ 8 chars) to prevent reconstruction from the log line.
 * Returns '(empty)' for empty strings, '(unset)' for undefined/null.
 */
function redact(value?: string | null): string {
    if (value == null) return '(unset)';
    if (value.length === 0) return '(empty)';
    if (value.length <= 6) return `*** (${value.length} chars)`;
    return `${value.slice(0, 3)}***${value.slice(-3)} (${value.length} chars)`;
}

🟡 The JSDoc for the new redact() function claims it "shows first 4 and last 4 chars" and "Fully masks short values (≤ 8 chars)", but the implementation uses slice(0,3)/slice(-3) (3 chars each side) and masks values with length <= 6. For a security-adjacent logging helper, the docs should accurately reflect how much of a secret is revealed.

Extended reasoning...

What the bug is: The JSDoc comment for the newly-introduced redact() function (evals/shared/config.ts, lines 50–60) contains two factual errors relative to the implementation. (1) The comment says "shows first 4 and last 4 chars" but the code executes value.slice(0, 3) and value.slice(-3), exposing only 3 characters from each end. (2) The comment says "Fully masks short values (≤ 8 chars)" but the guard is value.length <= 6, not <= 8.

The specific code path: The function is called inside sanitizeProcessEnv() to produce redacted log lines such as env PHOENIX_API_KEY: abc***xyz (20 chars). Reviewers reading only the JSDoc would believe secrets shorter than 9 characters are fully masked, and that exactly 4 characters are revealed on each side—neither is accurate.

Why existing code doesn't prevent it: There are no automated tests that assert the format of the redacted output (only that sanitizeProcessEnv modifies process.env correctly). The mismatch slipped through because the function is private and the JSDoc was simply never updated to match the final implementation.

Impact: No functional impact—the actual redaction logic is self-consistent and secrets are still protected. However, the incorrect documentation could mislead a security reviewer auditing information-leakage guarantees of the CI log output. For example: a 7-character API key fragment would expose 6 of its 7 characters (3+3), whereas the JSDoc implies it would be fully masked (≤ 8 chars threshold).

Step-by-step proof: Take an 8-character value like sk-12345 (length = 8). According to the JSDoc it should be fully masked ("≤ 8 chars"). The code checks value.length <= 6—8 > 6, so it falls through to the reveal path and returns sk-***345 (8 chars), exposing 6 of 8 characters. Similarly, for a 10-character value the JSDoc implies 4+4=8 characters are visible, but the code reveals 3+3=6 characters.

How to fix: Correct the JSDoc to match the implementation: change "first 4 and last 4 chars" to "first 3 and last 3 chars" and change "≤ 8 chars" to "≤ 6 chars".
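
The mismatch can be confirmed by running the function exactly as committed (JSDoc omitted here; the return values below follow directly from the code):

```typescript
// redact() as committed in evals/shared/config.ts — the JSDoc (not shown)
// claims 4+4 revealed chars and a ≤ 8 full-mask threshold, but the code
// reveals 3+3 and masks only at length ≤ 6.
function redact(value?: string | null): string {
    if (value == null) return '(unset)';
    if (value.length === 0) return '(empty)';
    if (value.length <= 6) return `*** (${value.length} chars)`;
    return `${value.slice(0, 3)}***${value.slice(-3)} (${value.length} chars)`;
}

redact('sk-12345'); // → 'sk-***345 (8 chars)' — not fully masked despite the "≤ 8 chars" claim
redact('abc123');   // → '*** (6 chars)' — full mask only at length ≤ 6
redact('');         // → '(empty)'
redact(null);       // → '(unset)'
```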

@jirispilka
Collaborator Author

Closing in favor of #680 — same fix on a clean branch rebased on current master (the old branch had 7 stale commits from an older master state that muddled the diff).

@jirispilka jirispilka closed this Apr 15, 2026
@jirispilka jirispilka deleted the claude/investigate-issue-638-eFPfG branch April 15, 2026 14:32