Added: Dashboard with Vercel config #142
Conversation
caohy1988
left a comment
Full review on head 4705d2d.
I also prepared a connector fix, but I could not push directly to this PR's source branch: GitHub returned `Permission to Saherpathan/BigQuery-Agent-Analytics-SDK.git denied to caohy1988`. I pushed the fix commit to my fork instead:
- Branch: `caohy1988:fix/pr142-dashboard-connector`
- Commit: `05aa9f0` fix(dashboard): harden BigQuery data connector
That patch does four things: supports Vercel service-account env vars and local ADC / `GOOGLE_APPLICATION_CREDENTIALS`, validates Project/Dataset/Table identifiers before building SQL, normalizes BQ AA rows into the shape the UI expects (`id`, `parent_id`, `type`, `label`, `latency`, `total_tokens`), and fixes the local runbook to use `npx vercel dev` for `/api`.
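A minimal sketch of the credential-fallback part of that patch (options-building logic only; `GCP_PROJECT_ID` and the newline handling are assumptions, not the patch's actual code):

```typescript
// Build BigQuery client options: pass explicit service-account credentials
// only when BOTH env vars are set; otherwise return no credentials at all
// so the Google client falls back to ADC / GOOGLE_APPLICATION_CREDENTIALS.
interface BigQueryClientOptions {
  projectId?: string;
  credentials?: { client_email: string; private_key: string };
}

function buildClientOptions(
  env: Record<string, string | undefined>
): BigQueryClientOptions {
  const { GCP_PROJECT_ID, GCP_CLIENT_EMAIL, GCP_PRIVATE_KEY } = env;
  if (GCP_CLIENT_EMAIL && GCP_PRIVATE_KEY) {
    return {
      projectId: GCP_PROJECT_ID,
      credentials: {
        client_email: GCP_CLIENT_EMAIL,
        // Vercel env vars often store the key with escaped newlines.
        private_key: GCP_PRIVATE_KEY.replace(/\\n/g, "\n"),
      },
    };
  }
  // No partial credentials object: let the client use ADC.
  return GCP_PROJECT_ID ? { projectId: GCP_PROJECT_ID } : {};
}
```

The resulting object would be spread into `new BigQuery(...)`; the key point is never constructing a partial `credentials` object.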
Findings:
- High — `/api` builds SQL from unsanitized request headers. `dashboard_v2/api/index.ts` interpolates `x-gcp-project-id`, `x-bq-dataset`, and `x-bq-table` directly into a backticked table path. A malformed header containing a backtick can break out of the identifier and alter the query. Even if this is intended as BYO table input, the server must validate identifiers before composing SQL. The patch branch adds strict identifier validation and returns `400` for invalid table refs.
- High — the current BigQuery connector breaks local ADC and some Vercel env setups. The module eagerly constructs `new BigQuery({ projectId, credentials: { client_email, private_key } })` even when `GCP_CLIENT_EMAIL`/`GCP_PRIVATE_KEY` are absent. That prevents the local `.env` guidance (`GOOGLE_APPLICATION_CREDENTIALS=...`) from working reliably, because the client is given a partial credentials object instead of falling back to ADC. The patch branch only passes explicit credentials when both service-account env vars are set; otherwise it lets the Google client use ADC / `GOOGLE_APPLICATION_CREDENTIALS`.
- Medium — the API returns raw BQ AA rows, but the UI expects graph-friendly fields that do not exist in `agent_events`. `TraceTree` keys nodes by `node.id || node.trace_id`, edges by `parent_id`, and labels/types by `type`/`label`; BQ AA rows use `span_id`, `parent_span_id`, `event_type`, `agent`, JSON `content`, JSON `attributes`, and JSON `latency_ms`. As a result, data can load successfully but render as disconnected `node-N` entries with zero tokens/latency. The patch branch normalizes rows server-side so the existing components can render real BQ AA telemetry.
- Medium — local setup docs say `npm run dev`, but that starts only Vite and does not run the Vercel `/api` function. The README's local path sends users to `localhost:5173`, where `/api` is not backed by the serverless function. For the BigQuery connector to run locally, the documented path should be `npx vercel dev` (usually `localhost:3000`), or the Vite dev server needs an explicit API proxy/server. The patch branch updates the runbook and keeps `npm run dev` documented as UI-only.
- Medium — the public deployment lets any visitor query any table the deployment service account can read. The frontend sends Project/Dataset/Table in headers and the backend trusts them. Even after identifier validation, a public Vercel URL plus a broadly permissioned service account becomes a table browser for every BigQuery table that service account can access. For a public demo, either restrict the allowed project/dataset/table via an env allowlist, or make the deployment private/authenticated. This can be a follow-up if the service account is intentionally scoped to one demo table only, but the security model should be explicit in the README.
- Low — `package.json` has a dead `start` script. `"start": "node server.cjs"` references a file that does not exist. The patch branch changes it to `vite preview --host=0.0.0.0`.
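The server-side row normalization described in the Medium row-shape finding above could be sketched roughly like this (field names come from the finding; the exact JSON layouts of `attributes` and `latency_ms` are assumptions, not the patch's actual code):

```typescript
// Map a raw BQ agent_events row (span_id / parent_span_id / event_type /
// agent / JSON-encoded attributes and latency_ms) into the shape the UI
// components key on: id, parent_id, type, label, latency, total_tokens.
interface RawRow {
  span_id: string;
  parent_span_id?: string | null;
  event_type?: string;
  agent?: string;
  latency_ms?: string;  // JSON-encoded string (assumed layout)
  attributes?: string;  // JSON-encoded string (assumed layout)
}

interface UiNode {
  id: string;
  parent_id: string | null;
  type: string;
  label: string;
  latency: number;
  total_tokens: number;
}

function normalizeRow(row: RawRow): UiNode {
  let attrs: Record<string, unknown> = {};
  try { attrs = row.attributes ? JSON.parse(row.attributes) : {}; } catch { /* keep {} */ }
  let latency = 0;
  try { latency = row.latency_ms ? Number(JSON.parse(row.latency_ms)) || 0 : 0; } catch { /* keep 0 */ }
  return {
    id: row.span_id,
    parent_id: row.parent_span_id ?? null,
    type: row.event_type ?? "unknown",
    label: row.agent ?? row.span_id,
    latency,
    total_tokens: Number(attrs["total_tokens"] ?? 0),
  };
}
```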
Verification on the patch branch:
- `npm ci` -> clean, 0 vulnerabilities.
- `npm run lint` -> clean (`tsc --noEmit`).
- `npm run build` -> clean; Vite only emits the existing large-chunk warning for the dashboard bundle.
- `git diff --check` -> clean.
Thanks for the thorough review and the patch, @caohy1988! I’ve merged your fixes for the SQL sanitization, ADC support, and data normalization into this branch. Much appreciated!
Do you have an e2e demo recorded for this workflow? @Saherpathan
caohy1988
left a comment
Fresh review of PR #142 head ff4d4fe against current origin/main.
I found several issues that should block merge.
Blockers
B1 — Public API route lets any browser user query any table the Vercel service account can read
`dashboard_v2/api/index.ts` takes the table target from browser-controlled headers:

```ts
const userProject = getHeader(req, 'x-gcp-project-id');
const userDataset = getHeader(req, 'x-bq-dataset');
const userTable = getHeader(req, 'x-bq-table');
```

Then it runs:

```sql
SELECT * FROM `${userProject}.${userDataset}.${userTable}`
```

using the server-side Vercel service account. The regex validation prevents SQL injection, but it does not authorize the table. Any user who can reach a deployed dashboard can change localStorage or send headers and query any table the service account is allowed to read.
This is a confused-deputy problem. The frontend should not choose arbitrary BigQuery table refs for a privileged backend credential.
Please fix with one of these patterns before merge:
- Configure the allowed project/dataset/table server-side via env vars and ignore browser-supplied table refs.
- Or maintain a server-side allowlist of table refs / dataset prefixes.
- Or add real user authentication and authorize the requested table against that user.
For this demo dashboard, the simplest safe shape is probably env-configured table refs plus a timespan filter.
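A sketch of that env-configured shape (env var names `BQ_PROJECT_ID`/`BQ_DATASET`/`BQ_TABLE` and the exact identifier regex are hypothetical):

```typescript
// Resolve the table ref entirely server-side: ignore browser headers and
// read the allowed project/dataset/table from deployment env vars,
// validating each part before it is composed into SQL.
const IDENT = /^[A-Za-z0-9_-]{1,1024}$/; // deliberately conservative

function resolveTableRef(env: Record<string, string | undefined>): string {
  const project = env.BQ_PROJECT_ID;
  const dataset = env.BQ_DATASET;
  const table = env.BQ_TABLE;
  if (!project || !dataset || !table) {
    throw new Error("BigQuery table ref not configured server-side");
  }
  for (const part of [project, dataset, table]) {
    if (!IDENT.test(part)) throw new Error("invalid identifier: " + part);
  }
  return `\`${project}.${dataset}.${table}\``;
}
```

The API route would then build its query from `resolveTableRef(process.env)` and never read `x-gcp-project-id`/`x-bq-dataset`/`x-bq-table` at all.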
B2 — Data-source inputs do not trigger a reload after the initial failure
The command bar stores Project/Dataset/Table in localStorage:

```ts
localStorage.setItem('user_gcp_project', e.target.value);
localStorage.setItem('user_bq_dataset', e.target.value);
localStorage.setItem('user_bq_table', e.target.value);
```

But the components only refetch on `filters.timespan` / `filters.agentId`:
- `AuditLog.tsx`: `useEffect(..., [filters.timespan])`
- `FinOpsSummary.tsx`: `useEffect(..., [filters.timespan])`
- `TraceTree.tsx`: `useEffect(..., [filters.timespan, filters.agentId])`
So a first-time user lands on the dashboard, hits the Missing Configuration state, enters project/dataset/table, and the dashboard does not refetch unless they change the timespan or reload the page.
Please move the data-source values into shared React state / URL state, or add an explicit Connect/Refresh button that triggers the query after the values are entered. As written, the primary configuration flow is broken.
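A framework-agnostic sketch of the shared-state option (a tiny observable store; the real fix could equally use React context, `useSyncExternalStore`, or URL state):

```typescript
// Minimal observable store: writing a data-source value notifies every
// subscriber, so components can refetch immediately instead of waiting
// for a timespan change or a page reload.
type DataSource = { project: string; dataset: string; table: string };

function createDataSourceStore(initial: DataSource) {
  let value = initial;
  const listeners = new Set<(v: DataSource) => void>();
  return {
    get: () => value,
    set(patch: Partial<DataSource>) {
      value = { ...value, ...patch };
      listeners.forEach((fn) => fn(value)); // each subscriber refetches here
    },
    subscribe(fn: (v: DataSource) => void) {
      listeners.add(fn);
      return () => listeners.delete(fn); // unsubscribe handle
    },
  };
}
```

In a React component, subscribing inside an effect (or via `useSyncExternalStore`) would make the query re-run as soon as the command-bar inputs change.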
B3 — Agent filter is visible but not wired to the backend query
`useDashboardFilters()` tracks `agent_id`, and `CommandBar` exposes an Agent dropdown, but `fetchAgentData()` only sends `timespan`; the API route only filters on timestamp:

```sql
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL ...)
```

Selecting an agent changes the URL, but not the BigQuery result set. This is misleading for a dashboard where the first control in the command bar is Agent.
Either wire `agent_id` through as a validated query parameter and add the corresponding `AND agent = @agent_id` filter, or remove/disable the Agent filter until it is implemented.
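If the first option is chosen, the wiring could look like this sketch (`buildQuery` and the allowed agent-id shape are hypothetical; `@agent_id` is BigQuery named-parameter syntax, which the Node client accepts via `query({ query, params })`):

```typescript
// Build the SQL plus a params object for BigQuery named parameters.
// The base WHERE clause mirrors the one the route already uses; the
// agent filter is appended only when an agent is selected, after a
// shape check on the value.
const AGENT_ID = /^[A-Za-z0-9_-]{1,128}$/; // hypothetical allowed shape

function buildQuery(tableRef: string, hours: number, agentId?: string) {
  let query =
    `SELECT * FROM ${tableRef} ` +
    `WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL @hours HOUR)`;
  const params: Record<string, string | number> = { hours };
  if (agentId) {
    if (!AGENT_ID.test(agentId)) throw new Error("invalid agent id");
    query += " AND agent = @agent_id";
    params.agent_id = agentId;
  }
  return { query, params }; // pass to bigquery.query({ query, params })
}
```

Named parameters keep the agent value out of the SQL string entirely, so the shape check is defense-in-depth rather than the only barrier.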
B4 — git diff --check fails
Local check:
`git diff --check origin/main...HEAD` fails with many trailing-whitespace errors, including:

- `dashboard_v2/api/index.ts:145`
- `dashboard_v2/src/App.tsx:14`
- `dashboard_v2/src/components/CommandBar.tsx:2`
- `dashboard_v2/src/components/AuditLog.tsx:18`
- `dashboard_v2/src/components/FinOpsSummary.tsx:20`
- `dashboard_v2/src/components/TraceTree.tsx:20`
- `dashboard_v2/src/hooks/useDashboardFilters.tsx:28`
Please strip trailing whitespace before merge.
Verified locally
After npm ci in dashboard_v2/:
- `npm run build` passes.
- `npm run lint` (`tsc --noEmit`) passes.
- `npm audit --omit=dev` reports `found 0 vulnerabilities`.
GitHub currently only shows `cla/google`; no dashboard build/typecheck CI is attached to the PR.
The TypeScript app compiles, but the backend authorization and UI data-source flow need fixes before this should merge.
caohy1988
left a comment
Follow-up to my earlier reviews. Additional blockers I haven't yet seen called out — all checked against pr142-head (ff4d4fe) with npm ci && npm run build locally.
Blockers
B5 — Missing Apache 2.0 copyright headers on every new TS/TSX file
This repo's convention (verified against every Python and shell file under src/ and examples/) is an Apache 2.0 header at the top of every new source file:
```
# Copyright 2026 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
...
```
Every new file in dashboard_v2/ is missing this:
- `dashboard_v2/api/index.ts`
- `dashboard_v2/src/App.tsx`
- `dashboard_v2/src/main.tsx`
- `dashboard_v2/src/services/apiService.ts`
- `dashboard_v2/src/hooks/useDashboardFilters.tsx`
- `dashboard_v2/src/components/{AuditLog,CommandBar,FinOpsSummary,TraceTree}.tsx`
- `dashboard_v2/src/lib/utils.ts`
- `dashboard_v2/vite.config.ts`
For a Google-owned repo this is non-negotiable. Add the `Copyright 2026 Google LLC` Apache 2.0 header to every new `.ts` / `.tsx` file (use `/* ... */` block form for TypeScript).
B6 — Latent secret-handling foot-gun: vite.config.ts defines process.env.GEMINI_API_KEY for the client bundle
`dashboard_v2/vite.config.ts:9-11`:

```ts
define: {
  'process.env.GEMINI_API_KEY': JSON.stringify(env.GEMINI_API_KEY),
},
```

Vite's `define` is a build-time text-replace: every occurrence of `process.env.GEMINI_API_KEY` in client code is baked into the JS bundle as the literal string value.
Empirically the current bundle doesn't leak because (a) `loadEnv(mode, '.', '')` reads from `.env*` files, not `process.env`, (b) no `.env` is checked in, and (c) no code in `src/` references `process.env.GEMINI_API_KEY` yet. But the wiring is a loaded gun:

- The dashboard already declares `@google/genai` in `dependencies` (production install). It's currently unused, but its presence plus the `define` strongly suggests someone planned client-side Gemini calls.
- Any developer who creates a local `.env` with `GEMINI_API_KEY=...` for testing and runs `npm run build` ships a bundle with the literal key in the JS source, readable by every visitor.
- Some CI setups write `.env` files from secrets at build time, triggering the same leak in production.
Fix: Remove the `define` block entirely. If the dashboard ever needs Gemini access, it must go through the `/api` serverless route using a server-side env var. Also drop `@google/genai` from `dependencies` until a server-side caller actually needs it.
B7 — package.json lists vite in BOTH dependencies and devDependencies
`dashboard_v2/package.json:31` and `dashboard_v2/package.json:40`:

```json
"dependencies": {
  ...
  "vite": "^6.2.3"
},
"devDependencies": {
  ...
  "vite": "^6.2.3"
}
```

`vite` is a build-time tool and belongs in `devDependencies` only. The double listing creates two problems: production installs ship Vite (a ~30 MB bundler) unnecessarily, and npm will silently pick one entry over the other depending on npm version, masking future version drift.
Fix: Remove `vite` from `dependencies`.
B8 — Four dependencies declared but never imported
Grepping `dashboard_v2/src/` and `dashboard_v2/api/` for imports:

| Dep | Where declared | Imports found |
|---|---|---|
| `express` | `dependencies` | none |
| `@types/express` | `devDependencies` | none |
| `dotenv` | `dependencies` | none (Vite uses its own `loadEnv`; Vercel injects env at runtime) |
| `@google/genai` | `dependencies` | none |
The Vercel API handler at `api/index.ts` uses the bare `(req, res) => void` signature, not Express. `dotenv` is redundant on top of `loadEnv`. `@google/genai` is dead pending B6.
Fix: Remove all four. `npm run build` and `npm run lint` should still pass.
B9 — /api returns raw BigQuery error messages to the client
`dashboard_v2/api/index.ts:170-173`:

```ts
res.status(500).json({
  error: error.message || 'Failed to query BigQuery',
  code: error.code,
});
```

BigQuery's `error.message` can contain project IDs, dataset names, column names, and partial query text from server-side error paths. Even with the identifier validation from your earlier patch, an unauthenticated curl against `/api` can probe for information by deliberately triggering schema-mismatch errors and reading the responses.
Fix: Return a generic message (`{ error: "Query failed", request_id: "<uuid>" }`) and `console.error(error)` server-side. Operators read details in Vercel logs; clients see only the request id for support correlation.
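A sketch of that pattern (helper name hypothetical):

```typescript
import { randomUUID } from "node:crypto";

// Log the real error server-side; return only a generic message plus a
// request id the user can quote so an operator can find the log entry.
function buildErrorBody(error: unknown): { error: string; request_id: string } {
  console.error("bigquery query failed:", error); // full detail goes to Vercel logs only
  return { error: "Query failed", request_id: randomUUID() };
}
```

The route would then call `res.status(500).json(buildErrorBody(error))` instead of serializing `error.message`.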
Soft blockers
B10 — No Node engines field in package.json
Vercel's default Node version changes over time; without `"engines": { "node": ">=20" }` (or similar), the next Vercel default bump could silently change behavior. Pin it.
B11 — tsconfig.json has no include / exclude — client and server share one compilation context
Today, anyone in `src/` who writes `import { BigQuery } from '@google-cloud/bigquery'` will type-check cleanly (because `api/` shares the same tsconfig and pulls the type into scope), then break at runtime when Vite bundles it into the browser.
Fix: Either split into `tsconfig.client.json` (with `include: ["src/**/*"]`) and `tsconfig.server.json` (with `include: ["api/**/*"]`), or add `exclude: ["api/**"]` to the existing tsconfig and have the Vercel function build use its own.
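For the single-tsconfig variant, the addition would be roughly (sketch; existing `compilerOptions` stay unchanged):

```json
{
  "include": ["src/**/*"],
  "exclude": ["api/**"]
}
```

With this in place, a stray `@google-cloud/bigquery` import in `src/` fails type-check instead of failing at runtime in the browser.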
B12 — No security headers in vercel.json
A public dashboard fetching BigQuery data should set at minimum:

```json
"headers": [{
  "source": "/(.*)",
  "headers": [
    { "key": "X-Content-Type-Options", "value": "nosniff" },
    { "key": "X-Frame-Options", "value": "DENY" },
    { "key": "Referrer-Policy", "value": "strict-origin-when-cross-origin" }
  ]
}]
```

This is bare-minimum defense-in-depth for a publicly accessible URL.
Verification on pr142-head (ff4d4fe)
- `npm ci` → clean, 0 vulnerabilities.
- `npm run lint` → clean.
- `npm run build` → clean (the existing 840 KB chunk-size warning).
- Empirical leak check: built with `GEMINI_API_KEY=PROBE-LEAK-SENTINEL-12345` in the shell env; the sentinel does not appear in `dist/assets/*.js` (because `loadEnv` reads `.env*` files, not `process.env`). B6 is a latent foot-gun, not an active leak today.
Summary
Your existing reviews covered the SQL identifier validation, ADC fallback, UI ↔ row-shape mismatch, the dev-runbook drift, the public-deployment authorization model, the data-source refetch flow, the wired-but-not-implemented Agent filter, and the trailing-whitespace lint.
The eight findings above (B5–B12) are all unaddressed by those reviews. B5 (copyright headers) and B6 (secret-handling foot-gun) are the hardest blockers; B7–B9 are package hygiene plus info-disclosure; B10–B12 are soft. None of them are caught by `npm run lint` / `npm run build`, so they have to be raised in review.
Live Dashboard
Known Issue
The live deployment is currently facing a data connection issue with BigQuery, so analytics data may not load correctly on the hosted version. Local setup should work correctly after configuring valid Google Cloud credentials and environment variables.
Tech Stack
React
TypeScript
Vite
Node.js
Google BigQuery
Vercel