148 changes: 117 additions & 31 deletions src/routes/model/$modelId.tsx
@@ -7,6 +7,7 @@ import {
Gauge,
Info,
Layers,
Wrench,
} from "lucide-react";
import { useCallback, useEffect, useMemo } from "react";
import CategoryTabs from "#/components/CategoryTabs";
@@ -130,69 +131,154 @@ interface StatCard {
}

function ModelStats({ model }: { model: ModelResult }) {
const items: StatCard[] = [];

const totalItems: StatCard[] = [];
const avgItems: StatCard[] = [];

// Compute average tool calls
const totalToolCalls =
model.questionDetails?.reduce(
(sum, q) => sum + (q.toolCallCount ?? 0),
0,
) ?? 0;
const avgToolCalls =
model.totalQuestions > 0 ? totalToolCalls / model.totalQuestions : 0;

Comment on lines +137 to +145

⚠️ Potential issue | 🟡 Minor

Don't hide valid zero tool-call metrics.

`QuestionDetail.toolCallCount` is optional, and `QuestionCard` already treats `!= null` as the “data exists” check. Here, `?? 0` combined with the `> 0` guards makes “no tool-call data” and “measured zero usage” look identical, and `toFixed(1)` can still render a small non-zero average as `0.0`. Track presence separately and render zero or small values whenever the metric is known.
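
To make the two failure modes concrete, here is a minimal TypeScript sketch. The `Detail` type, the helper names, and the sample arrays are hypothetical stand-ins for illustration, not code from this PR; only the reduction mirrors the one under review.

```typescript
// Hypothetical stand-in for QuestionDetail; only the field under discussion.
interface Detail {
  toolCallCount?: number;
}

// Same shape of reduction as in the PR: missing counts are coalesced to 0.
function sumToolCalls(details: Detail[]): number {
  return details.reduce((sum, q) => sum + (q.toolCallCount ?? 0), 0);
}

const notMeasured: Detail[] = [{}, {}]; // no tool-call data at all
const measuredZero: Detail[] = [{ toolCallCount: 0 }, { toolCallCount: 0 }];

// Both sums are 0, so a `> 0` guard hides the card in both cases.
console.log(sumToolCalls(notMeasured), sumToolCalls(measuredZero));

// A presence check tells the two cases apart.
const hasData = (details: Detail[]): boolean =>
  details.some((q) => q.toolCallCount != null);
console.log(hasData(notMeasured), hasData(measuredZero)); // false true

// One tool call across 25 questions averages 0.04,
// which toFixed(1) renders as "0.0".
console.log((1 / 25).toFixed(1));
```
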

💡 One way to preserve zero values without treating missing data as zero
+ const hasToolCallData =
+   model.questionDetails?.some((q) => q.toolCallCount != null) ?? false;
  const totalToolCalls =
    model.questionDetails?.reduce(
      (sum, q) => sum + (q.toolCallCount ?? 0),
      0,
    ) ?? 0;
  const avgToolCalls =
-   model.totalQuestions > 0 ? totalToolCalls / model.totalQuestions : 0;
+   hasToolCallData && model.totalQuestions > 0
+     ? totalToolCalls / model.totalQuestions
+     : 0;
...
- if (totalToolCalls > 0) {
+ if (hasToolCallData) {
    totalItems.push({
      icon: Wrench,
      label: "Total tool calls",
      value: totalToolCalls.toLocaleString(),
    });
  }
...
- if (avgToolCalls > 0) {
+ if (hasToolCallData && model.totalQuestions > 0) {
    avgItems.push({
      icon: Wrench,
      label: "Avg tool calls/question",
-     value: avgToolCalls.toFixed(1),
+     value:
+       avgToolCalls === 0
+         ? "0.0"
+         : avgToolCalls < 0.01
+           ? "<0.01"
+           : avgToolCalls < 0.1
+             ? avgToolCalls.toFixed(2)
+             : avgToolCalls.toFixed(1),
    });
  }

Also applies to: 174-179, 223-227

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/routes/model/$modelId.tsx` around lines 137 - 145, Compute averages using
only entries where QuestionDetail.toolCallCount is present rather than treating
missing as zero: replace the current sum+null-coalescing logic by (a)
totalToolCalls = sum of q.toolCallCount for q where q.toolCallCount != null, (b)
toolCallCountKnown = count of q where q.toolCallCount != null, and (c)
avgToolCalls = toolCallCountKnown > 0 ? totalToolCalls / toolCallCountKnown :
undefined (or null) so you can render known zeros distinctly; update the
rendering to check toolCallCountKnown (or avg != null) before calling toFixed to
avoid hiding valid zero values. Do the same fix for the other average
computations referenced around the blocks using model.questionDetails at
avgToolCalls and the similar calculations at the later occurrences (the blocks
you noted at 174-179 and 223-227).
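
The approach described in the prompt above could look roughly like the following. This is a sketch under assumptions: `summarizeToolCalls` and `formatAvg` are hypothetical helper names, `Detail` stands in for `QuestionDetail`, and the small-value thresholds are borrowed from the suggested diff earlier in this comment.

```typescript
// Hypothetical stand-in for QuestionDetail.
interface Detail {
  toolCallCount?: number;
}

interface ToolCallSummary {
  total: number; // sum over entries that report a count
  known: number; // number of entries that report a count
  avg: number | undefined; // undefined when nothing was measured
}

// Sum and count only the entries where toolCallCount is present,
// so a measured zero stays distinct from missing data.
function summarizeToolCalls(details: Detail[] | undefined): ToolCallSummary {
  let total = 0;
  let known = 0;
  for (const q of details ?? []) {
    if (q.toolCallCount != null) {
      total += q.toolCallCount;
      known += 1;
    }
  }
  return { total, known, avg: known > 0 ? total / known : undefined };
}

// Format a known average without collapsing small values to "0.0".
function formatAvg(avg: number): string {
  if (avg === 0) return "0.0";
  if (avg < 0.01) return "<0.01";
  if (avg < 0.1) return avg.toFixed(2);
  return avg.toFixed(1);
}

const summary = summarizeToolCalls([
  { toolCallCount: 3 },
  {}, // not measured: skipped, not counted as zero
  { toolCallCount: 0 },
]);
if (summary.avg !== undefined) {
  console.log(formatAvg(summary.avg)); // "1.5"
}
```

Note that dividing by `known` instead of `model.totalQuestions` changes what the average means: it becomes the mean over questions that reported a count, which may or may not be the intended metric for the card.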

// Total stats
if (model.totalTokens > 0) {
const parts: string[] = [];
if (model.totalPromptTokens > 0)
parts.push(`Input: ${model.totalPromptTokens.toLocaleString()}`);
if (model.totalCompletionTokens > 0)
parts.push(`Output: ${model.totalCompletionTokens.toLocaleString()}`);
items.push({
totalItems.push({
icon: Layers,
label: "Total tokens",
value: model.totalTokens.toLocaleString(),
tooltip: parts.length > 0 ? parts.join("\n") : undefined,
});
}
if (model.totalDurationMs > 0) {
items.push({
totalItems.push({
icon: Clock,
label: "Total duration",
value: formatDuration(model.totalDurationMs),
});
}
if (model.totalCost > 0) {
totalItems.push({
icon: Coins,
label: "Total cost",
value: `$${model.totalCost.toFixed(4)}`,
});
}
if (totalToolCalls > 0) {
totalItems.push({
icon: Wrench,
label: "Total tool calls",
value: totalToolCalls.toLocaleString(),
});
}

// Average stats per question
if (model.totalTokens > 0 && model.totalQuestions > 0) {
const avgTokens = Math.round(model.totalTokens / model.totalQuestions);
const avgPrompt = Math.round(
model.totalPromptTokens / model.totalQuestions,
);
const avgCompletion = Math.round(
model.totalCompletionTokens / model.totalQuestions,
);
avgItems.push({
icon: Layers,
label: "Avg tokens/question",
value: avgTokens.toLocaleString(),
tooltip: `Input: ${avgPrompt.toLocaleString()}\nOutput: ${avgCompletion.toLocaleString()}`,
});
}
if (model.totalDurationMs > 0 && model.totalQuestions > 0) {
const avgDurationMs = Math.round(
model.totalDurationMs / model.totalQuestions,
);
avgItems.push({
icon: Clock,
label: "Avg duration/question",
value: formatDuration(avgDurationMs),
});
}
if (model.averageTokensPerSecond > 0) {
items.push({
avgItems.push({
icon: Gauge,
label: "Avg speed",
value: `${model.averageTokensPerSecond.toFixed(1)} tok/s`,
});
}
if (model.totalCost > 0) {
items.push({
if (model.totalCost > 0 && model.totalQuestions > 0) {
const avgCost = model.totalCost / model.totalQuestions;
avgItems.push({
icon: Coins,
label: "Total cost",
value: `$ ${model.totalCost.toFixed(4)}`,
label: "Avg cost/question",
value: avgCost < 0.0001 ? "<$0.0001" : `$${avgCost.toFixed(4)}`,
});
}
if (avgToolCalls > 0) {
avgItems.push({
icon: Wrench,
label: "Avg tool calls/question",
value: avgToolCalls.toFixed(1),
});
}

if (items.length === 0) return null;
if (totalItems.length === 0 && avgItems.length === 0) return null;

const renderCard = (item: StatCard) => (
<div key={item.label} className="arena-card flex flex-col gap-1.5 p-3">
<div className="flex items-center gap-1.5 text-[var(--text-secondary)]">
<item.icon size={12} />
<span className="text-xs">{item.label}</span>
</div>
<span className="inline-flex items-center gap-1.5 text-sm font-semibold text-[var(--text-primary)]">
{item.value}
{item.tooltip && (
<button
type="button"
className="group/tip relative inline-flex cursor-help border-none bg-transparent p-0"
aria-label={`Show details: ${item.tooltip.replace("\n", ", ")}`}
>
<Info
size={12}
className="opacity-40 transition-opacity group-hover/tip:opacity-70 group-focus/tip:opacity-70"
/>
<span className="pointer-events-none absolute bottom-full left-1/2 z-50 mb-1.5 -translate-x-1/2 whitespace-pre rounded-md bg-[#1e1e22] px-2.5 py-1.5 text-xs font-normal text-[#EDEDF0] opacity-0 shadow-lg ring-1 ring-white/10 transition-opacity group-hover/tip:opacity-100 group-focus/tip:opacity-100">
{item.tooltip}
</span>
</button>
)}
</span>
</div>
);

return (
<div className="mb-8 grid grid-cols-2 gap-2 sm:grid-cols-4">
{items.map((item) => (
<div key={item.label} className="arena-card flex flex-col gap-1.5 p-3">
<div className="flex items-center gap-1.5 text-[var(--text-secondary)]">
<item.icon size={12} />
<span className="text-xs">{item.label}</span>
<div className="mb-8 space-y-4">
{totalItems.length > 0 && (
<div>
<h3 className="mb-2 text-xs font-medium text-[var(--text-secondary)]">
Totals
</h3>
<div className="grid grid-cols-2 gap-2 sm:grid-cols-4">
{totalItems.map(renderCard)}
</div>
</div>
)}
{avgItems.length > 0 && (
<div>
<h3 className="mb-2 text-xs font-medium text-[var(--text-secondary)]">
Averages
</h3>
<div className="grid grid-cols-2 gap-2 sm:grid-cols-4">
{avgItems.map(renderCard)}
</div>
<span className="inline-flex items-center gap-1.5 text-sm font-semibold text-[var(--text-primary)]">
{item.value}
{item.tooltip && (
<span className="group/tip relative">
<Info
size={12}
className="opacity-40 group-hover/tip:opacity-70 transition-opacity cursor-help"
/>
<span className="pointer-events-none absolute bottom-full left-1/2 z-50 mb-1.5 -translate-x-1/2 whitespace-pre rounded-md bg-[#1e1e22] px-2.5 py-1.5 text-xs font-normal text-[#EDEDF0] opacity-0 shadow-lg ring-1 ring-white/10 transition-opacity group-hover/tip:opacity-100">
{item.tooltip}
</span>
</span>
)}
</span>
</div>
))}
)}
</div>
);
}