Add models configuration object to init() by Qard · Pull Request #164 · braintrustdata/autoevals

Qard · 2026-01-14T01:26:15Z

Introduces a new models parameter to init() that allows configuring default models for different evaluation types:

init({
  models: {
    completion: 'claude-3-5-sonnet-20241022',
    embedding: 'text-embedding-3-large',
  }
})

Changes:

- Added models parameter to init() in both JS and Python
- Models object supports:
  - completion: Default model for LLM-as-a-judge evaluations
  - embedding: Default model for embedding-based evaluations
- models.completion takes precedence over deprecated defaultModel
- All embedding scorers now use configured default embedding model
- Added getDefaultEmbeddingModel() function
- Maintains backward compatibility with existing defaultModel parameter
- Added comprehensive tests for both languages

Default values:

Completion: "gpt-4o" (unchanged)
Embedding: "text-embedding-ada-002"
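
The precedence rules listed above can be sketched as follows. This is a minimal illustration rather than the code from this PR: `resolve_models`, `ResolvedModels`, and the `default_model` keyword are hypothetical names chosen for the example. Only the precedence order and the two default strings are taken from the description.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Built-in defaults as stated in the PR description.
DEFAULT_COMPLETION_MODEL = "gpt-4o"
DEFAULT_EMBEDDING_MODEL = "text-embedding-ada-002"


@dataclass(frozen=True)
class ResolvedModels:
    completion: str
    embedding: str


def resolve_models(
    models: Optional[Mapping[str, str]] = None,
    default_model: Optional[str] = None,
) -> ResolvedModels:
    """Hypothetical helper: compute the effective defaults for init()."""
    models = models or {}
    # models["completion"] wins over the deprecated default_model,
    # which in turn wins over the built-in default.
    completion = (
        models.get("completion") or default_model or DEFAULT_COMPLETION_MODEL
    )
    # Embedding has no deprecated alias, so there are only two levels.
    embedding = models.get("embedding") or DEFAULT_EMBEDDING_MODEL
    return ResolvedModels(completion=completion, embedding=embedding)
```

Under these assumptions, a caller that still passes only the deprecated default keeps its old behavior, while a caller that passes both gets the value from `models`.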

github-actions · 2026-01-14T01:48:44Z

Braintrust eval report

Autoevals (models-config-1768445476)

Score	Average	Improvements	Regressions
NumericDiff	72.5% (+0pp)	4 🟢	3 🔴
Time_to_first_token	1.35tok (-0.02tok)	69 🟢	49 🔴
Llm_calls	1.55 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	279.25tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Completion_tokens	19.3tok (+0tok)	-	-
Completion_reasoning_tokens	0tok (+0tok)	-	-
Total_tokens	298.54tok (+0tok)	-	-
Estimated_cost	0$ (+0$)	-	-
Duration	2.89s (+0.15s)	103 🟢	116 🔴
Llm_duration	2.66s (-0.16s)	82 🟢	37 🔴

Qard · 2026-01-15T00:45:30Z

py/autoevals/ragas.py

+def _get_ragas_embedding_model(user_model):
+    """Get embedding model with RAGAS-specific default fallback.
+
+    Priority:
+    1. Explicitly provided user_model parameter
+    2. User-configured global embedding default (via init())
+    3. RAGAS-specific default (text-embedding-3-small)
+    """
+    if user_model is not None:
+        return user_model
+
+    # Check if user has explicitly configured a global embedding default
+    configured_default = _default_embedding_model_var.get(None)
+    if configured_default is not None:
+        return configured_default
+
+    # Fall back to RAGAS-specific default
+    return DEFAULT_RAGAS_EMBEDDING_MODEL


This exists because (for some reason) Python and TypeScript are inconsistent about which embedding model to use here. Python has its own fallback to text-embedding-3-small, while TypeScript delegates to the EmbeddingSimilarity default, which uses text-embedding-ada-002. Should we just switch everywhere to text-embedding-3-small, though?

Introduces a new `models` parameter to init() that allows configuring default models for different evaluation types:

```typescript
init({
  models: {
    completion: 'claude-3-5-sonnet-20241022',
    embedding: 'text-embedding-3-large',
  }
})
```

Changes:
- Added `models` parameter to init() in both JS and Python
- Models object supports:
  - `completion`: Default model for LLM-as-a-judge evaluations
  - `embedding`: Default model for embedding-based evaluations
- `models.completion` takes precedence over deprecated `defaultModel`
- All embedding scorers now use configured default embedding model
- Added getDefaultEmbeddingModel() function
- Maintains backward compatibility with existing `defaultModel` parameter
- Added comprehensive tests for both languages

Default values:
- Completion: "gpt-4o" (unchanged)
- Embedding: "text-embedding-ada-002"

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

clutchski · 2026-01-27T15:24:08Z

py/autoevals/ragas.py

    return DEFAULT_RAGAS_MODEL


-DEFAULT_RAGAS_EMBEDDING_MODEL = "text-embedding-3-small"


Is this important for backwards compat?

Qard · 2026-01-28

Not sure. I have another comment about that in another part of the PR: the TypeScript code does not do this and, as far as I can tell, it's an accident that the Python code differs. I'm not 100% certain about that, though, or whether the model change would constitute a breaking change.

Qard requested a review from ibolmo January 14, 2026 01:26

Qard self-assigned this Jan 14, 2026

Qard added the enhancement label (New feature or request) Jan 14, 2026

Qard force-pushed the models-config branch from 19f4208 to 96f6479 January 14, 2026 02:01

ibolmo approved these changes Jan 14, 2026


clutchski reviewed Jan 14, 2026


Qard force-pushed the models-config branch 2 times, most recently from 569d23a to c3d81a0 January 15, 2026 00:23

Qard commented Jan 15, 2026


Qard force-pushed the models-config branch 2 times, most recently from b064549 to 2a3d1c5 January 15, 2026 02:48

Qard force-pushed the models-config branch from 2a3d1c5 to 61177b7 January 15, 2026 02:50

Qard requested review from clutchski and ibolmo January 22, 2026 21:13

clutchski reviewed Jan 27, 2026


clutchski approved these changes Jan 27, 2026


Qard wants to merge 1 commit into main from models-config

3 participants
