Braintrust eval report: Autoevals (models-config-1768445476)
py/autoevals/ragas.py
def _get_ragas_embedding_model(user_model):
    """Get embedding model with RAGAS-specific default fallback.

    Priority:
    1. Explicitly provided user_model parameter
    2. User-configured global embedding default (via init())
    3. RAGAS-specific default (text-embedding-3-small)
    """
    if user_model is not None:
        return user_model

    # Check if user has explicitly configured a global embedding default
    configured_default = _default_embedding_model_var.get(None)
    if configured_default is not None:
        return configured_default

    # Fall back to RAGAS-specific default
    return DEFAULT_RAGAS_EMBEDDING_MODEL
This exists because (for some reason) Python and TypeScript are inconsistent about which embedding model to use here. Python has its own fallback to `text-embedding-3-small`, while TypeScript delegates to the `EmbeddingSimilarity` default, which uses `text-embedding-ada-002`. Should we just switch to `text-embedding-3-small` everywhere, though?
Introduces a new `models` parameter to init() that allows configuring
default models for different evaluation types:
```typescript
init({
  models: {
    completion: 'claude-3-5-sonnet-20241022',
    embedding: 'text-embedding-3-large',
  }
})
```
Changes:
- Added `models` parameter to init() in both JS and Python
- Models object supports:
  - `completion`: Default model for LLM-as-a-judge evaluations
  - `embedding`: Default model for embedding-based evaluations
- `models.completion` takes precedence over deprecated `defaultModel`
- All embedding scorers now use configured default embedding model
- Added getDefaultEmbeddingModel() function
- Maintains backward compatibility with existing `defaultModel` parameter
- Added comprehensive tests for both languages
Default values:
- Completion: "gpt-4o" (unchanged)
- Embedding: "text-embedding-ada-002"
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
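
The precedence rules listed in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the actual `init()` implementation: `resolve_models`, `ResolvedModels`, and the `default_model` argument name are hypothetical, and only the default values and the "`models.completion` wins over `defaultModel`" rule come from the message itself.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Default values as stated in the commit message.
DEFAULT_COMPLETION_MODEL = "gpt-4o"
DEFAULT_EMBEDDING_MODEL = "text-embedding-ada-002"


@dataclass(frozen=True)
class ResolvedModels:
    completion: str
    embedding: str


def resolve_models(
    models: Optional[Mapping[str, str]] = None,
    default_model: Optional[str] = None,  # hypothetical name for the deprecated parameter
) -> ResolvedModels:
    """Resolve default models; models["completion"] wins over default_model."""
    models = models or {}
    completion = models.get("completion") or default_model or DEFAULT_COMPLETION_MODEL
    embedding = models.get("embedding") or DEFAULT_EMBEDDING_MODEL
    return ResolvedModels(completion=completion, embedding=embedding)
```

For example, `resolve_models({"completion": "claude-3-5-sonnet-20241022"}, default_model="gpt-4o-mini")` would pick the `models` entry for completion and fall back to the stated embedding default.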
    return DEFAULT_RAGAS_MODEL


DEFAULT_RAGAS_EMBEDDING_MODEL = "text-embedding-3-small"
Is this important for backwards compat?
Not sure. I have another comment about this elsewhere in the PR. The TypeScript code does not do this and, as far as I can tell, it's an accident that the Python code differs, but I'm not 100% certain about that, or about whether the model change would constitute a breaking change.