Co-authored-by: JerryNixon <1749983+JerryNixon@users.noreply.github.com>
…configuration
… and telemetry integration
…ate empty embeddings
…oint/health sub-objects
…orization
…lthCheckConfig in converter
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…dding controller tests, switching the dab schema for the embedding system to default to false
embeddings endpoint is now permanently fixed to /embed with no user-configurable path option. This removes unnecessary configuration surface since the feature has not been released yet, eliminating the need for backward compatibility.
Changes:
- Remove path property from dab.draft.schema.json
- Remove Path, UserProvidedPath, and EffectivePath from EmbeddingsEndpointOptions
- Remove EffectiveEndpointPath from EmbeddingsOptions
- Remove path deserialization from EmbeddingsOptionsConverterFactory
- Remove --runtime.embeddings.endpoint.path CLI option
- Remove path configuration logic from ConfigGenerator
- Remove endpoint path validation from RuntimeConfigValidator
- Update Startup.cs logging to use DEFAULT_PATH constant
- Update all tests to remove path references
This reverts commit a3f0b1d.
// For batch, check cache for each text individually
string[] cacheKeys = texts.Select(CreateCacheKey).ToArray();
float[]?[] results = new float[texts.Length][];
List<int> uncachedIndices = new();
Would it be better to use a Dictionary here, in case there are duplicate keys, to avoid duplicate calls to the API?
Can you elaborate? We do a GET on each chunk
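To make the Dictionary suggestion concrete, here is a minimal sketch of de-duplicating a batch before calling the embedding API. It is not the code in this PR: the `BatchDedupSketch` class, the `EmbedDistinctAsync` name, and the `embedBatchAsync` delegate are all hypothetical stand-ins, and the cache lookup is left out so that only the de-duplication step is shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BatchDedupSketch
{
    // Hypothetical helper (not part of the PR): embeds each distinct text once
    // and fans the result back out to every position where that text appeared.
    public static async Task<float[][]> EmbedDistinctAsync(
        string[] texts,
        Func<IReadOnlyList<string>, Task<float[][]>> embedBatchAsync)
    {
        // Map each distinct text to its slot in the de-duplicated request.
        Dictionary<string, int> slotByText = new(StringComparer.Ordinal);
        List<string> distinctTexts = new();
        foreach (string text in texts)
        {
            if (!slotByText.ContainsKey(text))
            {
                slotByText[text] = distinctTexts.Count;
                distinctTexts.Add(text);
            }
        }

        // One upstream call covering only the distinct inputs.
        float[][] distinctEmbeddings = distinctTexts.Count == 0
            ? Array.Empty<float[]>()
            : await embedBatchAsync(distinctTexts);

        // Rebuild the results in the caller's original order.
        return texts.Select(t => distinctEmbeddings[slotByText[t]]).ToArray();
    }
}
```

In this sketch, duplicate inputs share the same array instance in the result, so a caller that mutates embeddings would need to copy them first.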
{
    Stopwatch stopwatch = new();
    stopwatch.Start();
    EmbeddingResult result = await _embeddingService.TryEmbedAsync(testText);
I wonder if it would be useful to separate errors from the cache from errors calling the underlying embedding API. For example, you might have cache hits for the testText but be unable to get embeddings for new strings from the service, or you might be able to get new embeddings but be unable to cache them, causing perf issues.
Can you revisit this again with the latest iteration?
souvikghosh04
left a comment
Posting comments so far. My review is still in progress. Also, waiting for existing review comments to be addressed.
if (!result.Success)
{
    errorMessage = result.ErrorMessage ?? "Embedding request failed.";
Maybe add appropriate exponential retries, up to 3?
Good point. The reason to avoid retries is to avoid incurring additional latency, especially since we expect embedding creation to be a precursor to the SQL query.
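For reference, the trade-off discussed in this thread can be sketched as follows. This is only an illustration of the suggested exponential retry, not something the PR implements: `RetrySketch`, `RetryAsync`, and the 200 ms base delay are assumptions chosen for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RetrySketch
{
    // Hypothetical wrapper (not part of the PR): re-runs an operation with
    // exponential backoff. maxAttempts includes the first try, and the waits
    // between attempts are baseDelay, 2 x baseDelay, 4 x baseDelay, ...
    public static async Task<T> RetryAsync<T>(
        Func<Task<T>> operation,
        Func<T, bool> isSuccess,
        int maxAttempts = 3,
        TimeSpan? baseDelay = null,
        CancellationToken cancellationToken = default)
    {
        if (maxAttempts < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        }

        // Assumed default; the thread does not propose a concrete delay.
        TimeSpan delay = baseDelay ?? TimeSpan.FromMilliseconds(200);

        T result = await operation();
        for (int attempt = 1; attempt < maxAttempts && !isSuccess(result); attempt++)
        {
            await Task.Delay(delay, cancellationToken);
            delay = TimeSpan.FromTicks(delay.Ticks * 2); // double the wait each time
            result = await operation();
        }

        // Returns the last result, whether or not it succeeded.
        return result;
    }
}
```

With these assumed defaults, a request that fails all three attempts adds 200 ms + 400 ms = 600 ms of pure waiting on top of three request round trips, which is the latency cost the reply above is concerned about.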
int responseTimeMs = (int)stopwatch.ElapsedMilliseconds;
bool isResponseTimeWithinThreshold = responseTimeMs <= thresholdMs;
bool isDimensionsValid = true;
Should this be false by default?
The validation logic verifies the dimensions of the embeddings returned by the API, so the initial value doesn't really matter.
souvikghosh04
left a comment
Posting additional comments. There are several inconsistencies between the schema JSON and the internal C# code files, including tests. For example, threshold ms, API version, and roles, to name a few, differ between the schema JSON and the internal C# files. I will wait for these to be addressed, along with the pending comments.
""runtime"": {
  ""embeddings"": {
    ""provider"": ""azure-openai"",
    ""base-url"": ""https://my-openai.openai.azure.com"",
URLs and other configuration values, as well as repeated code, should be put in a common place. The assertions in the tests below also have similar hardcoded values. Keep a single source of truth and reference it.
I have revisited the config defaults and made them consistent across the repo.
Summary
This PR adds configurable text chunking capabilities to the embeddings API, enabling automatic text segmentation before embedding generation. This feature supports both single-text and multi-document batch processing with runtime configuration and query parameter overrides.
Changes
Configuration
Added EmbeddingsChunkingOptions.cs - Configuration model for chunking behavior
Enabled (bool) - Enable/disable chunking
SizeChars (int) - Chunk size in characters (default: 1000)
OverlapChars (int) - Overlap between chunks (default: 250)
EffectiveSizeChars property ensures minimum valid chunk size
Modified EmbeddingsOptions.cs - Added Chunking property and IsChunkingEnabled helper
Removed EmbeddingsCacheOptions.cs - Simplified configuration by removing unused cache feature
API Enhancements
Modified Controllers/EmbeddingController.cs
Auto-detects request type (single text vs. document array)
Implements overlapping text chunking algorithm
Supports query parameter overrides: $chunking.enabled, $chunking.size-chars, $chunking.overlap-chars
Returns multiple embeddings per document when chunking is enabled
Added Models/EmbedDocumentRequest.cs - Request model for document arrays
Added Models/EmbedDocumentResponse.cs - Response model with chunked embeddings
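As an illustration of the overlapping chunking described above, here is a minimal sketch of a sliding-window splitter that uses the documented defaults (1000 characters per chunk, 250 characters of overlap). It is not the implementation in EmbeddingController.cs: the `ChunkTextSketch` class is hypothetical, and clamping the overlap to one less than the chunk size is an assumption made so that the window always advances.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkTextSketch
{
    // Hypothetical splitter (not the PR's code): fixed-size character windows
    // where each chunk repeats the last `overlapChars` of the previous one.
    public static List<string> ChunkText(string text, int sizeChars = 1000, int overlapChars = 250)
    {
        List<string> chunks = new();
        if (string.IsNullOrEmpty(text))
        {
            return chunks;
        }

        // Assumption: treat non-positive sizes as 1, and clamp the overlap to
        // size - 1 so the window moves forward by at least one character.
        int size = Math.Max(1, sizeChars);
        int overlap = Math.Clamp(overlapChars, 0, size - 1);
        int step = size - overlap;

        for (int start = 0; start < text.Length; start += step)
        {
            int length = Math.Min(size, text.Length - start);
            chunks.Add(text.Substring(start, length));

            // Stop once a chunk reaches the end of the text, so the tail is
            // not emitted again as a shorter, fully overlapped chunk.
            if (start + length >= text.Length)
            {
                break;
            }
        }

        return chunks;
    }
}
```

For example, `ChunkTextSketch.ChunkText("abcdefghij", 4, 1)` yields "abcd", "defg", "ghij". Because the sketch slices by UTF-16 `char`, a chunk boundary can fall inside a surrogate pair; a production splitter would need to account for that when handling Unicode text.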
Schema (schemas)
Modified dab.draft.schema.json - Added chunking configuration schema with validation rules
Testing (UnitTests)
Added EmbeddingsChunkingOptionsTests.cs (13 tests) - Configuration validation
Added ChunkTextTests.cs (21 tests) - Chunking algorithm validation including edge cases
Modified EmbeddingControllerTests.cs (+18 tests) - API endpoint tests for chunking and document arrays
Total Test Coverage: 72 tests (48 existing + 24 new) - All passing
Testing
All 72 unit tests passing
Edge cases covered: empty text, very small chunks, overlap larger than chunk size, Unicode text
Query parameter parsing validated
Backward compatibility verified
Breaking Changes
None - This is a backward-compatible addition. Existing single-text requests continue to work without modification.