[ai-assisted] feat(rag): strengthen embedding profile selection contract #337
Open
Issue:
- #336

Why:
- RAG indexing and search need to select an embedding provider/model/profile and preserve input type metadata for structured chunks.

What:
- Added embedding profile/provider/model/input type fields to the core RAG/embedding request contract in a backward-compatible way.
- Added a RAG embedding profile resolver and pipeline wiring to starter-ai, and strengthened VectorRecord metadata and pgvector metadata filter pushdown.
- Extended the starter-ai-web and content-embedding-pipeline DTO/controller mappings, and unified embedding selection across the attachment structured and fallback indexing paths.
- Added table/image-caption/OCR chunk input type metadata, along with related tests and documentation.

Validation:
- ./gradlew :studio-platform-ai:test :starter:studio-platform-starter-ai:test :starter:studio-platform-starter-ai-web:test :starter:studio-platform-starter-chunking:test :studio-application-modules:content-embedding-pipeline:test : PASS
- git diff --check : PASS
- Subagent review: PASS after fixes
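The selection behavior described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class `EmbeddingProfileSketch`, the records `Profile` and `Selection`, and the methods `resolve` and `toMetadata` are all hypothetical names, and the precedence shown (explicit request fields override the named profile, which falls back to a default profile) is an assumption about how such a resolver might behave. Only the metadata key names are taken from the PR description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EmbeddingProfileSketch {

    /** Hypothetical profile entry: id, provider, model, and vector dimension. */
    public record Profile(String id, String provider, String model, int dimension) {}

    /** Hypothetical request-level selection; any field may be null (legacy callers send none). */
    public record Selection(String profileId, String provider, String model, String inputType) {}

    /** Picks the named profile (or the default), then lets explicit request fields override it. */
    public static Profile resolve(Selection request, Map<String, Profile> profiles, String defaultProfileId) {
        String id = request.profileId() != null ? request.profileId() : defaultProfileId;
        // Guard against null keys: immutable maps such as Map.of(...) reject get(null).
        Profile base = (id == null) ? null : profiles.get(id);
        if (base == null) {
            throw new IllegalArgumentException("Unknown embedding profile: " + id);
        }
        String provider = request.provider() != null ? request.provider() : base.provider();
        String model = request.model() != null ? request.model() : base.model();
        // Simplification: the dimension is always taken from the profile.
        return new Profile(base.id(), provider, model, base.dimension());
    }

    /** Builds the metadata entries that would be stored alongside each vector record. */
    public static Map<String, Object> toMetadata(Profile resolved, String inputType) {
        Map<String, Object> metadata = new LinkedHashMap<>();
        metadata.put("embeddingProvider", resolved.provider());
        metadata.put("embeddingModel", resolved.model());
        metadata.put("embeddingDimension", resolved.dimension());
        metadata.put("embeddingProfileId", resolved.id());
        if (inputType != null) {
            metadata.put("embeddingInputType", inputType);
        }
        return metadata;
    }

    public static void main(String[] args) {
        // Placeholder provider and model names, not real configuration values.
        Map<String, Profile> profiles = Map.of(
                "default", new Profile("default", "provider-a", "model-a", 768),
                "tables", new Profile("tables", "provider-b", "model-b", 1024));
        Selection request = new Selection("tables", null, null, "TABLE_TEXT");
        Profile resolved = resolve(request, profiles, "default");
        System.out.println(toMetadata(resolved, request.inputType()));
    }
}
```

Keeping every selection field nullable is what makes this kind of contract backward compatible: a caller that sends none of the new fields still resolves to the default profile.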
Issue:
- #336

Why:
- The subagent review found that the following needed to be strengthened: validation of Spring AI embedding model selection, validation of raw vector search selection, search compatibility for the legacy default profile, and tests for image/OCR input types.

What:
- SpringAiEmbeddingAdapter now rejects a mismatch between the request/profile model and the actually configured model.
- The /vectors/search precomputed embedding path now also validates explicit embedding selection through the resolver and applies the resolved metadata filter.
- Minimal legacy RAG search does not force the default profile metadata filter, which preserves compatibility when querying existing chunks that have no metadata.
- Added tests that verify IMAGE_CAPTION/OCR_TEXT input types are recorded, and updated the related documentation.

Validation:
- ./gradlew :starter:studio-platform-starter-ai:test --tests 'studio.one.platform.ai.autoconfigure.adapter.SpringAiEmbeddingAdapterTest' --tests 'studio.one.platform.ai.service.pipeline.RagPipelineServiceTest' :starter:studio-platform-starter-ai-web:test --tests 'studio.one.platform.ai.web.controller.VectorControllerTest' : PASS
- ./gradlew :studio-platform-ai:test :starter:studio-platform-starter-ai:test :starter:studio-platform-starter-ai-web:test :starter:studio-platform-starter-chunking:test :studio-application-modules:content-embedding-pipeline:test : PASS
- git diff --check : PASS
- Subagent review: fixes completed
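Two of the rules in this commit, rejecting a model mismatch and not forcing a profile filter on legacy searches, can be illustrated with a small sketch. It is written under assumptions: `EmbeddingSelectionGuardSketch`, `requireConfiguredModel`, and `searchFilter` are hypothetical names, and the real SpringAiEmbeddingAdapter and resolver may implement these checks differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class EmbeddingSelectionGuardSketch {

    /** Rejects a requested model that differs from the configured one; a null or blank request is accepted. */
    public static String requireConfiguredModel(String requestedModel, String configuredModel) {
        if (requestedModel != null && !requestedModel.isBlank()
                && !requestedModel.equals(configuredModel)) {
            throw new IllegalArgumentException(
                    "Requested embedding model '" + requestedModel
                            + "' does not match configured model '" + configuredModel + "'");
        }
        return configuredModel;
    }

    /**
     * Builds a metadata filter only when the caller selected something explicitly.
     * A legacy request with no selection gets an empty filter, so chunks indexed
     * before the embedding metadata existed remain searchable.
     */
    public static Map<String, Object> searchFilter(String profileId, String provider, String model) {
        Map<String, Object> filter = new LinkedHashMap<>();
        if (profileId != null) {
            filter.put("embeddingProfileId", profileId);
        }
        if (provider != null) {
            filter.put("embeddingProvider", provider);
        }
        if (model != null) {
            filter.put("embeddingModel", model);
        }
        return filter;
    }

    public static void main(String[] args) {
        // Placeholder model and profile names, not real configuration values.
        System.out.println(requireConfiguredModel(null, "model-a"));
        System.out.println(searchFilter(null, null, null));
        System.out.println(searchFilter("tables", null, "model-b"));
    }
}
```

Returning an empty filter for legacy requests is the trade-off described above: it keeps old chunks visible at the cost of not isolating results by profile unless the caller asks for it.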
Why
What
- Added embedding selection fields to EmbeddingRequest, RagIndexRequest, RagSearchRequest, and RagIndexJobSourceRequest in a backward-compatible way.
- Added RagEmbeddingProfileResolver and the starter-ai profile settings, and wired provider/model/profile selection into the RAG indexing and search pipeline.
- VectorRecord metadata now records embeddingProvider, embeddingModel, embeddingDimension, embeddingProfileId, and embeddingInputType, and pgvector search metadata filter pushdown was strengthened.
- Table, image-caption, and OCR chunk input types are recorded as TABLE_TEXT, IMAGE_CAPTION, and OCR_TEXT, and the precedence of request-level embedding fields was unified.
Related Issues
Validation
- ./gradlew :studio-platform-ai:test :starter:studio-platform-starter-ai:test :starter:studio-platform-starter-ai-web:test :starter:studio-platform-starter-chunking:test :studio-application-modules:content-embedding-pipeline:test
- git diff --check
Risk / Rollback
AI / Subagent Usage
- Ran git diff --check
Checklist
- AI-Assisted value is correct