Conversation
📝 WalkthroughWalkthroughThis PR refactors a monolithic Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 12
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
ingest_vector_db.py (1)
12-13:⚠️ Potential issue | 🟠 MajorValidate
batch_sizeas a positive integer.If
batch_sizeis0(or negative), Line 33 fails at runtime. Guard this in both argument parsing andingest(...).🛡️ Suggested fix
def ingest(json_path: str, persist_directory: str, batch_size: int = 100) -> None: + if batch_size <= 0: + raise ValueError("batch_size must be a positive integer") source_path = Path(json_path)parser.add_argument( "--batch-size", type=int, default=100, help="한 번에 처리할 문서의 개수 (기본값: 100)", )+ if args.batch_size <= 0: + raise ValueError("--batch-size는 1 이상의 정수여야 합니다.")Also applies to: 33-33, 56-59
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@ingest_vector_db.py` around lines 12 - 13, The ingest function currently assumes batch_size > 0 causing a runtime failure; add validation in ingest(json_path: str, persist_directory: str, batch_size: int = 100) to check that batch_size is an int > 0 and raise a clear ValueError if not, and also validate the CLI parsing/argparse path (where batch_size is parsed) by supplying a positive-int type or post-parse check to reject 0 or negative values before calling ingest; reference the ingest function and the argument parsing code that constructs batch_size to locate and apply these guards.
🧹 Nitpick comments (1)
app.py (1)
88-93: Narrow the fallback exception scope in retrieval path.Line 90 catches all exceptions, which can hide non-retrieval bugs and silently degrade behavior. Catch expected retrieval failures and only fallback when
base_retrieverexists.♻️ Suggested refinement
- try: - retrieved_docs = st.session_state.rerank_retriever.invoke(query) - except Exception: - st.warning("⚠️ Reranking 중 오류가 발생하여 기본 검색 결과를 사용합니다.") - retrieved_docs = st.session_state.rerank_retriever.base_retriever.invoke(query)[:5] + try: + retrieved_docs = st.session_state.rerank_retriever.invoke(query) + except Exception as e: + base = getattr(st.session_state.rerank_retriever, "base_retriever", None) + if base is None: + raise + st.warning(f"⚠️ Reranking 실패로 기본 검색 결과를 사용합니다. ({e})") + retrieved_docs = base.invoke(query)[:5]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app.py` around lines 88 - 93, The try/except around st.session_state.rerank_retriever.invoke(query) is too broad; change it to only catch the specific retrieval-related exceptions thrown by your retriever (e.g., RetrieverError, RetrievalException, or the library-specific exception type) and before falling back ensure st.session_state.rerank_retriever.base_retriever is present; if no base_retriever exists re-raise the exception so unrelated bugs aren't swallowed, and keep the existing user warning/logging when the fallback is used.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app.py`:
- Around line 123-124: The chat history is always storing final_answer_with_refs
which can be empty even when final_think (or another rendered assistant text)
was shown; change the storage logic to persist the actual rendered assistant
text by computing a rendered_text variable that uses final_answer_with_refs if
non-empty, otherwise falls back to final_think (or final_answer) so the same
text the UI displayed is saved; update any code that appends to history to use
this rendered_text and ensure refs_str/final_answer concatenation still occurs
when refs_str is present.
In `@chain.py`:
- Around line 49-55: The ChatOllama instantiation for llm lacks a request
timeout, which can cause indefinite blocking if the Ollama server is slow;
update the ChatOllama call (the llm variable) to include
client_kwargs={"timeout": 60.0} alongside the existing parameters (model,
temperature, top_p, repeat_penalty, stop) so the underlying HTTP client times
out after 60 seconds.
In `@data_pipeline.py`:
- Line 54: Remove the unnecessary f-string prefix on the print call that has no
interpolations; locate the print statement (print(f"정제 작업이 완료되었습니다!")) in
data_pipeline.py and change it to a normal string literal (remove the leading f)
so lint (F541) is not triggered.
- Around line 50-58: The percent calculation uses initial_count and will raise
on empty input; update the print block after computing final_count and
deleted_count to guard against initial_count == 0 by computing the percent only
when initial_count > 0 (e.g., set percent = deleted_count/initial_count*100 or
"N/A"/0.0 otherwise) and use that percent variable in the f-string that prints
the deletion rate; change references to initial_count, deleted_count, and
output_file in the print statements so they display a safe, explicit value when
the CSV is empty.
- Around line 92-97: The code currently directly accesses row['reference_urls']
when building the dict in processed_data.append which causes a hard failure if
the column is missing; change this to treat reference_urls as optional by
reading it with row.get('reference_urls') (or checking 'reference_urls' in row)
and defaulting to None or an empty list, and only include the key or set the
default value in the dict construction so the pipeline won't break (note
retriever.py lines 16-31 do not consume this field).
In `@README.md`:
- Line 61: Update the awkward Korean sentence in Step 2.1 (the line currently
reading "1,2 뛰어 넘어도 됨.") to the clearer phrasing "1, 2단계를 건너뛰어도 됩니다." so wording
and spacing are correct; edit the README.md Step 2.1 section to replace the old
sentence with the new, properly spaced and polite form.
In `@requirements.txt`:
- Around line 3-15: Update the minimum versions in requirements.txt to exclude
known vulnerable releases: set langchain-core>=0.3.81,
langchain-community>=0.3.27, and bump torch to a recent secure minimum (e.g.,
torch>=2.1.0 or whatever your project CI has validated); keep the other
langchain packages aligned as needed. After updating the three package lower
bounds (langchain-core, langchain-community, torch), add or update a
constraints/lock file (pip-compile/poetry lock/pip freeze) to fully pin
transitive dependencies so vulnerable versions cannot be resolved.
- Line 16: The requirements entry for bitsandbytes must be made
optional/platform-scoped to avoid failing installs on macOS/Apple Silicon;
update the requirements.txt entry for the dependency (the symbol bitsandbytes)
to include a PEP 508 environment marker such as: bitsandbytes>=0.43;
sys_platform == "linux" and platform_machine == "x86_64" (or move it into an
extras group like [extras] or requirements-dev with a similarly guarded marker),
and update the README installation section to instruct Linux x86_64 users to
install the bitsandbytes extra or platform-specific requirements when CUDA is
available.
In `@retriever.py`:
- Around line 34-39: The load_vectorstore function currently lets
OllamaEmbeddings and Chroma initialization errors propagate uncaught; wrap the
creation of OllamaEmbeddings(model=EMBEDDING_MODEL) and the Chroma(...) call in
a try/except that catches Exception, then raise a clear RuntimeError (or custom
exception) that includes context ("failed to initialize embeddings/vectorstore")
and the original exception (e.g., via str(e)) so the app startup handler can
surface actionable feedback; reference the load_vectorstore function and the
OllamaEmbeddings and Chroma initializations when making the changes.
- Around line 63-66: The current non-Qwen branch incorrectly nests the dtype
under an extra "model_kwargs" key so HuggingFaceCrossEncoder doesn't receive
torch_dtype; update the assignment to set model_kwargs as a flat dict
(torch_dtype: torch.float16) before constructing
HuggingFaceCrossEncoder(model_name=RERANKER_MODEL, model_kwargs=model_kwargs)
and keep constructing re_ranker via CrossEncoderReranker(model=model,
top_n=RERANKER_TOP_N).
In `@text_utils.py`:
- Around line 20-27: The current tag parsing is case-sensitive and brittle;
replace the string-split logic with a case-insensitive regex search (using
re.IGNORECASE and re.DOTALL) to extract the content between opening and closing
<think> tags into think_content and, if a closing tag is present, capture the
trailing text, pass it through clean_response for answer_content; also handle
the case where only an opening tag exists by capturing everything after the
opening tag as think_content and returning an empty answer_content, referencing
the variables/text handling in the existing block and the clean_response call to
preserve behavior.
In `@VERSIONS.md`:
- Line 16: The release note line "`**의존성 버전 고정**: `requirements.txt`에 최소 버전 제약을
추가하여 환경 재현성 향상.`" is inaccurate because the file uses minimum bounds rather than
strict pinning; update that heading and/or text in VERSIONS.md to read "`**최소 버전
제약**`" (or similar wording) and adjust the description to state that
`requirements.txt` uses minimum version constraints rather than exact pins so
the wording accurately reflects the dependency policy.
---
Outside diff comments:
In `@ingest_vector_db.py`:
- Around line 12-13: The ingest function currently assumes batch_size > 0
causing a runtime failure; add validation in ingest(json_path: str,
persist_directory: str, batch_size: int = 100) to check that batch_size is an
int > 0 and raise a clear ValueError if not, and also validate the CLI
parsing/argparse path (where batch_size is parsed) by supplying a positive-int
type or post-parse check to reject 0 or negative values before calling ingest;
reference the ingest function and the argument parsing code that constructs
batch_size to locate and apply these guards.
---
Nitpick comments:
In `@app.py`:
- Around line 88-93: The try/except around
st.session_state.rerank_retriever.invoke(query) is too broad; change it to only
catch the specific retrieval-related exceptions thrown by your retriever (e.g.,
RetrieverError, RetrievalException, or the library-specific exception type) and
before falling back ensure st.session_state.rerank_retriever.base_retriever is
present; if no base_retriever exists re-raise the exception so unrelated bugs
aren't swallowed, and keep the existing user warning/logging when the fallback
is used.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d975d056-e978-4b85-97bf-9c58eb2df229
📒 Files selected for processing (13)
.gitignoreREADME.mdVERSIONS.mdapp.pychain.pyconfig.pydata_cleaning.pydata_pipeline.pydata_preprocessing_for_RAG.pyingest_vector_db.pyrequirements.txtretriever.pytext_utils.py
💤 Files with no reviewable changes (2)
- data_cleaning.py
- data_preprocessing_for_RAG.py
| if refs_str: | ||
| final_answer_with_refs = final_answer + f"\n\n🔗 **참고 자료:**\n{refs_str}" |
There was a problem hiding this comment.
Persist the same assistant text that is rendered.
At Line 131, final_think can be rendered, but Line 136 always stores final_answer_with_refs. If final_answer_with_refs is empty, chat history can lose the shown assistant response on rerender.
🐛 Suggested fix
- if refs_str:
- final_answer_with_refs = final_answer + f"\n\n🔗 **참고 자료:**\n{refs_str}"
- else:
- final_answer_with_refs = final_answer
+ if final_answer and refs_str:
+ final_answer_with_refs = final_answer + f"\n\n🔗 **참고 자료:**\n{refs_str}"
+ else:
+ final_answer_with_refs = final_answer or ""
- if final_answer_with_refs:
- answer_placeholder.markdown(final_answer_with_refs)
- else:
- answer_placeholder.markdown(final_think)
+ display_content = final_answer_with_refs or final_think or ""
+ answer_placeholder.markdown(display_content)
- st.session_state.messages.append({"role": "assistant", "content": final_answer_with_refs})
+ st.session_state.messages.append({"role": "assistant", "content": display_content})Also applies to: 128-136
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app.py` around lines 123 - 124, The chat history is always storing
final_answer_with_refs which can be empty even when final_think (or another
rendered assistant text) was shown; change the storage logic to persist the
actual rendered assistant text by computing a rendered_text variable that uses
final_answer_with_refs if non-empty, otherwise falls back to final_think (or
final_answer) so the same text the UI displayed is saved; update any code that
appends to history to use this rendered_text and ensure refs_str/final_answer
concatenation still occurs when refs_str is present.
| llm = ChatOllama( | ||
| model=LLM_MODEL, | ||
| temperature=LLM_TEMPERATURE, | ||
| top_p=LLM_TOP_P, | ||
| repeat_penalty=LLM_REPEAT_PENALTY, | ||
| stop=LLM_STOP_TOKENS, | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
For langchain-ollama >=0.2, what is the correct timeout parameter for ChatOllama initialization (e.g., timeoutvsrequest_timeout)?
💡 Result:
For langchain-ollama >=0.2, there is no direct timeout or request_timeout parameter for ChatOllama initialization. Official LangChain documentation and reference pages list parameters like model, temperature, num_predict, reasoning, base_url, keep_alive, but no timeout-specific init param. ChatOllama uses the ollama Python client internally, which supports a timeout parameter in Client(host=..., timeout=...). To set timeout, use client_kwargs={"timeout": 60.0} or similar when initializing ChatOllama, as it accepts client_kwargs and async_client_kwargs for passing kwargs to the underlying ollama.Client.
Citations:
- 1: https://reference.langchain.com/python/langchain-ollama/ChatOllama
- 2: https://python.langchain.com/docs/integrations/chat/ollama/
- 3: https://github.com/langchain-ai/langchain/blob/9f232caa7a8fe1ca042a401942d5d90d54ceb1a6/libs/partners/ollama/langchain_ollama/chat_models.py
- 4: https://github.com/langchain-ai/langchain/blob/master/libs/partners/ollama/langchain_ollama/chat_models.py
- 5: https://reference.langchain.com/python/langchain-ollama/chat_models/ChatOllama
- 6: Provide request timeout option ollama/ollama-python#483
- 7: https://github.com/ollama/ollama-python/blob/main/ollama/_client.py
🏁 Script executed:
find . -name "chain.py" -type f | head -5Repository: Gyuho-Han/chatdds
Length of output: 71
🏁 Script executed:
# Look for dependency files to check langchain-ollama version
find . -type f \( -name "requirements*.txt" -o -name "pyproject.toml" -o -name "setup.py" -o -name "Pipfile" \) | head -10Repository: Gyuho-Han/chatdds
Length of output: 79
🏁 Script executed:
# Search for ChatOllama usage in the codebase
rg "ChatOllama" --type py -B 2 -A 8Repository: Gyuho-Han/chatdds
Length of output: 931
🏁 Script executed:
cat requirements.txtRepository: Gyuho-Han/chatdds
Length of output: 378
🏁 Script executed:
cat -n chain.py | sed -n '40,60p'Repository: Gyuho-Han/chatdds
Length of output: 575
🏁 Script executed:
# Check for any existing client_kwargs usage in the codebase
rg "client_kwargs" --type pyRepository: Gyuho-Han/chatdds
Length of output: 43
Add an explicit timeout to prevent indefinite blocking on Ollama requests.
ChatOllama is instantiated without a timeout, so Streamlit requests can hang indefinitely if the local Ollama server becomes slow or unavailable. Use client_kwargs={"timeout": 60.0} to set a timeout on the underlying HTTP client:
llm = ChatOllama(
model=LLM_MODEL,
temperature=LLM_TEMPERATURE,
top_p=LLM_TOP_P,
repeat_penalty=LLM_REPEAT_PENALTY,
stop=LLM_STOP_TOKENS,
client_kwargs={"timeout": 60.0}, # Add this
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@chain.py` around lines 49 - 55, The ChatOllama instantiation for llm lacks a
request timeout, which can cause indefinite blocking if the Ollama server is
slow; update the ChatOllama call (the llm variable) to include
client_kwargs={"timeout": 60.0} alongside the existing parameters (model,
temperature, top_p, repeat_penalty, stop) so the underlying HTTP client times
out after 60 seconds.
| final_count = len(df) | ||
| deleted_count = initial_count - final_count | ||
|
|
||
| print("-" * 40) | ||
| print(f"정제 작업이 완료되었습니다!") | ||
| print(f"기존 행 개수: {initial_count:,}개") | ||
| print(f"정제 후 행 개수: {final_count:,}개") | ||
| print(f"삭제된 행 개수: {deleted_count:,}개 (약 {deleted_count/initial_count*100:.1f}% 감소)") | ||
| print(f"최종 파일 저장 경로: {os.path.abspath(output_file)}") |
There was a problem hiding this comment.
Guard deletion-rate calculation for empty inputs.
Line 57 divides by initial_count without checking zero, which crashes on empty CSVs.
🐛 Suggested fix
final_count = len(df)
deleted_count = initial_count - final_count
+ deleted_ratio = (deleted_count / initial_count * 100) if initial_count else 0.0
@@
- print(f"삭제된 행 개수: {deleted_count:,}개 (약 {deleted_count/initial_count*100:.1f}% 감소)")
+ print(f"삭제된 행 개수: {deleted_count:,}개 (약 {deleted_ratio:.1f}% 감소)")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| final_count = len(df) | |
| deleted_count = initial_count - final_count | |
| print("-" * 40) | |
| print(f"정제 작업이 완료되었습니다!") | |
| print(f"기존 행 개수: {initial_count:,}개") | |
| print(f"정제 후 행 개수: {final_count:,}개") | |
| print(f"삭제된 행 개수: {deleted_count:,}개 (약 {deleted_count/initial_count*100:.1f}% 감소)") | |
| print(f"최종 파일 저장 경로: {os.path.abspath(output_file)}") | |
| final_count = len(df) | |
| deleted_count = initial_count - final_count | |
| deleted_ratio = (deleted_count / initial_count * 100) if initial_count else 0.0 | |
| print("-" * 40) | |
| print(f"정제 작업이 완료되었습니다!") | |
| print(f"기존 행 개수: {initial_count:,}개") | |
| print(f"정제 후 행 개수: {final_count:,}개") | |
| print(f"삭제된 행 개수: {deleted_count:,}개 (약 {deleted_ratio:.1f}% 감소)") | |
| print(f"최종 파일 저장 경로: {os.path.abspath(output_file)}") |
🧰 Tools
🪛 Ruff (0.15.10)
[error] 54-54: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@data_pipeline.py` around lines 50 - 58, The percent calculation uses
initial_count and will raise on empty input; update the print block after
computing final_count and deleted_count to guard against initial_count == 0 by
computing the percent only when initial_count > 0 (e.g., set percent =
deleted_count/initial_count*100 or "N/A"/0.0 otherwise) and use that percent
variable in the f-string that prints the deletion rate; change references to
initial_count, deleted_count, and output_file in the print statements so they
display a safe, explicit value when the CSV is empty.
| deleted_count = initial_count - final_count | ||
|
|
||
| print("-" * 40) | ||
| print(f"정제 작업이 완료되었습니다!") |
There was a problem hiding this comment.
Remove extraneous f-string prefix.
Line 54 has an f-string without placeholders (F541), which can fail lint if Ruff is enforced.
🧹 Suggested fix
- print(f"정제 작업이 완료되었습니다!")
+ print("정제 작업이 완료되었습니다!")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| print(f"정제 작업이 완료되었습니다!") | |
| print("정제 작업이 완료되었습니다!") |
🧰 Tools
🪛 Ruff (0.15.10)
[error] 54-54: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@data_pipeline.py` at line 54, Remove the unnecessary f-string prefix on the
print call that has no interpolations; locate the print statement (print(f"정제
작업이 완료되었습니다!")) in data_pipeline.py and change it to a normal string literal
(remove the leading f) so lint (F541) is not triggered.
| processed_data.append({ | ||
| "chunk_id": f"doc_{index}_chunk_{i}", | ||
| "title": row['title'], | ||
| "url": row['url'], | ||
| "reference_urls": row['reference_urls'], | ||
| "content_chunk": chunk, |
There was a problem hiding this comment.
Make reference_urls optional to avoid brittle schema coupling.
Line 96 directly accesses row['reference_urls']; missing column causes a hard failure. This field is currently not consumed by retriever.py (retriever.py:16-31), so this should be optional.
♻️ Suggested fix
for i, chunk in enumerate(chunks):
+ reference_urls = row.get("reference_urls", "")
+ if pd.isna(reference_urls):
+ reference_urls = ""
processed_data.append({
"chunk_id": f"doc_{index}_chunk_{i}",
"title": row['title'],
"url": row['url'],
- "reference_urls": row['reference_urls'],
+ "reference_urls": reference_urls,
"content_chunk": chunk,
})🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@data_pipeline.py` around lines 92 - 97, The code currently directly accesses
row['reference_urls'] when building the dict in processed_data.append which
causes a hard failure if the column is missing; change this to treat
reference_urls as optional by reading it with row.get('reference_urls') (or
checking 'reference_urls' in row) and defaulting to None or an empty list, and
only include the key or set the default value in the dict construction so the
pipeline won't break (note retriever.py lines 16-31 do not consume this field).
| sentence-transformers>=3.0 | ||
| transformers>=4.40 | ||
| torch>=2.0 | ||
| bitsandbytes>=0.43 |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Is bitsandbytes (>=0.43) officially supported for pip installation on macOS Apple Silicon without CUDA?
💡 Result:
No, bitsandbytes (>=0.43) is not officially supported for pip installation on macOS Apple Silicon without CUDA. While CPU support for macOS arm64 is available via pip wheels starting from recent versions like 0.49.0 (with pre-release wheels explicitly for macOS 14+ arm64), full quantization features (e.g., 4-bit QLoRA, 8-bit optimizers) are limited or not supported on Apple Silicon. Official docs list experimental CPU support for LLM.int8 and QLoRA 4-bit (✅), but 8-bit optimizers (❌) and MPS/Metal support is partial/slow (🐢 for some features, ❌ for others). Prior to 0.49.0, no PyPI wheels were available for >=0.43 on macOS arm64, leading to installation failures. The library is primarily designed for CUDA/NVIDIA GPUs, and Apple Silicon support remains experimental without CUDA.
Citations:
- 1: https://huggingface.co/docs/bitsandbytes/installation
- 2: https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/docs/source/installation.mdx
- 3: https://github.com/bitsandbytes-foundation/bitsandbytes
- 4: https://github.com/bitsandbytes-foundation/bitsandbytes/releases/tag/continuous-release_main
- 5: https://github.com/bitsandbytes-foundation/bitsandbytes/blob/main/README.md
- 6: Enable publishing of macOS wheel bitsandbytes-foundation/bitsandbytes#1818
- 7: https://github.com/bitsandbytes-foundation/bitsandbytes/releases/tag/0.49.0
🏁 Script executed:
# Check if README exists and read relevant sections mentioning platform/installation
find . -maxdepth 2 -iname "readme*" -type fRepository: Gyuho-Han/chatdds
Length of output: 72
🏁 Script executed:
# Read the full requirements.txt to understand the current dependency structure
cat -n requirements.txtRepository: Gyuho-Han/chatdds
Length of output: 497
🏁 Script executed:
# Check for setup.py, pyproject.toml, setup.cfg to understand optional dependency management
ls -la | grep -E "(setup\.py|pyproject\.toml|setup\.cfg)"Repository: Gyuho-Han/chatdds
Length of output: 43
🏁 Script executed:
# Check for setup.py or pyproject.toml anywhere in the project
find . -type f -name "setup.py" -o -name "pyproject.toml" -o -name "setup.cfg"Repository: Gyuho-Han/chatdds
Length of output: 43
🏁 Script executed:
# Read the README to check for platform-specific installation documentation
head -100 README.mdRepository: Gyuho-Han/chatdds
Length of output: 2567
🏁 Script executed:
# Search for any existing platform-scoped dependencies in the requirements file
rg "platform_system|platform_machine|sys_platform" requirements.txtRepository: Gyuho-Han/chatdds
Length of output: 43
Make bitsandbytes optional/platform-scoped.
This dependency lacks official support for macOS Apple Silicon without CUDA (no PyPI wheels for >=0.43 until v0.49.0). Your README claims optimization for Apple Silicon/CPU environments, but users attempting installation on macOS will encounter failure. Restrict to Linux x86_64 where CUDA support is available.
Suggested change
-bitsandbytes>=0.43
+bitsandbytes>=0.43; platform_system == "Linux" and platform_machine == "x86_64"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| bitsandbytes>=0.43 | |
| bitsandbytes>=0.43; platform_system == "Linux" and platform_machine == "x86_64" |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@requirements.txt` at line 16, The requirements entry for bitsandbytes must be
made optional/platform-scoped to avoid failing installs on macOS/Apple Silicon;
update the requirements.txt entry for the dependency (the symbol bitsandbytes)
to include a PEP 508 environment marker such as: bitsandbytes>=0.43;
sys_platform == "linux" and platform_machine == "x86_64" (or move it into an
extras group like [extras] or requirements-dev with a similarly guarded marker),
and update the README installation section to instruct Linux x86_64 users to
install the bitsandbytes extra or platform-specific requirements when CUDA is
available.
| def load_vectorstore(persist_directory=CHROMA_DB_DIR): | ||
| """Chroma 벡터DB를 로드합니다.""" | ||
| if not Path(persist_directory).exists(): | ||
| raise FileNotFoundError("벡터DB가 아직 생성되지 않았습니다. 데이터를 먼저 임베딩하세요.") | ||
| embed = OllamaEmbeddings(model=EMBEDDING_MODEL) | ||
| return Chroma(persist_directory=persist_directory, embedding_function=embed) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify caller-side exception scope for startup loading
rg -n -C3 'except\s+FileNotFoundError' --type=py
rg -n -C3 'load_vectorstore\(|get_vectorstore\(' --type=pyRepository: Gyuho-Han/chatdds
Length of output: 1019
Handle vectorstore initialization failures explicitly.
OllamaEmbeddings and Chroma initialization errors bubble up uncaught from this function. The app's startup handler (app.py lines 34-39) only catches FileNotFoundError, so failures such as Ollama service unavailability or embedding model issues cause unhandled exceptions instead of actionable feedback.
Wrap both initialization calls in try-except:
Suggested fix
def load_vectorstore(persist_directory=CHROMA_DB_DIR):
"""Chroma 벡터DB를 로드합니다."""
if not Path(persist_directory).exists():
raise FileNotFoundError("벡터DB가 아직 생성되지 않았습니다. 데이터를 먼저 임베딩하세요.")
- embed = OllamaEmbeddings(model=EMBEDDING_MODEL)
- return Chroma(persist_directory=persist_directory, embedding_function=embed)
+ try:
+ embed = OllamaEmbeddings(model=EMBEDDING_MODEL)
+ return Chroma(persist_directory=persist_directory, embedding_function=embed)
+ except Exception as e:
+ raise RuntimeError(f"벡터DB 로드 실패: {e}") from e🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@retriever.py` around lines 34 - 39, The load_vectorstore function currently
lets OllamaEmbeddings and Chroma initialization errors propagate uncaught; wrap
the creation of OllamaEmbeddings(model=EMBEDDING_MODEL) and the Chroma(...) call
in a try/except that catches Exception, then raise a clear RuntimeError (or
custom exception) that includes context ("failed to initialize
embeddings/vectorstore") and the original exception (e.g., via str(e)) so the
app startup handler can surface actionable feedback; reference the
load_vectorstore function and the OllamaEmbeddings and Chroma initializations
when making the changes.
| model_kwargs = {"model_kwargs": {"torch_dtype": torch.float16}} | ||
|
|
||
| model = HuggingFaceCrossEncoder(model_name=RERANKER_MODEL, model_kwargs=model_kwargs) | ||
| re_ranker = CrossEncoderReranker(model=model, top_n=RERANKER_TOP_N) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
In langchain_community.cross_encoders.HuggingFaceCrossEncoder, what exact structure should model_kwargshave for sentence-transformers CrossEncoder (especially for settingtorch_dtype)?
💡 Result:
In langchain_community.cross_encoders.HuggingFaceCrossEncoder, model_kwargs is a dictionary of keyword arguments passed directly to the sentence_transformers.CrossEncoder constructor. The exact structure is a dict[str, Any], typically including keys like 'device' (e.g., 'cpu', 'cuda'), and for setting torch_dtype, use 'torch_dtype' with values such as 'float16', 'bfloat16', 'float32', or 'auto'. Example: model_kwargs = { 'device': 'cuda', 'torch_dtype': 'float16' } hf = HuggingFaceCrossEncoder(model_name="cross-encoder/ms-marco-MiniLM-L-6-v2", model_kwargs=model_kwargs) This passes model_kwargs to sentence_transformers.CrossEncoder(model_name, self.model_kwargs), and Sentence Transformers forwards model_kwargs (including torch_dtype) to the underlying Hugging Face Transformers AutoModel.from_pretrained. Other common keys: 'revision', 'trust_remote_code', 'attn_implementation' (e.g., 'flash_attention_2'), 'device_map' (e.g., 'auto').
Citations:
- 1: https://github.com/langchain-ai/langchain-community/blob/9cb89014e22585573d90fe9bd4af97e680bf164c/libs/community/langchain_community/cross_encoders/huggingface.py
- 2: https://reference.langchain.com/python/langchain-community/cross_encoders/huggingface/HuggingFaceCrossEncoder
- 3: https://reference.langchain.com/v0.3/python/community/cross_encoders/langchain_community.cross_encoders.huggingface.HuggingFaceCrossEncoder.html
- 4: https://www.sbert.net/docs/package_reference/cross_encoder.html
- 5: https://github.com/huggingface/sentence-transformers/blob/main/docs/cross_encoder/usage/efficiency.rst
- 6: https://www.sbert.net/docs/package_reference/cross_encoder/model.html
- 7: https://www.sbert.net/docs/cross_encoder/usage/efficiency.html
🏁 Script executed:
find . -name "retriever.py" -type fRepository: Gyuho-Han/chatdds
Length of output: 75
🏁 Script executed:
cat -n retriever.py | sed -n '50,80p'Repository: Gyuho-Han/chatdds
Length of output: 1417
Remove the extra model_kwargs nesting in the non-Qwen branch.
At line 63, model_kwargs should be a flat dictionary passed directly to HuggingFaceCrossEncoder, not nested under another model_kwargs key. Change:
model_kwargs = {"model_kwargs": {"torch_dtype": torch.float16}}to:
model_kwargs = {"torch_dtype": torch.float16}The extra nesting causes the torch_dtype setting to not be applied, as the CrossEncoder constructor won't recognize the outer model_kwargs key.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@retriever.py` around lines 63 - 66, The current non-Qwen branch incorrectly
nests the dtype under an extra "model_kwargs" key so HuggingFaceCrossEncoder
doesn't receive torch_dtype; update the assignment to set model_kwargs as a flat
dict (torch_dtype: torch.float16) before constructing
HuggingFaceCrossEncoder(model_name=RERANKER_MODEL, model_kwargs=model_kwargs)
and keep constructing re_ranker via CrossEncoderReranker(model=model,
top_n=RERANKER_TOP_N).
| if "<think>" in text and "</think>" in text: | ||
| parts = text.split("</think>", 1) | ||
| think_content = parts[0].split("<think>")[-1].strip() | ||
| answer_content = clean_response(parts[1]) | ||
| return think_content, answer_content | ||
| elif "<think>" in text: | ||
| think_content = text.split("<think>")[-1].strip() | ||
| return think_content, "" |
There was a problem hiding this comment.
<think> parsing is too strict and can miss valid tag variants.
Current split logic is case-sensitive and only matches exact <think>/</think>. Use a case-insensitive regex to make extraction robust.
Suggested fix
def extract_think_and_answer(text: str):
"""문자열에서 <think> 부분과 실제 답변 부분을 완벽하게 분리합니다."""
- if "<think>" in text and "</think>" in text:
- parts = text.split("</think>", 1)
- think_content = parts[0].split("<think>")[-1].strip()
- answer_content = clean_response(parts[1])
- return think_content, answer_content
- elif "<think>" in text:
- think_content = text.split("<think>")[-1].strip()
- return think_content, ""
+ match = re.search(r"<think\b[^>]*>(.*?)</think>", text, flags=re.IGNORECASE | re.DOTALL)
+ if match:
+ think_content = match.group(1).strip()
+ answer_content = clean_response(text[:match.start()] + text[match.end():])
+ return think_content, answer_content
+ elif re.search(r"<think\b[^>]*>", text, flags=re.IGNORECASE):
+ think_content = re.split(r"<think\b[^>]*>", text, flags=re.IGNORECASE)[-1].strip()
+ return think_content, ""
else:
return "", clean_response(text)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@text_utils.py` around lines 20 - 27, The current tag parsing is
case-sensitive and brittle; replace the string-split logic with a
case-insensitive regex search (using re.IGNORECASE and re.DOTALL) to extract the
content between opening and closing <think> tags into think_content and, if a
closing tag is present, capture the trailing text, pass it through
clean_response for answer_content; also handle the case where only an opening
tag exists by capturing everything after the opening tag as think_content and
returning an empty answer_content, referencing the variables/text handling in
the existing block and the clean_response call to preserve behavior.
| - **중복 코드 제거**: `app.py`와 `ingest_vector_db.py`에 중복 존재하던 `load_documents()` 함수를 `retriever.py`로 통합. | ||
| - **데이터 파이프라인 통합**: `data_cleaning.py`와 `data_preprocessing_for_RAG.py`를 `data_pipeline.py`로 통합하여 1,2단계를 한 번에 실행. | ||
| - **Reranker 모델 자동 전환**: `config.py`에서 `RERANKER_MODEL`만 변경하면 BAAI/Qwen 모델에 맞는 kwargs가 자동 적용. | ||
| - **의존성 버전 고정**: `requirements.txt`에 최소 버전 제약을 추가하여 환경 재현성 향상. |
There was a problem hiding this comment.
Release note wording is inaccurate for dependency policy.
requirements.txt currently uses minimum bounds, not strict pinning. Consider changing “의존성 버전 고정” to “최소 버전 제약”.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@VERSIONS.md` at line 16, The release note line "`**의존성 버전 고정**:
`requirements.txt`에 최소 버전 제약을 추가하여 환경 재현성 향상.`" is inaccurate because the file
uses minimum bounds rather than strict pinning; update that heading and/or text
in VERSIONS.md to read "`**최소 버전 제약**`" (or similar wording) and adjust the
description to state that `requirements.txt` uses minimum version constraints
rather than exact pins so the wording accurately reflects the dependency policy.
v0.0.5 업데이트
Summary by CodeRabbit
Release Notes
Bug Fixes
Chores