Skip to content

Fix flake8 F824 in api.py; add FastAPI backend, Streamlit UI, tests, and CI#3

Merged
joshuvavinith merged 5 commits intomainfrom
copilot/fix-failed-tasks
Mar 7, 2026
Merged

Fix flake8 F824 in api.py; add FastAPI backend, Streamlit UI, tests, and CI#3
joshuvavinith merged 5 commits intomainfrom
copilot/fix-failed-tasks

Conversation

Copy link
Contributor

Copilot AI commented Mar 6, 2026

  • Identified CI failure: flake8 F824 error in api.py - unused global _default_bot statement
  • Updated ai_chatbot.py with refactored version (LLM + pattern-matching backends, ChatBot facade, GUI only runs when executed directly)
  • Added api.py with the flake8 bug fixed (removed unused global _default_bot)
  • Added test_chatbot.py with unit/integration tests
  • Added web_demo.py Streamlit web interface
  • Added Dockerfile for web and API modes
  • Added .github/workflows/ci.yml CI workflow
  • Resolved conflicts with main: updated ci.yml (lint/test matrix/docker jobs), README.md (rewritten), requirements.txt (added pytest-cov)
  • Fixed api.py: restored _default_dialog_file module var (reverted by merge commit) so /train persists training data for new sessions; _get_or_create_session now uses force_offline=True when a dialog file is set to prevent Kaggle fallback in CI; fixed error message; updated docstring
  • Fixed ai_chatbot.py: _parse_kaggle_dialogs uses csv.writer; fixed E302 blank line
  • Fixed web_demo.py: removed @st.cache_resource factory (history leak); removed unused import os
  • Fixed test_chatbot.py: removed unused top-level TestClient import; test_train_affects_new_chat_session verifies /train persistence
  • All 20 tests pass; flake8 0 errors; CodeQL 0 alerts

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

… api.py

Co-authored-by: joshuvavinith <146979257+joshuvavinith@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix failures in task execution Fix flake8 F824 in api.py; add FastAPI backend, Streamlit UI, tests, and CI Mar 6, 2026
@joshuvavinith joshuvavinith requested a review from Copilot March 7, 2026 07:19
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR expands the project from a Tkinter-only chatbot into a multi-surface app (FastAPI backend + Streamlit UI) while addressing a flake8 F824 CI failure and adding automated tests + CI.

Changes:

  • Introduces a FastAPI service with session-based chat, health, training, and session reset endpoints.
  • Refactors chatbot logic into SimpleBot (offline) + LLMBot (OpenAI) behind a ChatBot facade; isolates Tkinter GUI behind __main__.
  • Adds a Streamlit demo UI, pytest coverage, Docker image modes, and a GitHub Actions CI pipeline.

Reviewed changes

Copilot reviewed 6 out of 8 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
ai_chatbot.py Splits core chatbot into offline + LLM backends with a unified facade; keeps GUI runnable via __main__.
api.py Adds FastAPI REST API with in-memory session storage and retrain endpoint.
web_demo.py Adds Streamlit chat UI with session state and optional LLM streaming.
test_chatbot.py Adds pytest suite for SimpleBot, ChatBot, and API endpoints via TestClient.
requirements.txt Adds dependencies for FastAPI, Streamlit, OpenAI, dotenv, and testing.
Dockerfile Adds container build with runtime switch between API and Streamlit modes.
.github/workflows/ci.yml Adds lint → test → docker build workflow.
.gitignore Adds common Python/venv/env/test ignores and generated CSV ignore.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +61 to +63
if session_id not in _sessions:
_sessions[session_id] = ChatBot()

Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_get_or_create_session() constructs a brand-new ChatBot() for any new session. In pattern mode this can trigger Kaggle download / CSV training inside ChatBot.__init__, making first requests for new sessions potentially very slow and repeated. Consider keeping a pre-trained template (or a shared SimpleBot/responses) and instantiating per-session bots without re-downloading/re-training.

Copilot uses AI. Check for mistakes.
api.py Outdated
Comment on lines +140 to +146
# Retrain a fresh bot and store it as the template for new sessions
fresh_bot = ChatBot(dialog_file=request.dialog_file, force_offline=True)
patterns = fresh_bot.pattern_count

# Clear all existing sessions so next requests pick up new data
_sessions.clear()

Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/train creates fresh_bot from dialog_file but never stores it anywhere, so future sessions created via ChatBot() in _get_or_create_session() will not actually use the newly provided training data. If the intent is to retrain the pattern backend for subsequent sessions, persist the trained data/template (e.g., keep a module-level default bot or default dialog_file used when creating new sessions).

Copilot uses AI. Check for mistakes.
Comment on lines +128 to +146
"""Reload pattern-matching data from a CSV file on the server.

The file must exist on the server filesystem. This endpoint only
affects future sessions created after the reload (existing sessions
keep their current bot instance).
"""
if not os.path.exists(request.dialog_file):
raise HTTPException(
status_code=404,
detail=f"File not found: {request.dialog_file}",
)

# Retrain a fresh bot and store it as the template for new sessions
fresh_bot = ChatBot(dialog_file=request.dialog_file, force_offline=True)
patterns = fresh_bot.pattern_count

# Clear all existing sessions so next requests pick up new data
_sessions.clear()

Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The /train docstring says existing sessions “keep their current bot instance”, but the implementation clears _sessions, which deletes all existing sessions. Either update the docstring to match the behavior or change the logic to preserve existing session bots while only affecting newly created sessions.

Copilot uses AI. Check for mistakes.
Comment on lines +134 to +138
if not os.path.exists(request.dialog_file):
raise HTTPException(
status_code=404,
detail=f"File not found: {request.dialog_file}",
)
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/train accepts an arbitrary filesystem path from the client and returns an error that echoes the full path. This enables filesystem probing and can let remote clients trigger reads of unintended server-local files. Consider restricting training files to an allowlisted directory, accepting uploads instead of paths, and avoiding echoing the raw path in the error message.

Copilot uses AI. Check for mistakes.
Comment on lines +268 to +275
with open(temp_csv_path, "w", encoding="utf-8") as fh:
fh.write("dialog_id,line_id,text\n")
for i in range(0, len(lines) - 1, 2):
q, a = lines[i].strip(), lines[i + 1].strip()
if q and a:
dialog_id += 1
fh.write(f"{dialog_id},1,{q}\n")
fh.write(f"{dialog_id},2,{a}\n")
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_parse_kaggle_dialogs() writes CSV rows via string interpolation without escaping. If a question/answer contains commas, quotes, or newlines, the generated converted_dialog.csv will be malformed and training will mis-parse rows. Use csv.writer (with proper quoting) to emit the CSV instead of manual fh.write(...) formatting.

Suggested change
with open(temp_csv_path, "w", encoding="utf-8") as fh:
fh.write("dialog_id,line_id,text\n")
for i in range(0, len(lines) - 1, 2):
q, a = lines[i].strip(), lines[i + 1].strip()
if q and a:
dialog_id += 1
fh.write(f"{dialog_id},1,{q}\n")
fh.write(f"{dialog_id},2,{a}\n")
with open(temp_csv_path, "w", encoding="utf-8", newline="") as fh:
writer = csv.writer(fh)
writer.writerow(["dialog_id", "line_id", "text"])
for i in range(0, len(lines) - 1, 2):
q, a = lines[i].strip(), lines[i + 1].strip()
if q and a:
dialog_id += 1
writer.writerow([dialog_id, 1, q])
writer.writerow([dialog_id, 2, a])

Copilot uses AI. Check for mistakes.
Comment on lines +8 to +9
import os
import tempfile
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test_chatbot.py has unused imports (os, tempfile, and the top-level TestClient is re-imported inside the fixture). This adds noise and can trigger flake8 F401 if linting is tightened. Consider removing unused imports and keeping a single TestClient import.

Suggested change
import os
import tempfile

Copilot uses AI. Check for mistakes.
data = resp.json()
assert data["status"] == "retrained"
assert data["patterns_loaded"] > 0

Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current FastAPI tests don’t verify that /train actually affects subsequent /chat sessions (e.g., that a new session after training responds using the newly provided dialog file). Adding a test for this would catch regressions and would have surfaced the missing persistence of fresh_bot/dialog data.

Suggested change
def test_train_affects_new_chat_session(self, api_client):
client, _ = api_client
# Create a new dialog file with a distinctive response to "hello"
new_dialog_content = """\
dialog_id,line_id,text
1,1,hello
1,2,Hello from NEW dialog!
"""
tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False)
try:
tmp.write(new_dialog_content)
tmp.flush()
tmp_path = tmp.name
finally:
tmp.close()
try:
# Retrain the API using the new dialog file
resp = client.post("/train", json={"dialog_file": tmp_path})
assert resp.status_code == 200
data = resp.json()
assert data["status"] == "retrained"
assert data["patterns_loaded"] > 0
# Start a new chat session (no session_id) and ensure the reply
# reflects the newly trained dialog file.
chat_resp = client.post("/chat", json={"message": "hello"})
assert chat_resp.status_code == 200
chat_data = chat_resp.json()
assert "reply" in chat_data
assert "Hello from NEW dialog" in chat_data["reply"]
finally:
# Clean up the temporary dialog file
if os.path.exists(tmp_path):
os.remove(tmp_path)

Copilot uses AI. Check for mistakes.
web_demo.py Outdated
Comment on lines +42 to +50
@st.cache_resource
def _get_bot_factory() -> ChatBot:
"""Return a template bot (loads data once); actual per-session bots copy from this."""
return ChatBot()


# Per-session bot stored in session_state so each browser tab/user gets its own history
if "bot" not in st.session_state:
st.session_state.bot = _get_bot_factory()
Copy link

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@st.cache_resource causes _get_bot_factory() to return the same ChatBot instance across Streamlit sessions/users. Because ChatBot holds conversation history, this can leak context between different browser tabs/users and contradicts the comment about “per-session bots”. Consider caching only the training data (or a factory function) and creating a fresh ChatBot per st.session_state, or ensure the cached object is immutable and per-session bots have independent history.

Suggested change
@st.cache_resource
def _get_bot_factory() -> ChatBot:
"""Return a template bot (loads data once); actual per-session bots copy from this."""
return ChatBot()
# Per-session bot stored in session_state so each browser tab/user gets its own history
if "bot" not in st.session_state:
st.session_state.bot = _get_bot_factory()
# Per-session bot stored in session_state so each browser tab/user gets its own history
if "bot" not in st.session_state:
st.session_state.bot = ChatBot()

Copilot uses AI. Check for mistakes.
@joshuvavinith
Copy link
Owner

@copilot Resolve the conflicts

…atbot.py, web_demo.py and test_chatbot.py

Co-authored-by: joshuvavinith <146979257+joshuvavinith@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Mar 7, 2026

@copilot Resolve the conflicts

Conflicts resolved in commit 70afb69. The three files that diverged from main (.github/workflows/ci.yml, README.md, requirements.txt) have been brought up to date with the latest main. I also addressed the automated review feedback at the same time:

  • api.py — Added _default_dialog_file module variable so /train persists the retrained data for all subsequent new sessions; fixed the /train docstring and removed the file path from the error response.
  • ai_chatbot.py_parse_kaggle_dialogs now uses csv.writer for proper quoting instead of string interpolation.
  • web_demo.py — Removed the @st.cache_resource factory pattern that would have leaked conversation history between different browser users.
  • test_chatbot.py — Added test_train_affects_new_chat_session to verify that a /train call actually influences new sessions; removed unused imports.

All 20 tests pass and flake8 reports 0 errors.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • api.kaggle.com
    • Triggering command: /home/REDACTED/.local/bin/pytest pytest test_chatbot.py -v --tb=short (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@joshuvavinith
Copy link
Owner

@copilot some CI tests has failed, find and resolve

…ges reverted by merge commit

Co-authored-by: joshuvavinith <146979257+joshuvavinith@users.noreply.github.com>
Copy link
Contributor Author

Copilot AI commented Mar 7, 2026

@copilot some CI tests has failed, find and resolve

Fixed in commit 741ac64. The CI failure was caused by the merge commit f2395b4 ("Merge branch 'main' into copilot/fix-failed-tasks") which reverted three previously fixed files. The test test_train_affects_new_chat_session was failing in Python 3.11 because:

  • api.py — the _default_dialog_file module variable and all related code was stripped out, so after calling /train, new sessions created by /chat fell through to the Kaggle dataset (which was cached on the CI runner) instead of using the retrained dialog file
  • test_chatbot.py — the unused top-level TestClient import was re-added (flake8 F401)
  • web_demo.py — the @st.cache_resource factory pattern (history leak across users) and import os (unused) were re-added

The fix restores all three files. Additionally, _get_or_create_session now passes force_offline=True when a custom dialog file has been configured via /train, which prevents any Kaggle fallback even if the dataset is pre-cached on the runner.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • api.kaggle.com
    • Triggering command: /home/REDACTED/.local/bin/pytest pytest test_chatbot.py -v --tb=short (dns block)
    • Triggering command: /home/REDACTED/.local/bin/pytest pytest test_chatbot.py -v --tb=short --uid-owner 0 -j ACCEPT (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@joshuvavinith joshuvavinith marked this pull request as ready for review March 7, 2026 08:38
@joshuvavinith joshuvavinith merged commit 6b72556 into main Mar 7, 2026
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants