Fix flake8 F824 in api.py; add FastAPI backend, Streamlit UI, tests, and CI by Copilot · Pull Request #3 · joshuvavinith/AI_ChatBot

Copilot · 2026-03-06T20:14:20Z

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

… api.py Co-authored-by: joshuvavinith <146979257+joshuvavinith@users.noreply.github.com>

Copilot

Pull request overview

This PR expands the project from a Tkinter-only chatbot into a multi-surface app (FastAPI backend + Streamlit UI) while addressing a flake8 F824 CI failure and adding automated tests + CI.

Changes:

Introduces a FastAPI service with session-based chat, health, training, and session reset endpoints.
Refactors chatbot logic into SimpleBot (offline) + LLMBot (OpenAI) behind a ChatBot facade; isolates Tkinter GUI behind __main__.
Adds a Streamlit demo UI, pytest coverage, Docker image modes, and a GitHub Actions CI pipeline.

Reviewed changes

Copilot reviewed 6 out of 8 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
`ai_chatbot.py`	Splits core chatbot into offline + LLM backends with a unified facade; keeps GUI runnable via `__main__`.
`api.py`	Adds FastAPI REST API with in-memory session storage and retrain endpoint.
`web_demo.py`	Adds Streamlit chat UI with session state and optional LLM streaming.
`test_chatbot.py`	Adds pytest suite for `SimpleBot`, `ChatBot`, and API endpoints via `TestClient`.
`requirements.txt`	Adds dependencies for FastAPI, Streamlit, OpenAI, dotenv, and testing.
`Dockerfile`	Adds container build with runtime switch between API and Streamlit modes.
`.github/workflows/ci.yml`	Adds lint → test → docker build workflow.
`.gitignore`	Adds common Python/venv/env/test ignores and generated CSV ignore.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-03-07T07:24:53Z

api.py

+    if session_id not in _sessions:
+        _sessions[session_id] = ChatBot()
+


_get_or_create_session() constructs a brand-new ChatBot() for any new session. In pattern mode this can trigger Kaggle download / CSV training inside ChatBot.__init__, making first requests for new sessions potentially very slow and repeated. Consider keeping a pre-trained template (or a shared SimpleBot/responses) and instantiating per-session bots without re-downloading/re-training.

Copilot · 2026-03-07T07:24:53Z

api.py

+    # Retrain a fresh bot and store it as the template for new sessions
+    fresh_bot = ChatBot(dialog_file=request.dialog_file, force_offline=True)
+    patterns = fresh_bot.pattern_count
+
+    # Clear all existing sessions so next requests pick up new data
+    _sessions.clear()
+


/train creates fresh_bot from dialog_file but never stores it anywhere, so future sessions created via ChatBot() in _get_or_create_session() will not actually use the newly provided training data. If the intent is to retrain the pattern backend for subsequent sessions, persist the trained data/template (e.g., keep a module-level default bot or default dialog_file used when creating new sessions).

Copilot · 2026-03-07T07:24:53Z

api.py

+    """Reload pattern-matching data from a CSV file on the server.
+
+    The file must exist on the server filesystem.  This endpoint only
+    affects future sessions created after the reload (existing sessions
+    keep their current bot instance).
+    """
+    if not os.path.exists(request.dialog_file):
+        raise HTTPException(
+            status_code=404,
+            detail=f"File not found: {request.dialog_file}",
+        )
+
+    # Retrain a fresh bot and store it as the template for new sessions
+    fresh_bot = ChatBot(dialog_file=request.dialog_file, force_offline=True)
+    patterns = fresh_bot.pattern_count
+
+    # Clear all existing sessions so next requests pick up new data
+    _sessions.clear()
+


The /train docstring says existing sessions “keep their current bot instance”, but the implementation clears _sessions, which deletes all existing sessions. Either update the docstring to match the behavior or change the logic to preserve existing session bots while only affecting newly created sessions.

Copilot · 2026-03-07T07:24:54Z

api.py

+    if not os.path.exists(request.dialog_file):
+        raise HTTPException(
+            status_code=404,
+            detail=f"File not found: {request.dialog_file}",
+        )


/train accepts an arbitrary filesystem path from the client and returns an error that echoes the full path. This enables filesystem probing and can let remote clients trigger reads of unintended server-local files. Consider restricting training files to an allowlisted directory, accepting uploads instead of paths, and avoiding echoing the raw path in the error message.

Copilot · 2026-03-07T07:24:54Z

ai_chatbot.py

+        with open(temp_csv_path, "w", encoding="utf-8") as fh:
+            fh.write("dialog_id,line_id,text\n")
+            for i in range(0, len(lines) - 1, 2):
+                q, a = lines[i].strip(), lines[i + 1].strip()
+                if q and a:
+                    dialog_id += 1
+                    fh.write(f"{dialog_id},1,{q}\n")
+                    fh.write(f"{dialog_id},2,{a}\n")


_parse_kaggle_dialogs() writes CSV rows via string interpolation without escaping. If a question/answer contains commas, quotes, or newlines, the generated converted_dialog.csv will be malformed and training will mis-parse rows. Use csv.writer (with proper quoting) to emit the CSV instead of manual fh.write(...) formatting.

Suggested change

with open(temp_csv_path, "w", encoding="utf-8") as fh:

fh.write("dialog_id,line_id,text\n")

for i in range(0, len(lines) - 1, 2):

q, a = lines[i].strip(), lines[i + 1].strip()

if q and a:

dialog_id += 1

fh.write(f"{dialog_id},1,{q}\n")

fh.write(f"{dialog_id},2,{a}\n")

with open(temp_csv_path, "w", encoding="utf-8", newline="") as fh:

writer = csv.writer(fh)

writer.writerow(["dialog_id", "line_id", "text"])

for i in range(0, len(lines) - 1, 2):

q, a = lines[i].strip(), lines[i + 1].strip()

if q and a:

dialog_id += 1

writer.writerow([dialog_id, 1, q])

writer.writerow([dialog_id, 2, a])

Copilot · 2026-03-07T07:24:54Z

test_chatbot.py

+import os
+import tempfile


test_chatbot.py has unused imports (os, tempfile, and the top-level TestClient is re-imported inside the fixture). This adds noise and can trigger flake8 F401 if linting is tightened. Consider removing unused imports and keeping a single TestClient import.

Suggested change

import os

import tempfile

Copilot · 2026-03-07T07:24:54Z

test_chatbot.py

+        data = resp.json()
+        assert data["status"] == "retrained"
+        assert data["patterns_loaded"] > 0
+


The current FastAPI tests don’t verify that /train actually affects subsequent /chat sessions (e.g., that a new session after training responds using the newly provided dialog file). Adding a test for this would catch regressions and would have surfaced the missing persistence of fresh_bot/dialog data.

Suggested change

def test_train_affects_new_chat_session(self, api_client):

client, _ = api_client

# Create a new dialog file with a distinctive response to "hello"

new_dialog_content = """\

dialog_id,line_id,text

1,1,hello

1,2,Hello from NEW dialog!

"""

tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False)

try:

tmp.write(new_dialog_content)

tmp.flush()

tmp_path = tmp.name

finally:

tmp.close()

try:

# Retrain the API using the new dialog file

resp = client.post("/train", json={"dialog_file": tmp_path})

assert resp.status_code == 200

data = resp.json()

assert data["status"] == "retrained"

assert data["patterns_loaded"] > 0

# Start a new chat session (no session_id) and ensure the reply

# reflects the newly trained dialog file.

chat_resp = client.post("/chat", json={"message": "hello"})

assert chat_resp.status_code == 200

chat_data = chat_resp.json()

assert "reply" in chat_data

assert "Hello from NEW dialog" in chat_data["reply"]

finally:

# Clean up the temporary dialog file

if os.path.exists(tmp_path):

os.remove(tmp_path)

Copilot · 2026-03-07T07:24:54Z

web_demo.py

+@st.cache_resource
+def _get_bot_factory() -> ChatBot:
+    """Return a template bot (loads data once); actual per-session bots copy from this."""
+    return ChatBot()
+
+
+# Per-session bot stored in session_state so each browser tab/user gets its own history
+if "bot" not in st.session_state:
+    st.session_state.bot = _get_bot_factory()


@st.cache_resource causes _get_bot_factory() to return the same ChatBot instance across Streamlit sessions/users. Because ChatBot holds conversation history, this can leak context between different browser tabs/users and contradicts the comment about “per-session bots”. Consider caching only the training data (or a factory function) and creating a fresh ChatBot per st.session_state, or ensure the cached object is immutable and per-session bots have independent history.

Suggested change

@st.cache_resource

def _get_bot_factory() -> ChatBot:

"""Return a template bot (loads data once); actual per-session bots copy from this."""

return ChatBot()

# Per-session bot stored in session_state so each browser tab/user gets its own history

if "bot" not in st.session_state:

st.session_state.bot = _get_bot_factory()

# Per-session bot stored in session_state so each browser tab/user gets its own history

if "bot" not in st.session_state:

st.session_state.bot = ChatBot()

joshuvavinith · 2026-03-07T07:28:07Z

@copilot Resolve the conflicts

…atbot.py, web_demo.py and test_chatbot.py Co-authored-by: joshuvavinith <146979257+joshuvavinith@users.noreply.github.com>

Copilot · 2026-03-07T07:35:26Z

@copilot Resolve the conflicts

Conflicts resolved in commit 70afb69. The three files that diverged from main (.github/workflows/ci.yml, README.md, requirements.txt) have been brought up to date with the latest main. I also addressed the automated review feedback at the same time:

api.py — Added _default_dialog_file module variable so /train persists the retrained data for all subsequent new sessions; fixed the /train docstring and removed the file path from the error response.
ai_chatbot.py — _parse_kaggle_dialogs now uses csv.writer for proper quoting instead of string interpolation.
web_demo.py — Removed the @st.cache_resource factory pattern that would have leaked conversation history between different browser users.
test_chatbot.py — Added test_train_affects_new_chat_session to verify that a /train call actually influences new sessions; removed unused imports.

All 20 tests pass and flake8 reports 0 errors.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

api.kaggle.com
- Triggering command: /home/REDACTED/.local/bin/pytest pytest test_chatbot.py -v --tb=short (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

joshuvavinith · 2026-03-07T08:23:12Z

@copilot some CI tests has failed, find and resolve

…ges reverted by merge commit Co-authored-by: joshuvavinith <146979257+joshuvavinith@users.noreply.github.com>

Copilot · 2026-03-07T08:34:46Z

@copilot some CI tests has failed, find and resolve

Fixed in commit 741ac64. The CI failure was caused by the merge commit f2395b4 ("Merge branch 'main' into copilot/fix-failed-tasks") which reverted three previously fixed files. The test test_train_affects_new_chat_session was failing in Python 3.11 because:

api.py — the _default_dialog_file module variable and all related code was stripped out, so after calling /train, new sessions created by /chat fell through to the Kaggle dataset (which was cached on the CI runner) instead of using the retrained dialog file
test_chatbot.py — the unused top-level TestClient import was re-added (flake8 F401)
web_demo.py — the @st.cache_resource factory pattern (history leak across users) and import os (unused) were re-added

The fix restores all three files. Additionally, _get_or_create_session now passes force_offline=True when a custom dialog file has been configured via /train, which prevents any Kaggle fallback even if the dataset is pre-cached on the runner.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

api.kaggle.com
- Triggering command: /home/REDACTED/.local/bin/pytest pytest test_chatbot.py -v --tb=short (dns block)
- Triggering command: /home/REDACTED/.local/bin/pytest pytest test_chatbot.py -v --tb=short --uid-owner 0 -j ACCEPT (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

Initial plan

a58b552

Copilot AI assigned Copilot and joshuvavinith Mar 6, 2026

Copilot started work on behalf of joshuvavinith March 6, 2026 20:14 View session

Fix CI failures: add new chatbot modules and fix flake8 F824 error in…

33e1b68

… api.py Co-authored-by: joshuvavinith <146979257+joshuvavinith@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Fix failures in task execution~~ Fix flake8 F824 in api.py; add FastAPI backend, Streamlit UI, tests, and CI Mar 6, 2026

Copilot finished work on behalf of joshuvavinith March 6, 2026 20:21

joshuvavinith requested a review from Copilot March 7, 2026 07:19

Copilot started reviewing on behalf of joshuvavinith March 7, 2026 07:20 View session

Copilot AI reviewed Mar 7, 2026

View reviewed changes

Copilot started work on behalf of joshuvavinith March 7, 2026 07:28 View session

Resolve conflicts with main; address review feedback in api.py, ai_ch…

70afb69

…atbot.py, web_demo.py and test_chatbot.py Co-authored-by: joshuvavinith <146979257+joshuvavinith@users.noreply.github.com>

Copilot finished work on behalf of joshuvavinith March 7, 2026 07:36

Merge branch 'main' into copilot/fix-failed-tasks

f2395b4

Copilot started work on behalf of joshuvavinith March 7, 2026 08:23 View session

Fix CI failure: restore _default_dialog_file in api.py and other chan…

741ac64

…ges reverted by merge commit Co-authored-by: joshuvavinith <146979257+joshuvavinith@users.noreply.github.com>

Copilot finished work on behalf of joshuvavinith March 7, 2026 08:35

joshuvavinith marked this pull request as ready for review March 7, 2026 08:38

joshuvavinith merged commit 6b72556 into main Mar 7, 2026
10 checks passed

		if session_id not in _sessions:
		_sessions[session_id] = ChatBot()

+    def test_train_affects_new_chat_session(self, api_client):
+        client, _ = api_client
+        # Create a new dialog file with a distinctive response to "hello"
+        new_dialog_content = """\
+dialog_id,line_id,text
+,1,hello
+,2,Hello from NEW dialog!
+"""
+        tmp = tempfile.NamedTemporaryFile(mode="w", suffix=".csv", delete=False)
+        try:
+            tmp.write(new_dialog_content)
+            tmp.flush()
+            tmp_path = tmp.name
+        finally:
+            tmp.close()
+        try:
+            # Retrain the API using the new dialog file
+            resp = client.post("/train", json={"dialog_file": tmp_path})
+            assert resp.status_code == 200
+            data = resp.json()
+            assert data["status"] == "retrained"
+            assert data["patterns_loaded"] > 0
+            # Start a new chat session (no session_id) and ensure the reply
+            # reflects the newly trained dialog file.
+            chat_resp = client.post("/chat", json={"message": "hello"})
+            assert chat_resp.status_code == 200
+            chat_data = chat_resp.json()
+            assert "reply" in chat_data
+            assert "Hello from NEW dialog" in chat_data["reply"]
+        finally:
+            # Clean up the temporary dialog file
+            if os.path.exists(tmp_path):
+                os.remove(tmp_path)

Conversation

Copilot AI commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Mar 7, 2026

Choose a reason for hiding this comment

Uh oh!

joshuvavinith commented Mar 7, 2026

Uh oh!

Copilot AI commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

I tried to connect to the following addresses, but was blocked by firewall rules:

Uh oh!

joshuvavinith commented Mar 7, 2026

Uh oh!

Copilot AI commented Mar 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

I tried to connect to the following addresses, but was blocked by firewall rules:

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Copilot AI commented Mar 6, 2026 •

edited

Loading

Copilot AI commented Mar 7, 2026 •

edited

Loading

Copilot AI commented Mar 7, 2026 •

edited

Loading