
Error handling for Grobid when not responding #35

Open
Sanakhamassi wants to merge 1 commit into ScienciaLAB:main from Sanakhamassi:fix/Display-an-error-message-when-grobid-is-not-responding

Conversation


@Sanakhamassi Sanakhamassi commented Apr 23, 2026

Related to issue #11


Copilot AI left a comment


Pull request overview

Adds explicit error handling for Grobid failures so the Streamlit UI can surface a clear “please try later” message instead of failing ambiguously (issue #11).

Changes:

  • Introduce GrobidServiceError and raise it when Grobid errors or returns non-200.
  • Catch GrobidServiceError in the Streamlit upload/embedding flow and display an error message.
  • Add a (currently redundant) guard in DocumentQAEngine for missing Grobid output.
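
The exception described in the first bullet could look roughly like the sketch below. The PR's diff only shows that `GrobidServiceError` accepts a message plus an optional `status_code` keyword and that callers read `exc.status_code`; the constructor body, type hints, and docstring here are assumptions, not the code from `document_qa/grobid_processors.py`.

```python
from typing import Optional


class GrobidServiceError(Exception):
    """Signals that Grobid failed to respond or returned a non-200 status.

    Assumed shape: the PR only shows the message argument and the
    optional ``status_code`` keyword being used.
    """

    def __init__(self, message: str, status_code: Optional[int] = None):
        super().__init__(message)
        # None means no HTTP status was available (e.g. the call itself failed).
        self.status_code = status_code


# Usage: one error with a status, one without.
with_status = GrobidServiceError("Grobid service returned status 503.", status_code=503)
without_status = GrobidServiceError("Grobid service did not respond.")
```

Keeping `status_code` optional lets the same exception type cover both the "no response at all" case and the "responded with an error status" case.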

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| streamlit_app.py | Catches Grobid failures during embedding creation and shows a user-facing error. |
| document_qa/grobid_processors.py | Defines GrobidServiceError and raises it from Grobid processing on failure/non-200. |
| document_qa/document_qa_engine.py | Imports/raises GrobidServiceError when Grobid structure is missing. |


Comment thread streamlit_app.py
Comment on lines +331 to +335
status = f" (status {exc.status_code})" if exc.status_code else ""
st.session_state['doc_id'] = None
st.session_state['loaded_embeddings'] = False
st.session_state['uploaded'] = False
st.error(f"Grobid is not responding{status}. Please try later.")
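
To see what string the `st.error` call above ends up showing, the message construction can be pulled into a small helper and run without Streamlit. The helper name `format_grobid_error` is hypothetical and does not appear in the PR; only the two f-strings are taken from the diff.

```python
from typing import Optional


def format_grobid_error(status_code: Optional[int]) -> str:
    """Build the user-facing message (hypothetical helper, not part of the PR)."""
    # Same expression as in the diff: the suffix carries its own leading space,
    # and a falsy status code (None or 0) yields no suffix at all.
    status = f" (status {status_code})" if status_code else ""
    return f"Grobid is not responding{status}. Please try later."


print(format_grobid_error(503))
print(format_grobid_error(None))
```

With a status of 503 this yields "Grobid is not responding (status 503). Please try later."; without a status the parenthetical is omitted entirely.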
Comment thread streamlit_app.py
Comment thread document_qa/grobid_processors.py
Comment thread document_qa/document_qa_engine.py
Comment on lines 107 to +125
     def process_structure(self, input_path, coordinates=False):
-        pdf_file, status, text = self.grobid_client.process_pdf("processFulltextDocument",
-                                                                input_path,
-                                                                consolidate_header=True,
-                                                                consolidate_citations=False,
-                                                                segment_sentences=False,
-                                                                tei_coordinates=coordinates,
-                                                                include_raw_citations=False,
-                                                                include_raw_affiliations=False,
-                                                                generateIDs=True)
+        try:
+            pdf_file, status, text = self.grobid_client.process_pdf("processFulltextDocument",
+                                                                    input_path,
+                                                                    consolidate_header=True,
+                                                                    consolidate_citations=False,
+                                                                    segment_sentences=False,
+                                                                    tei_coordinates=coordinates,
+                                                                    include_raw_citations=False,
+                                                                    include_raw_affiliations=False,
+                                                                    generateIDs=True)
+        except Exception as exc:
+            raise GrobidServiceError("Grobid service did not respond.") from exc
 
         if status != 200:
-            return
+            raise GrobidServiceError(
+                f"Grobid service returned status {status}.",
+                status_code=status
+            )
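
The behaviour of this change can be exercised without a running Grobid instance. The sketch below reproduces the try/except and status check from the diff against a stub client; `StubGrobidClient`, the simplified free function `process_structure`, and its return value are assumptions for illustration and differ from the real method, which is only partly shown in the diff.

```python
from typing import Optional


class GrobidServiceError(Exception):
    """Assumed shape of the PR's exception: message plus optional status code."""

    def __init__(self, message: str, status_code: Optional[int] = None):
        super().__init__(message)
        self.status_code = status_code


class StubGrobidClient:
    """Hypothetical stand-in for the real client: canned response or failure."""

    def __init__(self, status: int = 200, text: str = "<TEI/>", fail: bool = False):
        self.status = status
        self.text = text
        self.fail = fail

    def process_pdf(self, service, input_path, **kwargs):
        if self.fail:
            raise ConnectionError("connection refused")
        # Mirrors the (pdf_file, status, text) tuple unpacked in the diff.
        return input_path, self.status, self.text


def process_structure(client, input_path):
    """Simplified version of the patched logic; returns the raw text on success."""
    try:
        _, status, text = client.process_pdf("processFulltextDocument", input_path)
    except Exception as exc:
        # Any client-side failure is reported as "did not respond",
        # with the original exception kept as __cause__.
        raise GrobidServiceError("Grobid service did not respond.") from exc

    if status != 200:
        raise GrobidServiceError(
            f"Grobid service returned status {status}.",
            status_code=status,
        )
    return text


try:
    process_structure(StubGrobidClient(status=503), "paper.pdf")
except GrobidServiceError as err:
    print(err, err.status_code)
```

Because the `except Exception` clause is broad, it also converts unrelated client errors (for example a bad file path) into "did not respond"; whether that is acceptable depends on how specific the UI message needs to be.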
Collaborator

@lfoppiano lfoppiano left a comment


There is a missing space; the rest looks fine. I have not tested it, so please make sure you test it before merging or squashing.

Comment thread streamlit_app.py
st.session_state['doc_id'] = None
st.session_state['loaded_embeddings'] = False
st.session_state['uploaded'] = False
st.error(f"Grobid is not responding{status}. Please try later.")
Collaborator


There seems to be a missing space here.


3 participants