[FEAT]: Adding line snippets from transcript where each field is found as an Audit Trail #297

@Cubix33

📝 Description

When the AI fills out a form, it currently writes the answer into the PDF without showing where the information came from. We need a way to see the AI's "homework." This feature makes the LLM return the exact quote from the audio transcript that supports each answer. It then creates a separate Audit Trail text file alongside the filled PDF so humans can easily verify the AI's work.

💡 Rationale

FireForm handles high-stakes emergency reporting. If an incident report is ever questioned in court or by a fire chief, officials must be able to see the raw text that justified the AI's decision. We cannot have a "black box" where data magically appears.

🛠️ Proposed Solution

We will require the LLM to return its answers as a strict JSON object in which every field carries both the final answer and the source quote. Then, we will generate a clean .txt file logging this data.

  • Logic change in src/llm.py: Update build_prompt to demand a strict JSON format with "value" and "quote" keys.
  • Logic change in src/llm.py: Add a safety check in add_response_to_json to handle cases where the LLM mistakenly returns a JSON array (which parses to a Python list) instead of a string, joining the list items with a ; to prevent crashes.
  • Logic change in src/filler.py: Extract only the "value" to fill the PDF fields, then write a formatted _audit.txt file that lists every field alongside its AI value and transcript source quote.
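
A minimal sketch of the parsing and safety check described above, assuming the LLM reply is one JSON object keyed by field name. The helper names `parse_llm_reply`, `normalize_field`, and `_to_text` are hypothetical and are not the existing functions in src/llm.py; the exact separator (`"; "`) and the fallback for a bare answer are also assumptions.

```python
import json


def _to_text(item):
    """Coerce a JSON scalar or array into a single string."""
    if item is None:
        return ""
    if isinstance(item, list):
        # The LLM returned an array where a string was expected: join with ';'.
        return "; ".join(str(part).strip() for part in item if part is not None)
    return str(item).strip()


def normalize_field(raw):
    """Return {"value": str, "quote": str} for one field of the reply."""
    if not isinstance(raw, dict):
        # Assumption: a bare answer without a quote is kept, with an empty quote.
        raw = {"value": raw, "quote": ""}
    return {
        "value": _to_text(raw.get("value")),
        "quote": _to_text(raw.get("quote")),
    }


def parse_llm_reply(reply_text):
    """Parse the raw LLM reply into {field_name: {"value", "quote"}}."""
    data = json.loads(reply_text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object keyed by field name")
    return {field: normalize_field(raw) for field, raw in data.items()}
```

Because every value passes through `_to_text`, downstream code can call string methods on `"value"` and `"quote"` without checking the type first, which is where an AttributeError would otherwise come from.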

✅ Acceptance Criteria

  • Prompt Adherence: The LLM successfully returns both the value and the quote for every field.
  • Crash Prevention: If the LLM returns a JSON array (e.g., ["John", "Jane"]) where a string is expected, the system catches it and converts it safely without raising an AttributeError.
  • Audit File Generation: A clean _audit.txt file is saved in the same directory as the output PDF, containing readable timestamps, fields, values, and quotes.
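
The audit file criterion could be met with something like the sketch below. It assumes each field has already been reduced to a dict with string "value" and "quote" keys; the function name `write_audit_file`, the layout of the file, and the timestamp format are all illustrative choices, not the project's actual implementation in src/filler.py.

```python
from datetime import datetime
from pathlib import Path


def write_audit_file(pdf_path, fields, now=None):
    """Write <pdf name>_audit.txt next to the filled PDF and return its path.

    fields: {field_name: {"value": str, "quote": str}}
    now:    optional datetime, mainly so tests can pin the timestamp.
    """
    pdf_path = Path(pdf_path)
    audit_path = pdf_path.with_name(pdf_path.stem + "_audit.txt")
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")

    lines = [f"Audit trail for {pdf_path.name}", f"Generated: {stamp}", ""]
    for name, entry in fields.items():
        # Make a missing quote visible to the human reviewer.
        quote = entry.get("quote") or "(no quote returned)"
        lines.append(f"Field: {name}")
        lines.append(f"  Value: {entry.get('value', '')}")
        lines.append(f"  Quote: {quote}")
        lines.append("")

    audit_path.write_text("\n".join(lines), encoding="utf-8")
    return audit_path
```

This uses only the standard library, so it adds nothing to the Docker image.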

📌 Additional Context

We specifically chose to generate a .txt file instead of drawing a new page inside the PDF. This avoids adding a heavy new dependency (like reportlab) to our Docker image, keeping the application lightweight and fast. Furthermore, a plain .txt file is much easier for a fire department's database system to parse later than text that has to be extracted from a PDF page.
