PayFlow Hallucination-Controlled Billing Assistant

Table of Contents

  1. Project Overview
  2. Problem Statement
  3. Why Hallucinations Are Dangerous in Billing Systems
  4. Project Goal
  5. Model & Training Strategy
  6. Dataset Design
  7. Policy-Driven Training Approach
  8. Training Process (Step-by-Step)
  9. Evaluation Methodology
  10. Results Summary
  11. Challenges Faced
  12. Key Learnings
  13. Limitations & Future Improvements
  14. Training Configuration Notes
  15. Conclusion

1. Project Overview

This project focuses on reducing hallucinations in a Large Language Model (LLM) used for SaaS billing customer support.
The assistant is designed for a fictional company called PayFlow and must answer only using official billing policies.

The model is fine-tuned using LoRA (Low-Rank Adaptation) on top of a base instruction-tuned LLM.


2. Problem Statement

Large Language Models often produce confident but incorrect answers, known as hallucinations.
In billing and payments systems, hallucinations can cause:

  • Financial loss
  • Legal issues
  • Loss of customer trust

This project addresses the question:

How can we constrain an LLM to answer strictly from approved billing policies and safely refuse when information is unavailable?


3. Why Hallucinations Are Dangerous in Billing Systems

Examples of real-world risks:

  • Inventing refund policies
  • Claiming unsupported payment methods
  • Fabricating discounts or currencies

In enterprise systems, safe refusal is better than a wrong answer.


4. Project Goal

The goal of this project is to:

  • Reduce hallucinations in billing-related questions
  • Ensure strict policy adherence
  • Teach the model when to refuse instead of guessing
  • Quantify hallucination reduction using before/after evaluation

5. Model & Training Strategy

Base Model

  • Instruction-tuned large language model (Mistral-style architecture)

Fine-Tuning Method

  • LoRA (Low-Rank Adaptation)
  • Only a small percentage of parameters are trained
  • Base model weights remain unchanged

Why LoRA?

  • Memory efficient
  • Faster training
  • Prevents catastrophic forgetting
  • Industry-standard for alignment tasks (see the configuration sketch below)
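
To illustrate how little of the network this touches, here is a minimal sketch using the Hugging Face peft library. The checkpoint name, rank, and target modules are placeholders, not the project's published configuration:

```python
# Illustrative LoRA attachment; rank, alpha, and target modules are assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder for the Mistral-style base model
)

lora_config = LoraConfig(
    r=16,                                  # low-rank dimension (assumed)
    lora_alpha=32,                         # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections commonly adapted
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # confirms only a small fraction of weights train
```

Because the adapters are separate low-rank matrices, the base weights stay frozen and only the adapter weights are updated and saved.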

6. Dataset Design

The model is not trained directly on policy documents.

Instead:

  • billing.md acts as the human-curated source of truth
  • Policies are manually converted into instruction–response pairs (an example record is sketched after this list)
  • Only information present in the policy is allowed

Dataset Rules

  • One concept → one behavior
  • Explicit refusals for unknown information
  • No assumptions or industry defaults
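
For illustration, a record in train.json might look roughly like the following. The field names and answers here are hypothetical; the project's actual schema and policy content may differ:

```python
# Hypothetical structure for train.json; the real schema may differ.
import json

records = [
    {
        "instruction": "What payment methods does PayFlow accept?",
        # Illustrative answer; the real text is copied verbatim from billing.md.
        "response": "PayFlow accepts the payment methods listed in its billing policy.",
    },
    {
        # Trap question: the policy is silent, so the target is the standard refusal.
        "instruction": "Does PayFlow offer student discounts?",
        "response": "This information is not available in PayFlow's billing policy.",
    },
]

with open("train.json", "w") as f:
    json.dump(records, f, indent=2)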

7. Policy-Driven Training Approach

The model is trained to follow this rule:

Answer only what is explicitly stated in PayFlow’s billing policy.
If information is missing, respond with a standard refusal.

Standard refusal phrase:

This information is not available in PayFlow’s billing policy.

This consistency is critical for hallucination control.
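
One way to keep the rule and the refusal wording consistent is to hard-code them as constants in the prompt-building code. The template below is an assumption for illustration, not the project's exact prompt:

```python
# Hypothetical constants wiring the policy rule and standard refusal into prompts.
SYSTEM_RULE = (
    "Answer only what is explicitly stated in PayFlow's billing policy. "
    "If the information is missing, respond with the standard refusal."
)
STANDARD_REFUSAL = "This information is not available in PayFlow's billing policy."

def build_prompt(question: str) -> str:
    """Format a single prompt; the template actually used for training may differ."""
    return f"{SYSTEM_RULE}\n\nQuestion: {question}\nAnswer:"
```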


8. Training Process (Step-by-Step)

  1. Environment setup in Google Colab
  2. Load base model in 4-bit precision
  3. Attach LoRA adapters
  4. Load curated dataset (train.json)
  5. Fine-tune LoRA adapters for 2–3 epochs
  6. Save LoRA adapter weights only

No full model retraining is performed.
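
A condensed sketch of these steps, assuming the Hugging Face transformers, peft, datasets, and bitsandbytes stack typically used in Colab; the model name and hyperparameters are placeholders:

```python
# Sketch of steps 2-6 above, assuming bitsandbytes, peft, and datasets in Colab.
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder Mistral-style model

# Step 2: load the base model in 4-bit precision.
bnb_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, quantization_config=bnb_config, device_map="auto"
)

# Step 3: attach LoRA adapters (configuration as sketched in section 5).
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

# Step 4: load the curated instruction-response pairs.
dataset = load_dataset("json", data_files="train.json", split="train")

# Steps 5-6: fine-tune for 2-3 epochs (trainer loop omitted), then save adapters only.
# model.save_pretrained("payflow-lora-adapter")  # writes adapter weights, not the base model
```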


9. Evaluation Methodology

Evaluation is performed using:

  • Known policy questions
  • Edge cases
  • Trap questions (questions not covered in policy)

Each response is classified as (see the scoring sketch after this list):

  • ✅ Correct
  • ⚠️ Safe but imperfect (over-refusal / verbosity)
  • ❌ Hallucination (fabricated information)
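
A rough way to approximate this labelling automatically is simple string matching against the expected answer and the standard refusal phrase. The project's grading was done by inspection, so treat this as a sketch rather than the actual harness:

```python
# Naive string-matching approximation of the manual correct / safe / hallucination labels.
STANDARD_REFUSAL = "This information is not available in PayFlow's billing policy."

def classify(answer: str, expected: str, in_policy: bool) -> str:
    """Label one response for a question that is (or is not) covered by the policy."""
    if not in_policy:
        # Trap question: the only safe behaviour is the standard refusal.
        return "correct" if STANDARD_REFUSAL in answer else "hallucination"
    if expected.lower() in answer.lower():
        return "correct"
    if STANDARD_REFUSAL in answer:
        return "safe_but_imperfect"  # over-refusal on a covered question
    return "hallucination"
```

The hallucination rate reported in the next section is then simply the share of responses labelled as hallucinations.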

10. Results Summary

  • Hallucination rate before training: ~60–70%
  • Hallucination rate after training: ~8–10%
  • Approximate relative reduction: 75–90%

Detailed before/after comparisons are documented separately in RESULTS.md.


11. Challenges Faced

1. Partial Alignment Overconfidence

The model initially hallucinated more confidently after partial fine-tuning.

2. Over-Refusal

Excessive refusal occurred when refusal samples outweighed valid answers.

3. Dataset Contradictions

Similar questions with different expected behaviors caused instability.

Each issue was resolved through dataset normalization and retraining.


12. Key Learnings

  • Hallucination reduction is a data problem, not a model problem
  • Models hallucinate where policies are silent
  • Over-refusal is safer than hallucination but must be balanced
  • Alignment is iterative, not one-shot

13. Limitations & Future Improvements

  • Some verbosity and response blending remain
  • Reducing the hallucination rate below 5% would likely require a larger dataset
  • Automated policy-to-dataset generation could improve scalability

14. Training Configuration Notes

LoRA adapters were trained for 3 epochs, with training loss decreasing steadily to a final value of ≈ 1.14, indicating effective policy alignment.
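
For reference, a hypothetical transformers TrainingArguments block consistent with a 3-epoch run might look like this; apart from the epoch count, none of these values are documented in the repository:

```python
# Hypothetical hyperparameters consistent with the notes above; actual values are not published.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="payflow-lora",
    num_train_epochs=3,               # matches the run described above
    per_device_train_batch_size=4,    # assumed
    gradient_accumulation_steps=4,    # assumed
    learning_rate=2e-4,               # common LoRA starting point (assumed)
    logging_steps=10,                 # watch training loss trend toward ~1.14
    save_strategy="epoch",
)
```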

15. Conclusion

This project demonstrates a practical, enterprise-grade approach to hallucination control using LoRA.

Rather than chasing perfect answers, the model is trained to:

  • Respect policy boundaries
  • Avoid guessing
  • Fail safely

This approach mirrors how real-world AI systems are deployed in billing and finance domains.

