A full-stack academic integrity platform built as a BSc IT Final Year Project
EduShield is a comprehensive academic integrity system that detects plagiarism and AI-generated content in student documents. It supports two distinct workflows: a classroom-based Faculty-Student pipeline for institutional use, and a Personal Document Checker for individual users. This makes it equally useful for educators and independent researchers.
| Feature | Student | Faculty | Personal User |
|---|---|---|---|
| Upload Documents | ✅ | ✅ | ✅ |
| Plagiarism Detection | ✅ | ✅ | ✅ |
| AI Content Detection | ✅ | ✅ | ✅ |
| Downloadable PDF Report | β | β | β |
| View Assignment Feedback | β | β | β |
| Manage Sections | ❌ | ✅ | ❌ |
| Review Submissions | ❌ | ✅ | ❌ |
- Secure JWT-based authentication
- Three distinct roles: Faculty, Student, Personal User
- Role-based UI and API access control
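Role-based access control can be pictured as a lookup from action to permitted roles. The sketch below is illustrative only: the `Role` enum, the action names, and the `is_allowed` helper are hypothetical and not taken from the EduShield codebase, and it assumes the role has already been read from a verified JWT.

```python
from enum import Enum


class Role(str, Enum):
    FACULTY = "faculty"
    STUDENT = "student"
    PERSONAL = "personal"


# Hypothetical action names; each maps to the roles allowed to perform it.
PERMISSIONS = {
    "manage_sections": {Role.FACULTY},
    "review_submissions": {Role.FACULTY},
    "join_section": {Role.STUDENT},
    "upload_document": {Role.FACULTY, Role.STUDENT, Role.PERSONAL},
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the role may perform the action; unknown actions are denied."""
    return role in PERMISSIONS.get(action, set())
```

Denying unknown actions by default keeps a forgotten entry from silently granting access.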
- Create classroom sections with unique access codes
- Upload reference assignments per section
- View and review all student submissions
- Detect:
  - Student-to-reference plagiarism
  - Student-to-student cross-comparison
- Add remarks, grades, and resubmission requests
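Student-to-student cross-comparison amounts to scoring every pair of submissions in a section. The following is a minimal, dependency-free sketch that uses cosine similarity over raw term counts. It is a simplified stand-in and not TF-IDF; whether EduShield applies its Scikit-learn TF-IDF step here is an assumption, and the function names are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_counts(text: str) -> Counter:
    """Count lowercase alphanumeric tokens in the text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors (0.0 when either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cross_compare(submissions: dict[str, str]) -> list[tuple[str, str, float]]:
    """Score every pair of submissions, most similar pair first."""
    vectors = {name: term_counts(text) for name, text in submissions.items()}
    pairs = [
        (x, y, cosine(vectors[x], vectors[y]))
        for x, y in combinations(sorted(vectors), 2)
    ]
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

The number of pairs grows quadratically with the number of submissions, which is usually acceptable for a single classroom section.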
- Join sections via access code
- Upload assignments (.pdf / .docx)
- View plagiarism percentage and AI detection results
- Track submission attempts, deadlines, and faculty feedback
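Sections are joined through a unique access code. One plausible way to generate such codes is sketched below; the code length, the alphabet, and the `new_access_code` helper are assumptions for illustration, not details taken from the project.

```python
import secrets
import string

# Assumed alphabet: uppercase letters and digits.
ALPHABET = string.ascii_uppercase + string.digits


def new_access_code(existing: set[str], length: int = 6) -> str:
    """Draw random codes until one is not already in use."""
    while True:
        code = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if code not in existing:
            return code
```

Using `secrets` rather than `random` makes the codes hard to guess, which matters because the code is the only barrier to joining a section.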
Upload any academic document and get a detailed integrity report:
- Domain-based analysis: Computer Science · Science · Commerce · General
- Fingerprint-based plagiarism detection against a public domain corpus
- AI content detection using a transformer model
- Severity classification: Low · Medium · High
- Highlighted matched phrases in results
- Downloadable PDF report: professional and shareable
Text Normalization → Word Shingling → SHA-256 Hashing → Corpus Comparison → Similarity Score
- Text is cleaned and broken into overlapping word shingles
- Each shingle is hashed using SHA-256
- Hashes are compared against a domain-specific public corpus
- Similarity is derived from the fingerprint overlap ratio
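The four steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's engine: the shingle size `k=5`, the tokenization rule, and the choice to divide by the document's own fingerprint count are all assumptions, and the function names are hypothetical.

```python
import hashlib
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and keep only alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fingerprints(text: str, k: int = 5) -> set[str]:
    """Hash every overlapping k-word shingle with SHA-256."""
    words = normalize(text)
    if len(words) < k:
        # Short texts yield a single shingle (or none when empty).
        shingles = [" ".join(words)] if words else []
    else:
        shingles = [" ".join(words[i:i + k]) for i in range(len(words) - k + 1)]
    return {hashlib.sha256(s.encode("utf-8")).hexdigest() for s in shingles}


def similarity(document: str, corpus_text: str, k: int = 5) -> float:
    """Share of the document's fingerprints also found in the corpus (0.0 to 1.0)."""
    doc = fingerprints(document, k)
    if not doc:
        return 0.0
    return len(doc & fingerprints(corpus_text, k)) / len(doc)
```

Because only exact shingle hashes are compared, changing a single word alters every shingle that contains it, so paraphrased text lowers the score quickly.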
| Similarity Score | Severity | Interpretation |
|---|---|---|
| ≤ 15% | 🟢 Low | Common academic phrasing |
| 16–40% | 🟡 Medium | Partial content resemblance |
| > 40% | 🔴 High | Strong overlap, likely plagiarism |
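The thresholds in the table translate directly into a small classifier. The function name is hypothetical, and treating fractional scores between 15 and 16 as Medium is an assumption, since the table does not cover that range.

```python
def classify_severity(similarity_percent: float) -> str:
    """Map a similarity percentage to the severity band from the table."""
    if similarity_percent <= 15:
        return "Low"
    if similarity_percent <= 40:
        # Assumption: scores between 15 and 16 fall into Medium.
        return "Medium"
    return "High"
```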
- Model: `roberta-base-openai-detector` (HuggingFace Transformers)
- Output: AI Generated / Human Written + Confidence Score
- Integrated directly into the results table for both personal and classroom uploads
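A hedged sketch of how the detector might be called through the Transformers `pipeline` API is shown below. It assumes the model reports the labels `Real` and `Fake`, which should be checked against the model's configuration; the helper names and the percentage formatting are hypothetical.

```python
def to_verdict(prediction: dict) -> dict:
    """Convert a raw {'label', 'score'} prediction into a result and a confidence percentage."""
    # Assumption: the model labels human text "Real" and generated text "Fake".
    label = "Human Written" if prediction["label"].lower() == "real" else "AI Generated"
    return {"result": label, "confidence": round(prediction["score"] * 100, 1)}


def detect_ai(text: str) -> dict:
    """Classify one document; downloads the model on first use."""
    # Imported lazily so to_verdict works without transformers installed.
    from transformers import pipeline

    classifier = pipeline("text-classification", model="roberta-base-openai-detector")
    # truncation=True clips inputs that exceed the model's maximum length.
    return to_verdict(classifier(text, truncation=True)[0])
```

In a long-running API process the pipeline would normally be created once at startup rather than on every request, since loading the model is slow.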
Auto-generated reports include:
- Document filename & upload timestamp
- Plagiarism percentage + severity badge
- AI detection result + confidence score
- Highlighted matched phrases
- Downloadable directly from the Results page
- React.js: Component-based SPA
- Tailwind CSS: Utility-first styling
- Fetch API: REST communication
- Toast Notifications: User feedback
- FastAPI: High-performance Python API
- SQLAlchemy: ORM for database interaction
- PostgreSQL: Relational data storage
- JWT: Stateless authentication
- Scikit-learn: TF-IDF & cosine similarity
- HuggingFace Transformers: AI detection model
- SHA-256 Fingerprint Hashing: Custom plagiarism engine
- AWS S3: Secure file storage
- ReportLab: PDF generation
EduShield/
├── backend/
│   ├── main.py          # API routes & app entry point
│   ├── models.py        # SQLAlchemy database models
│   ├── schemas.py       # Pydantic request/response schemas
│   └── database.py      # DB connection & session config
├── frontend/
│   ├── src/
│   │   ├── pages/       # Route-level page components
│   │   ├── components/  # Reusable UI components
│   │   └── App.jsx      # Root component & routing
│   └── public/
├── uploads/             # Temporary file storage
└── README.md
- Python 3.9+
- Node.js 18+
- PostgreSQL
- AWS S3 bucket (for file storage)
cd backend
python -m venv venv
venv\Scripts\activate # Windows
# source venv/bin/activate # macOS/Linux
pip install -r requirements.txt
uvicorn main:app --reload
API available at http://localhost:8000
Interactive docs at http://localhost:8000/docs
cd frontend
npm install
npm run dev
App available at http://localhost:3000
This repository is private. Access is granted only to collaborators explicitly added by the owner.
- Online corpus integration for broader plagiarism coverage
- Semantic plagiarism detection (beyond exact fingerprints)
- Admin analytics dashboard
- Multi-language document support
- Cloud deployment (AWS / Azure)
- Email notifications for deadlines and feedback
Samiksha Patil
BSc Information Technology, Final Year Project
🛡️ EduShield: Integrity, Verified.