🛡️ EduShield

AI-Powered Plagiarism & Authorship Detection System

FastAPI · React · PostgreSQL · AWS S3 · HuggingFace

A full-stack academic integrity platform built as a BSc IT Final Year Project


📖 Overview

EduShield is an academic integrity system that detects plagiarism and AI-generated content in student documents. It supports two distinct workflows: a classroom-based Faculty–Student pipeline for institutional use, and a Personal Document Checker for individual users. This makes it equally useful for educators and independent researchers.


✨ Features at a Glance

| Feature | Student | Faculty | Personal User |
| --- | --- | --- | --- |
| Upload Documents | ✅ | ✅ | ✅ |
| Plagiarism Detection | ✅ | ✅ | ✅ |
| AI Content Detection | ✅ | ✅ | ✅ |
| Downloadable PDF Report | - | - | ✅ |
| View Assignment Feedback | ✅ | - | - |
| Manage Sections | - | ✅ | - |
| Review Submissions | - | ✅ | - |

🚀 Key Modules

🔐 Authentication & Role Management

  • Secure JWT-based authentication
  • Three distinct roles: Faculty, Student, Personal User
  • Role-based UI and API access control

👩‍🏫 Faculty Module

  • Create classroom sections with unique access codes
  • Upload reference assignments per section
  • View and review all student submissions
  • Detect:
    • Student-to-reference plagiarism
    • Student-to-student cross-comparison
  • Add remarks, grades, and resubmission requests
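
Student-to-student cross-comparison can be pictured as scoring every pair of submissions within a section. The sketch below assumes each submission has already been reduced to a set of fingerprint hashes and uses Jaccard similarity as the pairwise score. The README does not say which ratio EduShield uses, so the metric, the `flag_pairs` helper, and the 0.40 cut-off are illustrative assumptions.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Shared fingerprints divided by all fingerprints (0.0 when both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def flag_pairs(fingerprints: dict, threshold: float = 0.40) -> list:
    """Return (student_a, student_b, score) for pairs scoring above the threshold.

    `fingerprints` maps a student identifier to that submission's set of hashes.
    Results are ordered from the highest score to the lowest.
    """
    flagged = []
    for name_a, name_b in combinations(sorted(fingerprints), 2):
        score = jaccard(fingerprints[name_a], fingerprints[name_b])
        if score > threshold:
            flagged.append((name_a, name_b, round(score, 3)))
    return sorted(flagged, key=lambda row: row[2], reverse=True)
```

Comparing all pairs grows quadratically with the number of submissions, which is usually acceptable for a single classroom section but would need indexing for much larger groups.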

👨‍🎓 Student Module

  • Join sections via access code
  • Upload assignments (.pdf / .docx)
  • View plagiarism percentage and AI detection results
  • Track submission attempts, deadlines, and faculty feedback
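
The upload rules listed above (accepted file types, attempt tracking, deadlines) could be checked before a file is stored. The following is a minimal sketch under assumed names: `check_submission`, `max_attempts`, and the wording of the messages are hypothetical, and the README does not describe how EduShield enforces these rules.

```python
from datetime import datetime, timezone
from pathlib import Path

ALLOWED_EXTENSIONS = {".pdf", ".docx"}  # the two formats the Student Module lists


def check_submission(filename, attempts_used, max_attempts, deadline, now=None):
    """Return a list of problems; an empty list means the upload may proceed.

    `deadline` and `now` are expected to be timezone-aware datetimes.
    """
    now = now or datetime.now(timezone.utc)
    problems = []
    if Path(filename).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("unsupported file type")
    if attempts_used >= max_attempts:
        problems.append("no attempts remaining")
    if now > deadline:
        problems.append("deadline has passed")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets the interface show the student a complete list in a single response.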

📄 Personal Document Checker

Upload any academic document and get a detailed integrity report:

  • Domain-based analysis: Computer Science · Science · Commerce · General
  • Fingerprint-based plagiarism detection against a public domain corpus
  • AI content detection using a transformer model
  • Severity classification: Low · Medium · High
  • Highlighted matched phrases in results
  • Downloadable PDF report: professional and shareable
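
One way to recover the matched phrases that a report highlights is to mark every word of the document that falls inside a word shingle also found in a source text, then join adjacent marked words into phrases. This is only a sketch: the shingle size of 4 and the `matched_phrases` helper are assumptions, and the README does not describe how EduShield selects the phrases it highlights.

```python
import re


def _words(text: str) -> list:
    """Lowercase the text and keep alphanumeric word tokens only."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matched_phrases(document: str, source: str, k: int = 4) -> list:
    """Return runs of document words covered by k-word shingles that also occur in source."""
    doc, src = _words(document), _words(source)
    source_shingles = {tuple(src[i:i + k]) for i in range(len(src) - k + 1)}

    # Mark each document word that belongs to at least one shared shingle.
    covered = [False] * len(doc)
    for i in range(len(doc) - k + 1):
        if tuple(doc[i:i + k]) in source_shingles:
            covered[i:i + k] = [True] * k

    # Join consecutive marked words into phrases (adjacent matches merge into one run).
    phrases, run = [], []
    for word, hit in zip(doc, covered):
        if hit:
            run.append(word)
        elif run:
            phrases.append(" ".join(run))
            run = []
    if run:
        phrases.append(" ".join(run))
    return phrases
```

Because the phrases come from the normalized text, mapping them back to character offsets in the original document would be a further step before rendering highlights.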

🧠 Detection Methodology

✔ Fingerprint-Based Plagiarism Detection

Text Normalization → Word Shingling → SHA-256 Hashing → Corpus Comparison → Similarity Score
  1. Text is cleaned and broken into overlapping word shingles
  2. Each shingle is hashed using SHA-256
  3. Hashes are compared against a domain-specific public corpus
  4. Similarity is derived from the fingerprint overlap ratio
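
The four steps above can be sketched in a few lines of standard-library Python. The shingle size of 5 words, the normalization rule, and the choice to divide by the document's own fingerprint count are assumptions made for illustration; the README only states that similarity comes from the fingerprint overlap ratio.

```python
import hashlib
import re


def normalize(text: str) -> list:
    """Step 1a: lowercase the text and keep only alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fingerprint(text: str, k: int = 5) -> set:
    """Steps 1b and 2: SHA-256 hex digests of every overlapping k-word shingle."""
    words = normalize(text)
    return {
        hashlib.sha256(" ".join(words[i:i + k]).encode("utf-8")).hexdigest()
        for i in range(len(words) - k + 1)
    }


def similarity(document: str, corpus_texts: list, k: int = 5) -> float:
    """Steps 3 and 4: percentage of the document's fingerprints found in the corpus."""
    doc_fp = fingerprint(document, k)
    if not doc_fp:
        return 0.0  # document shorter than one shingle
    corpus_fp = set().union(*(fingerprint(text, k) for text in corpus_texts))
    return 100.0 * len(doc_fp & corpus_fp) / len(doc_fp)
```

Hashing each shingle means the corpus can be stored and compared as fixed-length digests rather than raw text. The trade-off is that only exact shingle matches after normalization are detected, so paraphrased passages are missed.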

📊 Severity Thresholds

| Similarity Score | Severity | Interpretation |
| --- | --- | --- |
| ≤ 15% | 🟢 Low | Common academic phrasing |
| 16–40% | 🟡 Medium | Partial content resemblance |
| > 40% | 🔴 High | Strong overlap, likely plagiarism |
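
The thresholds translate directly into a small classification function. The table does not say how fractional scores between 15% and 16% are handled, so this sketch assumes that anything above 15% and up to 40% counts as Medium; `classify_severity` is a hypothetical name.

```python
def classify_severity(similarity_percent: float) -> str:
    """Map a similarity score (0 to 100) to the severity label from the table."""
    if not 0.0 <= similarity_percent <= 100.0:
        raise ValueError("similarity must be between 0 and 100")
    if similarity_percent <= 15:
        return "Low"
    if similarity_percent <= 40:  # assumption: covers everything above 15 up to 40
        return "Medium"
    return "High"
```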

🤖 AI Content Detection

  • Model: roberta-base-openai-detector (HuggingFace Transformers)
  • Output: AI Generated / Human Written + Confidence Score
  • Integrated directly into the results table for both personal and classroom uploads
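
To show how a raw classifier prediction might become the "AI Generated / Human Written + Confidence Score" output, the sketch below maps a single label/score pair to that wording. It assumes the detector returns the labels `Fake` and `Real`, which is how the published checkpoint is commonly reported to behave; verify this against the model you load. The `to_verdict` helper is hypothetical. This detector was released for GPT-2 output, so scores on text from newer models should be read with caution.

```python
def to_verdict(prediction: dict) -> dict:
    """Map one classifier prediction to the wording used in the results table."""
    label = prediction["label"].strip().lower()
    if label == "fake":  # assumed label for machine-generated text
        verdict = "AI Generated"
    elif label == "real":  # assumed label for human-written text
        verdict = "Human Written"
    else:
        raise ValueError(f"unexpected label: {prediction['label']!r}")
    return {"verdict": verdict, "confidence": round(100.0 * prediction["score"], 1)}


# With the transformers library installed, a prediction could be obtained roughly
# like this (the model is downloaded on first use, and long inputs are truncated
# to the model's token limit). The hub identifier is an assumption to verify:
#
#   from transformers import pipeline
#   detector = pipeline(
#       "text-classification",
#       model="openai-community/roberta-base-openai-detector",
#   )
#   prediction = detector(text, truncation=True)[0]
```

Keeping the mapping separate from the model call makes it easy to unit-test the result formatting without loading the model.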

📑 PDF Report Generation

Auto-generated reports include:

  • Document filename & upload timestamp
  • Plagiarism percentage + severity badge
  • AI detection result + confidence score
  • Highlighted matched phrases

Reports can be downloaded directly from the Results page.

🧰 Tech Stack

Frontend

  • React.js: Component-based SPA
  • Tailwind CSS: Utility-first styling
  • Fetch API: REST communication
  • Toast Notifications: User feedback

Backend

  • FastAPI: High-performance Python API
  • SQLAlchemy: ORM for database interaction
  • PostgreSQL: Relational data storage
  • JWT: Stateless authentication

AI / ML

  • Scikit-learn: TF-IDF & cosine similarity
  • HuggingFace Transformers: AI detection model
  • SHA-256 Fingerprint Hashing: Custom plagiarism engine

Cloud & Utilities

  • AWS S3: Secure file storage
  • ReportLab: PDF generation

📁 Project Structure

EduShield/
├── backend/
│   ├── main.py           # API routes & app entry point
│   ├── models.py         # SQLAlchemy database models
│   ├── schemas.py        # Pydantic request/response schemas
│   └── database.py       # DB connection & session config
├── frontend/
│   ├── src/
│   │   ├── pages/        # Route-level page components
│   │   ├── components/   # Reusable UI components
│   │   └── App.jsx       # Root component & routing
│   └── public/
├── uploads/              # Temporary file storage
└── README.md

⚙️ Local Setup

Prerequisites

  • Python 3.9+
  • Node.js 18+
  • PostgreSQL
  • AWS S3 bucket (for file storage)

Backend

cd backend
python -m venv venv
venv\Scripts\activate          # Windows
# source venv/bin/activate     # macOS/Linux

pip install -r requirements.txt
uvicorn main:app --reload

API available at http://localhost:8000
Interactive docs at http://localhost:8000/docs

Frontend

cd frontend
npm install
npm run dev

App available at http://localhost:3000


🔒 Access & Licensing

This repository is private. Access is granted only to collaborators explicitly added by the owner.


📈 Roadmap

  • Online corpus integration for broader plagiarism coverage
  • Semantic plagiarism detection (beyond exact fingerprints)
  • Admin analytics dashboard
  • Multi-language document support
  • Cloud deployment (AWS / Azure)
  • Email notifications for deadlines and feedback

👩‍💻 Author

Samiksha Patil
BSc Information Technology, Final Year Project
🛡️ EduShield: Integrity, Verified.
