GitHub - komalkhatod1105/Fake_Review_Detection_Project

#  Fake Review Detection using NLP & Deep Learning

---

##  1. Introduction

In today's digital world, online reviews play an important role in influencing customer decisions. However, many platforms contain **fake or spam reviews** that mislead users.  

This project aims to build a system that can automatically classify reviews as **Real (Genuine)** or **Fake (Spam)** using **Natural Language Processing (NLP)** and **Machine Learning / Deep Learning techniques**.

---

##  2. Objectives

- Detect fake and genuine reviews automatically  
- Apply NLP techniques to process text data  
- Convert text into numerical features using TF-IDF  
- Train ML/DL models for classification  
- Evaluate model performance using standard metrics  

---

##  3. Dataset

### 🔹 Source:
Amazon Reviews Dataset  

### 🔹 Features:
- Review Text  
- Rating (1–5 stars)  
- Reviewer ID  
- Product ID  
- Verified Purchase  

### 🔹 Labels:
- **Real Review (0)**  
- **Fake Review (1)**  

---

##  4. System Architecture / Pipeline

Dataset → Preprocessing → Feature Extraction → Model Training → Testing → Prediction


---

##  5. Preprocessing (Text Cleaning)

Raw reviews are cleaned before processing.

### Steps:
- Convert text to lowercase  
- Remove stopwords (is, the, and, etc.)  
- Tokenization (split into words)  
- Stemming / Lemmatization  
- Remove punctuation and special characters  

### Example:
Input:

"This product is AMAZING!!!"


Output:

product amazing


---

##  6. Feature Extraction (TF-IDF)

TF-IDF converts text into numerical vectors.

### Concept:
- TF (Term Frequency): frequency of word in document  
- IDF (Inverse Document Frequency): importance of word  

### Formula:
TF-IDF = TF × IDF  

### Benefit:
- Important words get higher weight  
- Common words get lower weight  

---

##  7. Model Training

### Machine Learning Models:
- Logistic Regression  
- Naive Bayes  
- Support Vector Machine (SVM)  

### Deep Learning Models:
- Recurrent Neural Network (RNN)  
- Long Short-Term Memory (LSTM)  
- BERT (Bidirectional Encoder Representations from Transformers)  

### Training Process:
- Input: TF-IDF vectors  
- Output: Real / Fake label  
- Data split: 80% training, 20% testing  

---

##  8. Model Evaluation

### Metrics Used:
- Accuracy  
- Precision  
- Recall  
- F1-score  

### Example:
| Actual | Predicted | Result |
|--------|----------|---------|
| Fake   | Fake     |  Correct|
| Real   | Fake     |   Wrong |

---

## 9. Prediction Phase

The trained model predicts whether a new review is real or fake.

### Example:

Input:

"Excellent product!!! Must buy!!!"


Output:

Fake (Probability: 0.87)


Input:

"I used this product for 2 weeks, battery is good but camera is average"


Output:

Real (Probability: 0.91)


---

##  10. Detection Logic

### Fake Reviews:
- Repetitive words  
- Too many exclamation marks  
- Generic statements  
- No real experience  

### Real Reviews:
- Detailed explanation  
- Balanced opinion (pros + cons)  
- Natural writing style  

---

##  11. Technologies Used

- Python  
- Scikit-learn  
- TensorFlow / Keras  
- NLTK / SpaCy  
- Pandas  
- NumPy  

---

##  12. Project Structure


---

##  13. How to Run the Project

### Step 1: Clone Repository

git clone https://github.com/your-username/fake-review-detection.git


### Step 2: Install Dependencies

pip install -r requirements.txt


### Step 3: Run the Project

python main.py


---

##  14. Future Improvements

- Use BERT for higher accuracy  
- Add web interface (React / MERN stack)  
- Real-time review detection  
- Deploy on cloud (AWS / Heroku)  
- Add multilingual support  

---

##  15. Limitations

- Dataset quality affects accuracy  
- Fake reviews becoming more realistic  
- Model may misclassify borderline cases  

---

##  16. Conclusion

This project demonstrates how NLP and machine learning can be used to detect fake reviews effectively. It improves reliability in e-commerce platforms by filtering out spam reviews and helping users make better decisions.

---

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
app		app
data		data
outputs		outputs
src		src
venv		venv
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages