krish9219/customer_complaint_resolution

Customer_Complaint_Resolution

Resolving customer complaints is an important subject for any organisation, because customer satisfaction affects both the organisation's profit and its reputation. In this project I try two methods for solving this problem, using robust machine learning models: the Random Forest model (RF) and the Extreme Gradient Boosting model (XGB).

In the first method, I use only two columns and recast the customer complaint resolution problem as an auto-tagging problem, as in the StackOverflow tag prediction project. Here, however, I use machine learning algorithms instead of deep learning.

In the second method, I take the harder approach (using a TF-IDF vectorizer, deriving more features from the text, etc.) to predict future customer resolutions based on their issues, products, and so on.

Libraries Used

import pandas as pd
import numpy as np
from textblob import TextBlob, Word
from nltk.stem import SnowballStemmer, WordNetLemmatizer
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import TfidfVectorizer
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
import xgboost
from xgboost import XGBClassifier

DATA

Column names = ['date_received', 'product', 'sub_product', 'issue', 'sub_issue', 'consumer_complaint_narrative', 'company_public_response', 'company', 'state', 'zipcode', 'tags', 'consumer_consent_provided', 'submitted_via', 'date_sent_to_company', 'company_response_to_consumer', 'timely_response', 'consumer_disputed?', 'complaint_id']

data shape = (555957, 7)

METHOD 1

The most frequent tag in the data is Mortgage.
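The auto-tagging idea can be sketched as a text-to-label classifier. This is a minimal illustration, not the project's actual code: it assumes the two columns are the complaint narrative and the product tag, and the toy sentences below are invented for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy stand-in for the two columns (assumed: narrative text and product tag).
# These sentences are invented; the real data set is far larger.
narratives = [
    "my mortgage payment was applied to the wrong escrow account",
    "the mortgage servicer refused my loan modification request",
    "foreclosure notice arrived although my mortgage is current",
    "unauthorized charge appeared on my credit card statement",
    "credit card company raised my interest rate without notice",
    "late fee added to my credit card despite paying on time",
]
products = ["Mortgage"] * 3 + ["Credit card"] * 3

# Vectorize the text and fit a linear classifier in one pipeline.
tagger = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
tagger.fit(narratives, products)

# Predict the tag for a new, unseen complaint.
predicted = tagger.predict(["my mortgage escrow payment is wrong"])
print(predicted[0])
```

Any of the classifiers imported above (for example `MultinomialNB` or `RandomForestClassifier`) could take the place of `LogisticRegression` in the pipeline.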

Text Cleaning
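A typical cleaning step might look like the sketch below. It is only an assumption about what the cleaning involves: the `clean_text` helper and the small `STOP_WORDS` set are hypothetical, and the stemming or lemmatization suggested by the NLTK imports is left out to keep the example dependency-free.

```python
import re

# Small illustrative stop-word set (hypothetical); NLTK's stopwords corpus,
# which the project imports, is much larger.
STOP_WORDS = {"a", "an", "and", "the", "is", "my", "to", "of", "on", "was", "i"}


def clean_text(text: str) -> str:
    """Lowercase the text, keep letters only, and drop stop words."""
    text = text.lower()
    # Replace punctuation, digits, and symbols with spaces.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


cleaned = clean_text("My mortgage payment of $1,200 was applied TWICE!")
print(cleaned)  # mortgage payment applied twice
```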

Logistic Regression Model

Naive Bayes Model

Support Vector Machine Model

Random Forest Model

Extreme Gradient Boosting Model

Better Accuracy

XGB performs better than the other models. Tuning the hyperparameters with RandomizedSearchCV may help the model generalise better and improve its overall performance.
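A randomized hyperparameter search could be set up roughly as follows. This sketch makes several assumptions: it runs on synthetic data from `make_classification`, it tunes a `RandomForestClassifier` rather than XGB to stay within scikit-learn, and the parameter ranges are illustrative rather than recommended values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic data standing in for the vectorized complaints (assumption).
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Illustrative search space; these values are not taken from the project.
param_distributions = {
    "n_estimators": [25, 50, 100],
    "max_depth": [None, 5, 10],
    "min_samples_leaf": [1, 2, 5],
}

# Sample 5 random parameter combinations and score each with 3-fold CV.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="accuracy",
    random_state=0,
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

Because `XGBClassifier` follows the scikit-learn estimator interface, it should be possible to pass it to `RandomizedSearchCV` in the same way, with a search space built from its own parameters.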

METHOD 2

Text Cleaning

TF-IDF Vectorizer
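One way to combine TF-IDF vectors with extra features derived from the text is sketched below. The choice of extra features (word count and character count) and the vectorizer settings are assumptions made for illustration; the project's actual features are not listed here.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented example texts standing in for cleaned complaint narratives.
texts = [
    "mortgage payment applied to wrong account",
    "credit card charged twice",
    "debt collector keeps calling about a debt that is not mine",
]

# TF-IDF over unigrams and bigrams (illustrative settings).
vectorizer = TfidfVectorizer(max_features=5000, ngram_range=(1, 2))
tfidf = vectorizer.fit_transform(texts)

# Hypothetical handcrafted features: word count and character count.
extra = np.array([[len(t.split()), len(t)] for t in texts], dtype=float)

# Stack the sparse TF-IDF matrix and the dense features side by side.
features = hstack([tfidf, csr_matrix(extra)]).tocsr()
print(features.shape)
```

The resulting sparse matrix can be passed directly to `RandomForestClassifier` or `XGBClassifier`, both of which accept sparse input.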

Final Accuracy

In both methods XGB performs slightly better than RF, but hyperparameter tuning with RandomizedSearchCV could be a game changer.
