srijanee/Newsgroup-Classification

Newsgroup-Classification

The data and its description are available in the UCL data repository, in a format that can be used directly for classification. There are 2003 news articles belonging to 4 different categories. A naive Bayes algorithm is implemented to classify each article into one of the news categories. Laplace smoothing is applied so that a word never seen in a category does not produce a zero probability. Log-likelihoods are used instead of raw probability products to avoid numerical underflow.
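To make the smoothing and log-likelihood steps concrete, here is a minimal sketch of a multinomial naive Bayes classifier. It assumes each article is already a vector of word counts; the names `NaiveBayesModel`, `train`, and `predict` are hypothetical and not taken from the repository, which may organise the computation differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical model container (illustrative names, not from the repository).
struct NaiveBayesModel {
    std::vector<double> logPrior;                    // log P(class)
    std::vector<std::vector<double>> logLikelihood;  // log P(word | class)
};

// counts[i][w] is how often word w occurs in article i; labels[i] is its class.
NaiveBayesModel train(const std::vector<std::vector<int>>& counts,
                      const std::vector<int>& labels,
                      std::size_t numClasses, std::size_t vocabSize) {
    assert(!counts.empty() && counts.size() == labels.size());
    std::vector<double> docsInClass(numClasses, 0.0);
    std::vector<double> wordsInClass(numClasses, 0.0);
    std::vector<std::vector<double>> wordCount(
        numClasses, std::vector<double>(vocabSize, 0.0));

    for (std::size_t i = 0; i < counts.size(); ++i) {
        const std::size_t c = static_cast<std::size_t>(labels[i]);
        assert(c < numClasses && counts[i].size() == vocabSize);
        docsInClass[c] += 1.0;
        for (std::size_t w = 0; w < vocabSize; ++w) {
            wordCount[c][w] += counts[i][w];
            wordsInClass[c] += counts[i][w];
        }
    }

    NaiveBayesModel model;
    model.logPrior.resize(numClasses);
    model.logLikelihood.assign(numClasses, std::vector<double>(vocabSize, 0.0));
    for (std::size_t c = 0; c < numClasses; ++c) {
        // Unsmoothed prior: a class with no training articles gets -infinity.
        model.logPrior[c] = std::log(docsInClass[c] / counts.size());
        for (std::size_t w = 0; w < vocabSize; ++w) {
            // Laplace smoothing: add 1 to every count so no probability is zero.
            model.logLikelihood[c][w] =
                std::log((wordCount[c][w] + 1.0) / (wordsInClass[c] + vocabSize));
        }
    }
    return model;
}

// Return the class with the highest log-posterior score for word counts x.
int predict(const NaiveBayesModel& model, const std::vector<int>& x) {
    int best = 0;
    double bestScore = -std::numeric_limits<double>::infinity();
    for (std::size_t c = 0; c < model.logPrior.size(); ++c) {
        assert(x.size() == model.logLikelihood[c].size());
        // Summing logs replaces multiplying many small probabilities.
        double score = model.logPrior[c];
        for (std::size_t w = 0; w < x.size(); ++w) {
            score += x[w] * model.logLikelihood[c][w];
        }
        if (score > bestScore) {
            bestScore = score;
            best = static_cast<int>(c);
        }
    }
    return best;
}
```

With 4 categories, `train` would be called with `numClasses` set to 4 and `vocabSize` set to the dictionary size, and `predict` would then be applied to each article's count vector.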

word.cpp builds a dictionary of all the words present in the data.
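A dictionary of this kind can be sketched as a map from each distinct word to an index. The code below is an illustration only: the tokenization rule (lowercase runs of letters) and the names `tokenize`, `buildDictionary`, and `indexOf` are assumptions, not the contents of word.cpp.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Split text into lowercase alphabetic tokens (an assumed tokenization rule).
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalpha(ch)) {
            current += static_cast<char>(std::tolower(ch));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Give every distinct word an index in order of first appearance.
std::map<std::string, std::size_t> buildDictionary(
        const std::vector<std::string>& articles) {
    std::map<std::string, std::size_t> dictionary;
    for (const std::string& article : articles) {
        for (const std::string& word : tokenize(article)) {
            // emplace leaves an existing entry untouched, so a repeated word
            // keeps the index it received the first time it was seen.
            dictionary.emplace(word, dictionary.size());
        }
    }
    return dictionary;
}

// Look up the index of a word that is known to be in the dictionary.
std::size_t indexOf(const std::map<std::string, std::size_t>& dictionary,
                    const std::string& word) {
    auto it = dictionary.find(word);
    assert(it != dictionary.end() && "word must be in the dictionary");
    return it->second;
}
```

The dictionary size then fixes the length of every feature vector, which is why it has to be built before any article is converted.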

pre.cpp preprocesses the data and builds the feature vectors and the label vector.
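One plausible shape for this step, shown as a sketch rather than as the actual contents of pre.cpp: each article becomes a vector of word counts indexed by the dictionary, and each category name becomes an integer label. The function names, the choice of counts over binary presence, and the category names used in the usage note are all hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Count how often each dictionary word occurs among an article's words.
std::vector<int> toFeatureVector(
        const std::vector<std::string>& words,
        const std::map<std::string, std::size_t>& dictionary) {
    std::vector<int> features(dictionary.size(), 0);
    for (const std::string& word : words) {
        auto it = dictionary.find(word);
        if (it == dictionary.end()) continue;  // skip words outside the dictionary
        assert(it->second < features.size());
        ++features[it->second];
    }
    return features;
}

// Convert category names to integer labels given the list of known categories.
std::vector<int> toLabelVector(const std::vector<std::string>& categories,
                               const std::vector<std::string>& categoryNames) {
    std::vector<int> labels;
    labels.reserve(categories.size());
    for (const std::string& category : categories) {
        auto it = std::find(categoryNames.begin(), categoryNames.end(), category);
        assert(it != categoryNames.end() && "unknown category");
        labels.push_back(static_cast<int>(it - categoryNames.begin()));
    }
    return labels;
}
```

For example, with a dictionary `{"ball": 0, "game": 1, "vote": 2}`, the words `game ball game` map to the feature vector `[1, 2, 0]`.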