Newsgroup-Classification

The data and its description are available in the UCI data repository. The data has been made available in a format that can be used directly for classification. There are 2003 news articles belonging to 4 different categories. A Naive Bayes algorithm is implemented to classify each article into one of the news categories. Laplace smoothing is applied so that words unseen in a category do not produce zero probabilities. To avoid numerical underflow, the log-likelihood is used instead of the product of raw probabilities.
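The classifier described above can be sketched in a few lines. This is a minimal illustration of multinomial Naive Bayes with add-one (Laplace) smoothing and log-space scoring, not the code in NB.m or predict.m. The names `NaiveBayesModel`, `trainNaiveBayes` and `predictClass` are hypothetical, and the sketch assumes a non-empty training set in which every article is already a vector of word counts.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical model container; names are illustrative only.
struct NaiveBayesModel {
    std::vector<double> logPrior;                    // log P(class)
    std::vector<std::vector<double>> logLikelihood;  // log P(word | class)
};

// counts[i][j] = occurrences of dictionary word j in article i.
// labels[i]    = class index of article i, in [0, numClasses).
// Assumes counts is non-empty and all rows have the same length.
NaiveBayesModel trainNaiveBayes(const std::vector<std::vector<int>>& counts,
                                const std::vector<int>& labels,
                                int numClasses) {
    const std::size_t vocab = counts[0].size();
    std::vector<double> docsPerClass(numClasses, 0.0);
    std::vector<double> wordsPerClass(numClasses, 0.0);
    std::vector<std::vector<double>> wordCount(
        numClasses, std::vector<double>(vocab, 0.0));

    // Accumulate per-class document and word counts.
    for (std::size_t i = 0; i < counts.size(); ++i) {
        const int c = labels[i];
        docsPerClass[c] += 1.0;
        for (std::size_t j = 0; j < vocab; ++j) {
            wordCount[c][j] += counts[i][j];
            wordsPerClass[c] += counts[i][j];
        }
    }

    NaiveBayesModel model;
    model.logPrior.resize(numClasses);
    model.logLikelihood.assign(numClasses, std::vector<double>(vocab, 0.0));
    for (int c = 0; c < numClasses; ++c) {
        model.logPrior[c] =
            std::log(docsPerClass[c] / static_cast<double>(counts.size()));
        for (std::size_t j = 0; j < vocab; ++j) {
            // Laplace smoothing: add 1 to every word count so that no
            // probability is exactly zero.
            model.logLikelihood[c][j] = std::log(
                (wordCount[c][j] + 1.0) /
                (wordsPerClass[c] + static_cast<double>(vocab)));
        }
    }
    return model;
}

// Scores each class by summing logs instead of multiplying probabilities,
// which avoids underflow for long articles. Returns the best class index.
int predictClass(const NaiveBayesModel& model, const std::vector<int>& x) {
    int best = 0;
    double bestScore = -std::numeric_limits<double>::infinity();
    for (std::size_t c = 0; c < model.logPrior.size(); ++c) {
        double score = model.logPrior[c];
        for (std::size_t j = 0; j < x.size(); ++j) {
            score += x[j] * model.logLikelihood[c][j];
        }
        if (score > bestScore) {
            bestScore = score;
            best = static_cast<int>(c);
        }
    }
    return best;
}
```

With 4 categories, `numClasses` would be 4 and each row of `counts` would have one entry per dictionary word.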

word.cpp prepares a dictionary of all the distinct words present in the data.

pre.cpp preprocesses the data and builds the feature vector for each article along with the label vector.
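The feature-vector step can be illustrated with a bag-of-words count. The sketch below is not the contents of pre.cpp: `toFeatureVector` is a hypothetical name, the tokenization rule is assumed, and the dictionary is assumed to map each word to a unique index in the range 0 to `dict.size() - 1`.

```cpp
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Converts one article into a vector of word counts, one entry per
// dictionary word. Words missing from the dictionary are ignored.
// Assumption: dict values are unique indices in [0, dict.size()).
std::vector<int> toFeatureVector(const std::string& text,
                                 const std::map<std::string, int>& dict) {
    std::vector<int> features(dict.size(), 0);
    std::string word;

    // Counts the word collected so far (if it is known) and resets it.
    auto flush = [&]() {
        if (word.empty()) return;
        const auto it = dict.find(word);
        if (it != dict.end()) {
            ++features[it->second];
        }
        word.clear();
    };

    for (char ch : text) {
        const unsigned char u = static_cast<unsigned char>(ch);
        if (std::isalpha(u)) {
            word.push_back(static_cast<char>(std::tolower(u)));
        } else {
            flush();
        }
    }
    flush();
    return features;
}
```

Applying this to every article yields the rows of the feature matrix, and collecting each article's category index in the same order yields the label vector.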

The repository contains the following files: NB.m, README.md, datanew.txt, getWordList.m, pre.cpp, predict.m, start.m, testtry.txt, word.cpp and words.txt.
