Comparison Research on Text Pre-processing Methods on Twitter Sentiment Analysis

Published: Jan 1, 2017

Abstract

Twitter sentiment analysis offers organizations the ability to monitor, in real time, public feeling towards the products and events related to them. The first step of sentiment analysis is the text pre-processing of Twitter data. Most existing research on Twitter sentiment analysis focuses on the extraction of new sentiment features, while the choice of pre-processing method has been ignored. This paper discusses the effects of text pre-processing methods on sentiment classification performance in two types of classification tasks, and summarizes the classification performance of six pre-processing methods using two feature models and four classifiers on five Twitter datasets. The experiments show that the accuracy and F1-measure of Twitter sentiment classifiers improve when acronyms are expanded and negation is replaced, but barely change when URLs, numbers, or stop words are removed. The Naive Bayes and Random Forest classifiers are more sensitive to the choice of pre-processing method than the Logistic Regression and support vector machine classifiers.
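
As a rough illustration of the pre-processing methods named in the abstract (removing URLs, removing numbers, expanding acronyms, replacing negation, removing stop words), the following minimal Python sketch chains them over a single tweet. It is not the paper's implementation: the lookup tables `ACRONYMS`, `NEGATIONS`, and `STOP_WORDS`, the regular expressions, and the function name `preprocess` are all hypothetical and chosen only for illustration.

```python
import re
from typing import List

# Hypothetical, deliberately tiny lookup tables (assumptions for this sketch;
# real experiments would use a full acronym dictionary and stop-word list).
ACRONYMS = {"omg": "oh my god", "lol": "laughing out loud", "gr8": "great"}
NEGATIONS = {"can't": "can not", "won't": "will not", "don't": "do not"}
# "not" is intentionally kept out of the stop words so that the
# replaced negation survives stop-word removal.
STOP_WORDS = {"a", "an", "the", "is", "to", "of"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NUMBER_RE = re.compile(r"\b\d+(?:\.\d+)?\b")  # standalone numbers only


def preprocess(tweet: str) -> List[str]:
    """Apply the five pre-processing steps and return a list of tokens."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)      # remove URLs
    text = NUMBER_RE.sub(" ", text)   # remove numbers
    tokens: List[str] = []
    for raw in text.split():
        tok = raw.strip(".,!?;:\"'")  # trim surrounding punctuation
        if not tok:
            continue
        tok = NEGATIONS.get(tok, tok)  # replace negation, e.g. can't -> can not
        tok = ACRONYMS.get(tok, tok)   # expand acronyms, e.g. gr8 -> great
        tokens.extend(tok.split())
    return [t for t in tokens if t not in STOP_WORDS]  # remove stop words


if __name__ == "__main__":
    print(preprocess("OMG I can't wait for the show http://t.co/abc 2 gr8!"))
```

Each step here is a separate line, so any one of them can be switched off on its own to compare classifier performance with and without it, which is the kind of comparison the abstract describes.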
