Sentiment Analysis on User Comments
When we browse social media, the first thing that draws our attention, at least in our areas of interest, is the user comments. Why do you think so many big companies and organizations keep an eye on people's sentiments and emotions? That's a pretty good question, right? In this blog we will take a brief look at sentiment analysis, using users' comments on different social media platforms as an example to determine what a particular user is interested in. Don't worry about the coding part: this will be a step-by-step guide to help you reach the goal. We will be using Python to perform sentiment analysis on users' reactions to posts and videos on different social media networks, specifically Facebook, Twitter, Instagram, and YouTube.
By the end of the study we will be able to conceptualize and measure the sentiment and polarity of comments, and to understand user preferences by training our data with different data mining models. Specifically, we will apply logistic regression, Naive Bayes, text mining, and sentiment analysis techniques to the dataset. So let's get started. Before starting, you should have some prior knowledge of Python to follow along easily.
Sentiment Analysis
Sentiment analysis is a natural language processing (NLP) and information extraction technique that helps us analyze the different emotions a user has while exploring a specific post or video on social media. These emotions can be categorized as positive, negative, or neutral. Positive means the user likes the post; negative means the user is probably not interested in the post or dislikes it; neutral means the user neither likes nor dislikes it.
Let's take a small example. Say a user comments:
"This picture reminds me of my good old days." Here the user's sentiment is positive.
On the other hand, "This picture is ridiculous" sounds negative, doesn't it? This is how we determine users' emotions.
The example above illustrates two major concepts that we need to know: polarity and magnitude. Polarity indicates whether the emotion is positive, negative, or neutral; magnitude shows how strongly the emotion is expressed.
The main goal is to extract user opinion and behavior from comments and user history. I tried to construct the sentiment analysis algorithm using logistic regression and Naive Bayes to support this. It also includes polarity classification, an emerging technique for understanding users' comment patterns as positive, negative, or neutral. In this blog we combine text mining and sentiment analysis for social media comments. The results are satisfactory, and these are powerful descriptive and predictive tools that can be successfully applied to extract users' sentiments from their comments across different social media platforms.
To make you more comfortable, here is a quick guide to the steps we are going to perform later in this blog:
1. Data Access: Using a user comment dataset and a keyword search to access social media comments and tweets, and fetching YouTube comments through the Google API.
2. Data Cleaning: Structuring the dataset, then cleaning the data by removing stop words (non-functional words), extra spaces, punctuation, and URLs, and performing stemming (reducing words to their roots). This step produces a structured representation of the comment dataset.
3. Data Analysis: The structured representation produced in the previous step enables mining tasks such as sentiment analysis, which uses a set of positive and negative words. A scoring function is used to find the polarity of comments and tweets.
4. Visualization: The wordcloud package and bar and line plots are used to show the frequency of words in the customer comments and tweets.
5. Evaluation: Applying logistic regression and Naive Bayes to the training and testing datasets to evaluate the results, recognize patterns, and draw conclusions.
Let's begin the process!
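To make step 3 a little more concrete, here is a minimal sketch of what a word-list scoring function could look like. The two word sets and the function name `score_comment` are made up for illustration; a real lexicon would be much larger.

```python
import re

# Tiny illustrative word lists (assumption: a real lexicon would be far larger).
POSITIVE_WORDS = {"good", "great", "love", "nice", "amazing"}
NEGATIVE_WORDS = {"bad", "ridiculous", "hate", "boring", "awful"}


def score_comment(comment):
    """Return (number of positive words) minus (number of negative words)."""
    words = re.findall(r"[a-z']+", comment.lower())
    positives = sum(word in POSITIVE_WORDS for word in words)
    negatives = sum(word in NEGATIVE_WORDS for word in words)
    return positives - negatives


print(score_comment("This picture reminds me of my good old days"))
print(score_comment("This picture is ridiculous"))
```

A score above zero suggests a positive comment, below zero a negative one, and zero a neutral one.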
Fetch YouTube Comments
The first and most basic step is to fetch the data. Here we will fetch the comments of a YouTube video using the link given below. We request one page of comments at a time, with at most 100 comments per page (maxResults=100). For the first page, we have scraped the comment, replies, likes, viewer rating, and comment ID.
YOUTUBE_IN_LINK =
https://www.googleapis.com/youtube/v3/commentThreads?part=snippet&maxResults=100&order=relevance&pageToken={pageToken}&videoId={videoId}&key={key}
Now, using the page token, we repeat the loop to fetch comments until we have the desired number of comments. Remember, this number is totally your choice!
Here we get the comment dataset, and we store this data in a CSV file.
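The page-token loop could be sketched roughly like this. It is only an illustration, not the exact code used for this blog: the function names, the `wanted` parameter, and the output field names are my own choices, and the `fetch` argument is there so the loop can be tried without a real API key.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.googleapis.com/youtube/v3/commentThreads"


def get_json(url):
    """Download a URL and decode its JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def fetch_comments(video_id, key, wanted=300, fetch=get_json):
    """Collect up to `wanted` top-level comments by following nextPageToken."""
    comments, page_token = [], ""
    while len(comments) < wanted:
        params = {"part": "snippet", "maxResults": 100, "order": "relevance",
                  "videoId": video_id, "key": key}
        if page_token:
            params["pageToken"] = page_token
        page = fetch(API_URL + "?" + urllib.parse.urlencode(params))
        for item in page.get("items", []):
            top = item["snippet"]["topLevelComment"]
            comments.append({
                "comment_id": top["id"],
                "text": top["snippet"]["textDisplay"],
                "likes": top["snippet"].get("likeCount", 0),
                "replies": item["snippet"].get("totalReplyCount", 0),
            })
        page_token = page.get("nextPageToken", "")
        if not page_token:  # no more pages to read
            break
    return comments[:wanted]
```

With a real video ID and API key, a call such as `fetch_comments(video_id, key, wanted=500)` would keep requesting pages until 500 comments are collected or the video runs out of comments.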
#Read Comment CSV File
Now we read the CSV file in which we stored the comments.
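Storing and reading the comments might look something like the sketch below, which uses only Python's built-in csv module. The helper names and the file name `comments.csv` are assumptions for illustration, and `save_comments` assumes the list of rows is not empty.

```python
import csv


def save_comments(rows, path):
    """Write a non-empty list of comment dictionaries to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)


def load_comments(path):
    """Read the CSV file back as a list of dictionaries (all values are strings)."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


if __name__ == "__main__":
    save_comments([{"comment_id": "a1", "text": "Nice video!"}], "comments.csv")
    print(load_comments("comments.csv"))
```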
#Removing Emojis
Emojis play a vital role in the digital world by helping us understand actual emotions, but encoding them is of no use here. Don't worry: we will remove them for more accurate results, since our code doesn't read emojis. People are fond of using emojis, but here we remove them to keep the focus on text mining.
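One possible way to strip emojis is a regular expression over the common emoji Unicode ranges. The ranges below are an assumption on my part and cover the usual cases rather than every emoji ever defined.

```python
import re

EMOJI_PATTERN = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport and newer emoji
    "\U00002600-\U000027BF"  # miscellaneous symbols and dingbats
    "\U0001F1E6-\U0001F1FF"  # regional indicator letters used for flags
    "\U0000FE0F\U0000200D"   # variation selector and zero-width joiner
    "]+"
)


def remove_emojis(text):
    """Replace emojis with a space, then collapse repeated whitespace."""
    return " ".join(EMOJI_PATTERN.sub(" ", text).split())
```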
#Remove Punctuation
This helps to get rid of unhelpful parts of the data, or noise, by removing punctuation marks, stop words, and typos.
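As a rough sketch, punctuation and stop words can be removed with the standard library alone. The stop-word set here is a tiny hand-picked sample for illustration; in practice you would use a fuller list.

```python
import string

# Small illustrative stop-word list (assumption: real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "this", "me", "my"}


def remove_punctuation(text):
    """Delete every ASCII punctuation character from the text."""
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_stop_words(text):
    """Keep only the words that are not in the stop-word list."""
    return " ".join(word for word in text.split()
                    if word.lower() not in STOP_WORDS)
```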
#Remove Other Special Characters from the Comments
Text preprocessing is traditionally an important step for natural language processing (NLP) tasks. It transforms text into a more digestible form so that algorithms can perform better without special characters.
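A simple cleaning helper for this step might look like the following. The name `clean_text` and the choice to keep only letters, digits, and spaces are assumptions; adjust the patterns to whatever your comments contain.

```python
import re


def clean_text(text):
    """Drop URLs and special characters, then normalize spaces and case."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)         # remove special characters
    return re.sub(r"\s+", " ", text).strip().lower()    # tidy whitespace, lowercase
```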
#Tokenization and Visualization of Dataset Sentiment, Word Frequency, Bigram Frequency, and Trigram Frequency
Here we calculate and visualize relationships between words in the text dataset. We examine pairs of two consecutive words, often called “bigrams”; trigrams are consecutive sequences of three words.
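Counting bigrams and trigrams can be sketched with a plain whitespace tokenizer and `collections.Counter`. The function names are illustrative, and the resulting counts are what you would then feed into a bar plot or word cloud.

```python
from collections import Counter


def tokenize(text):
    """Split a cleaned comment into lowercase word tokens."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_frequencies(comments, n):
    """Count n-grams across all comments (n=1 words, n=2 bigrams, n=3 trigrams)."""
    counts = Counter()
    for comment in comments:
        counts.update(ngrams(tokenize(comment), n))
    return counts
```

For example, `ngram_frequencies(comments, 2).most_common(10)` would give the ten most frequent bigrams.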
#Polarity Classification and Visualization
We use TextBlob to calculate sentiment polarity, which lies in the range [-1, 1], where 1 means positive sentiment and -1 means negative sentiment.
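A minimal sketch of this step is shown below. TextBlob is a third-party package (installed with `pip install textblob`), and the helper names and the `neutral_band` parameter are my own assumptions for turning a score into a label.

```python
def textblob_polarity(text):
    """Return TextBlob's polarity score for the text, a float in [-1, 1]."""
    from textblob import TextBlob  # third-party: pip install textblob
    return TextBlob(text).sentiment.polarity


def label_polarity(score, neutral_band=0.0):
    """Map a polarity score to 'positive', 'negative', or 'neutral'."""
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"
```

Widening `neutral_band` (for example to 0.1) treats weakly positive or weakly negative comments as neutral.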
#Comment Tokenization for Implementing Different Algorithms and Extracting New Vocabulary
#Sentiment Classification
Use an LR (logistic regression) algorithm for classification. Logistic regression helps in predicting the probability of outcomes, i.e. binary or multi-class classification. We will study that in my next blog. Try to perform these steps yourself; you can even add your own filters if you want to add something.
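Until then, here is a small from-scratch sketch of the idea: build a vocabulary, turn each comment into word counts, and fit a binary logistic regression with gradient descent. This is a teaching example under my own assumptions (tiny made-up data, hypothetical function names, fixed learning rate), not the tuned model you would use on a real dataset.

```python
import math


def build_vocabulary(comments):
    """Assign an index to every distinct word in the training comments."""
    vocab = {}
    for comment in comments:
        for word in comment.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def vectorize(comment, vocab):
    """Bag-of-words counts; words outside the vocabulary are ignored."""
    vector = [0.0] * len(vocab)
    for word in comment.lower().split():
        if word in vocab:
            vector[vocab[word]] += 1.0
    return vector


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(X, y, lr=0.5, epochs=200):
    """Stochastic gradient descent on the log loss; labels are 1 or 0."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for features, label in zip(X, y):
            z = bias + sum(w * x for w, x in zip(weights, features))
            error = sigmoid(z) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def predict_proba(comment, vocab, weights, bias):
    """Probability that the comment is positive."""
    features = vectorize(comment, vocab)
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


# Made-up training data: 1 = positive, 0 = negative.
train_comments = ["love this video", "great song", "hate this video", "boring song"]
train_labels = [1, 1, 0, 0]
vocab = build_vocabulary(train_comments)
X = [vectorize(comment, vocab) for comment in train_comments]
weights, bias = train_logistic_regression(X, train_labels)
```

After training, `predict_proba("love great", vocab, weights, bias)` gives the model's estimated probability that the comment is positive.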