Best Buy is a large multinational consumer electronics corporation. It sells a wide range of consumer electronics and related merchandise, such as software, computers, and televisions. Best Buy has stores across the country as well as an online store. Given its large range of products, Best Buy encourages customers to write reviews. The review format is simple and easy to understand: the customer rates the item from 1 to 5 stars, 5 being the highest rating, and then discusses what they liked or disliked about the product.
For this sentiment analysis I studied Best Buy's television reviews, using text mining and a Naive Bayes classifier to determine whether a review was positive or negative.
Everything was done with Python.
Data Processing
Collecting the data required some work. Even though Best Buy has APIs, I had to use three different ones to get all the television review data: Categories, Products, and Reviews. The process cascaded from one API to the next: I used the Categories API to obtain the television category ID, then the Products API with that category ID to obtain all the television product IDs, and finally the Reviews API with those product IDs to obtain the reviews for each television. When calling the Reviews API, I had to loop through each page of a product's results and extract the reviews.
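The cascade above can be sketched roughly as follows. The endpoint paths, query syntax, `API_KEY` placeholder, and the `fetch_all_reviews` helper are illustrative assumptions modeled on Best Buy's public API conventions, not the exact code I ran:

```python
import json
from urllib.request import urlopen

API_KEY = "YOUR_API_KEY"               # placeholder: a Best Buy developer key goes here
BASE = "https://api.bestbuy.com/v1"

def categories_url(name):
    """Search the Categories API for a category by name."""
    return f"{BASE}/categories(name={name})?format=json&apiKey={API_KEY}"

def products_url(category_id, page=1):
    """List product SKUs under a category ID, one page at a time."""
    return (f"{BASE}/products(categoryPath.id={category_id})"
            f"?format=json&show=sku&pageSize=100&page={page}&apiKey={API_KEY}")

def reviews_url(sku, page=1):
    """Build the URL for one page of reviews for a single product SKU."""
    return (f"{BASE}/reviews(sku={sku})"
            f"?format=json&pageSize=100&page={page}&apiKey={API_KEY}")

def fetch_all_reviews(sku):
    """Loop through every page of a product's reviews, as described above."""
    reviews, page = [], 1
    while True:
        with urlopen(reviews_url(sku, page)) as resp:
            data = json.load(resp)
        reviews.extend(data["reviews"])
        if page >= data["totalPages"]:
            break
        page += 1
    return reviews
```

Calling `fetch_all_reviews` once per television SKU collects the full review set.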
The problem at hand is unsupervised, but I was able to transform it into a supervised problem by assigning a class to certain reviews. Reviews rated 1 or 2 stars were labeled 'negative' and reviews rated 4 or 5 stars were labeled 'positive'. Reviews rated 3 stars were left unlabeled because the reviewer is usually on the fence about whether the product is positive or negative.
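The labeling rule can be expressed as a small helper (the function name is illustrative):

```python
def label_review(rating):
    """Map a 1-5 star rating to a sentiment class.

    1-2 stars -> 'negative', 4-5 stars -> 'positive',
    3 stars   -> None (left unlabeled, since the reviewer is on the fence).
    """
    if rating <= 2:
        return "negative"
    if rating >= 4:
        return "positive"
    return None
```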
Before I trained a model I looked at the distribution of ratings.
```python
reviews_df.groupby('rating').size()
#rating
#1     1066
#2      842
#3     2824
#4    21199
#5    50792

reviews_df.groupby('class').size()
#class
#           2824
#negative   1908
#positive  71991
```
As you can see, there are very few negative reviews. I decided to undersample the positive reviews to match the number of negative reviews. This gives a better model: if I trained on data that is 97% positive, the model could predict every review to be positive and still achieve a low error rate.
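A minimal undersampling sketch with pandas, assuming the labeled reviews sit in a DataFrame with a 'class' column (the `undersample` helper and the fixed `seed` are illustrative):

```python
import pandas as pd

def undersample(df, label_col="class", seed=42):
    """Randomly downsample every class to the size of the smallest class."""
    counts = df[label_col].value_counts()
    n = counts.min()
    parts = [df[df[label_col] == label].sample(n, random_state=seed)
             for label in counts.index]
    return pd.concat(parts).sample(frac=1, random_state=seed)  # shuffle rows
```

After this step the positive and negative classes are the same size, so a trivially "always positive" model no longer looks accurate.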
I did some data cleaning by removing extra spaces and punctuation between words, removing stop words, and splitting the reviews into n-grams with ngram_range=(1, 3), i.e., unigrams, bigrams, and trigrams.
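A pure-Python sketch of the cleaning and n-gram steps; the `STOP_WORDS` set here is a tiny illustrative subset (in practice a full stop-word list would be used):

```python
import re

# tiny illustrative stop-word set; a real pipeline would use a full list
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "to", "of"}

def clean(text):
    """Lowercase, strip punctuation, collapse whitespace, drop stop words."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()
    return [t for t in tokens if t not in STOP_WORDS]

def ngrams(tokens, ngram_range=(1, 3)):
    """All n-grams from lo to hi, joined with spaces."""
    lo, hi = ngram_range
    out = []
    for n in range(lo, hi + 1):
        out += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return out
```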
Predicting
I trained a Naive Bayes model with 60% training data and 40% testing data. First, I trained the model with the review body.
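The training setup can be sketched with scikit-learn; the `train_sentiment_model` helper is an illustrative assumption, not my exact code:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def train_sentiment_model(texts, labels, seed=42):
    """Fit a bag-of-n-grams Naive Bayes model on a 60/40 train/test split."""
    X_train, X_test, y_train, y_test = train_test_split(
        texts, labels, test_size=0.4, random_state=seed)
    model = make_pipeline(
        CountVectorizer(ngram_range=(1, 3), stop_words="english"),
        MultinomialNB())
    model.fit(X_train, y_train)
    return model, model.score(X_test, y_test)
```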
```
             precision  recall  f1-score  support
negative          0.96    0.19      0.32      761
positive          0.55    0.99      0.71      766
avg / total       0.76    0.59      0.52     1527
```
The accuracy score is 59.40%
I was getting a low accuracy score of 59%, so I decided to train the model on the review title instead.
```
             precision  recall  f1-score  support
negative          0.89    0.79      0.84      761
positive          0.82    0.91      0.86      766
avg / total       0.86    0.85      0.85     1527
```
The accuracy score is 85.07%
The accuracy score greatly improved compared to the review body. One of the main reasons is the length of the review body: customers discussed both the pros and cons of a product, and the model could not separate the mixed sentiments within a single review unless it was heavily one-sided. Titles, by contrast, use fewer and stronger words, so the model can more easily tell whether a review is negative or positive.
Finally, I used the model, with its 85% accuracy, to predict whether reviews rated 3 stars were positive or negative.
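This final step looks roughly like the following on toy data; the titles shown are invented examples, not actual reviews:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# invented 1-2 star and 4-5 star titles standing in for the labeled training set
titles = ["terrible tv", "awful picture", "broken on arrival",
          "great tv", "amazing picture", "excellent value"]
labels = ["negative"] * 3 + ["positive"] * 3

model = make_pipeline(CountVectorizer(ngram_range=(1, 3)), MultinomialNB())
model.fit(titles, labels)

# classify previously unlabeled 3-star titles
neutral_titles = ["decent tv for the price", "ok but the picture could be better"]
predictions = model.predict(neutral_titles)
```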
Code
https://github.com/moyphilip/BestBuy