We already extracted the ratings and saved them on our system in Text Analytics Part I. Now I will classify customers of the MOTO-G into three categories:
- who are highly Impressed with this phone (rating given = 4 or 5)
- who are Satisfied with this phone (rating given = 3)
- who are not at all satisfied (rating given = 1 or 2)
Note: on Flipkart the maximum rating is 5.
The count of customers in each category is as follows:
More than 50% of customers are Impressed with this phone, which is great news for the company. However, the company needs to concentrate on the 36% of customers who are dissatisfied. Studying their feedback and views will show why they are unhappy with this phone, and the company can work on that feedback to raise the satisfaction level among customers.
The company can also look into the “Satisfied” category and ask how it can increase these customers' satisfaction so that they change their perception and rate the phone as “Impressed” customers do.
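Percentage shares like the ones quoted above can be read off a contingency table with prop.table(). Here is a minimal sketch of that step. The ratings vector is made up for illustration and is not the Flipkart data, and the names example_ratings, category_counts and category_share are my own, not from the original script.

```r
# Hypothetical ratings, for illustration only (not the scraped Flipkart data)
example_ratings <- c(5, 4, 1, 3, 5, 2, 4, 1, 5, 3)

# Same three-way rule as in the post: 1-2, 3, 4-5
example_satisfaction <- ifelse(example_ratings <= 2, "Dissatisfied",
                        ifelse(example_ratings == 3, "Satisfied", "Impressed"))

# Contingency table of counts per category
category_counts <- table(example_satisfaction)

# Convert counts to each category's share of all customers, in percent
category_share <- round(100 * prop.table(category_counts), 1)

print(category_counts)
print(category_share)
```

With real data, the same two calls on the satisfaction vector would give the counts and the shares in one go.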
This was a very simple classification using a contingency table. I will do some classification using a Support Vector Machine in the next post. There I have tried to classify “terms” on the basis of satisfaction, which has two labels (two categories):
- Satisfied (rating given = 3, 4 or 5)
- Dissatisfied (rating given = 1 or 2)
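The two-label rule can be sketched in the same style as the three-category one. This is only an illustration under my own assumptions: the ratings are made up, and binary_label is a hypothetical name, not something from the full script. The labels are stored as a factor because classifiers in R generally expect a factor as the response.

```r
# Hypothetical ratings, for illustration only
example_ratings <- c(5, 3, 1, 4, 2)

# Two-label rule: ratings 1-2 are Dissatisfied, ratings 3-5 are Satisfied
binary_label <- factor(
  ifelse(example_ratings <= 2, "Dissatisfied", "Satisfied"),
  levels = c("Dissatisfied", "Satisfied")
)

# Count of customers under each label
label_counts <- table(binary_label)
print(label_counts)
```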
R code for the three-category classification:
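    # Analysis of ratings
    # Strip the " stars" / " star" suffix so only the number remains
    finalratings = gsub(" stars", "", ratings)
    finalratings = gsub(" star", "", finalratings)
    View(finalratings)

    # Convert the cleaned strings to numbers
    finalratings1 = as.numeric(finalratings)

    # 1-2 = Dissatisfied, 3 = Satisfied, 4-5 = Impressed!
    satisfaction = ifelse(finalratings1 <= 2, "Dissatisfied",
                   ifelse(finalratings1 == 3, "Satisfied", "Impressed!"))
    View(satisfaction)

    # Contingency table of the three categories
    # (data1 comes from the full script; it is not created in this chunk)
    count_satis = as.data.frame(table(data1$satisfaction))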
This is the chunk of code that creates the contingency table of ratings with three categories of satisfaction; find the full R code here.