Sentiment Analysis in Video Games with a Large Language Model

Summary

WHAT

Sentiment Classification of comments
283,983 comments for 4,999 games from 19 platforms over 20 years (1998 to 2018)

HOW

Pretrained BERT model

The model rates each comment from 1 (bad) to 5 (good). Its reported accuracy is 95% when a prediction within 1 point of the human rating counts as correct.

RESULTS

Two thirds (20 of 30) of the top-ranked games on the top 3 consoles are arcade/sports games
Top 3 consoles by average sentiment: PlayStation, Nintendo 64, and Dreamcast

TECHNOLOGIES

VS Code

Python

Pandas

PyTorch

Hugging Face

Seaborn

Matplotlib

Introduction


The public perception of your products is key to focusing marketing efforts. In particular, quantifying customers' emotional reactions to a product and comparing them with reactions to competing products can show how committed customers are to the company and reveal strengths and weaknesses to act on.

In this study, I applied sentiment analysis to the gaming industry, one of the world's largest industries today. I used a Large Language Model to score the sentiment of comments people wrote about games on Metacritic, a review site considered to influence purchasing decisions.

Methods


The dataset contains 283,983 comments for 4,999 games from 19 platforms over 20 years (1998 to 2018). Rows with a missing comment or game name were excluded.
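The exclusion step can be sketched with Pandas as follows. This is a minimal illustration, not the project's actual code: the column names "game", "platform", and "comment" and the sample rows are assumptions made for the example.

```python
import pandas as pd


def drop_incomplete_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows that have both a comment and a game name."""
    # "comment" and "game" are assumed column names, not taken from the dataset.
    return df.dropna(subset=["comment", "game"]).reset_index(drop=True)


# Tiny made-up frame for illustration only.
sample = pd.DataFrame(
    {
        "game": ["Game A", None, "Game C"],
        "platform": ["Platform X", "Platform X", "Platform Y"],
        "comment": ["Great fun", "Too short", None],
    }
)

clean = drop_incomplete_rows(sample)
print(len(sample), "rows before,", len(clean), "rows after")
```

Only the row that has both a game name and a comment survives. Other columns may still contain missing values, since the study only excludes on these two fields.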

The model used in this study was BERT, a Large Language Model originally created by Google and then fine-tuned by the Hugging Face community to classify sentiment into five categories, from 1 (very negative) to 5 (very positive). This pretrained model has a reported accuracy of 95% when a prediction counts as correct if it is within 1 point of the rating given by a human.
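To make the "within 1 point" accuracy concrete, here is a small sketch of how such a metric can be computed. The function name and the sample ratings are hypothetical and only illustrate the definition; they are not the evaluation that produced the 95% figure.

```python
def accuracy_within(predicted, actual, tolerance=0):
    """Share of predictions that differ from the human rating by at most `tolerance`."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one rating is required")
    hits = sum(abs(p - a) <= tolerance for p, a in zip(predicted, actual))
    return hits / len(predicted)


# Made-up ratings on the 1 to 5 scale, for illustration only.
human = [5, 4, 1, 3, 2]
model = [5, 5, 3, 3, 1]

exact = accuracy_within(model, human)                    # only exact matches count
off_by_one = accuracy_within(model, human, tolerance=1)  # a miss by 1 point still counts
print(exact, off_by_one)
```

In this toy sample the exact accuracy is 0.4 while the within-1-point accuracy is 0.8, which shows why the tolerant metric is always at least as high as the exact one.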

The Python library Pandas was used for data manipulation, and Matplotlib and Seaborn for visualization. The BERT model was loaded through Hugging Face's Transformers library, with PyTorch as the backend.
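A scoring step of this kind could look like the sketch below. The checkpoint name is an assumption: the text does not name the model, and "nlptown/bert-base-multilingual-uncased-sentiment" is used here only because it is a public five-class BERT sentiment model whose labels have the form "4 stars". The helper names are hypothetical.

```python
def label_to_score(label: str) -> int:
    """Turn a label such as '4 stars' into the integer 4."""
    return int(label.split()[0])


def score_comments(
    comments,
    model_name="nlptown/bert-base-multilingual-uncased-sentiment",  # assumed checkpoint
):
    """Return one integer score (1 to 5) per comment."""
    # Imported here so label_to_score can be used without the heavy dependency.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis", model=model_name)
    # truncation=True cuts comments that exceed the model's input length.
    results = classifier(list(comments), truncation=True)
    return [label_to_score(result["label"]) for result in results]


if __name__ == "__main__":
    # Downloads the model on first use.
    print(score_comments(["Best game I have ever played.", "Boring and full of bugs."]))
```

Calling `score_comments` requires the `transformers` and `torch` packages and a network connection for the first download. For a dataset of this size, it is usually worth scoring the comments in batches rather than in one call.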



Results


The overall result for the game industry is shown in the figure below. More than 50% of comments have a score of 4 or 5, which suggests a positive reputation among consumers.



score_general



The ranking of platforms by average sentiment score can be read as a measure of how good the relationship is between each platform and its customers. Two conclusions can be drawn from this ranking.

Only 0.86 points separate the first platform (PlayStation) from the last (PC), so the differences between platforms are small.

The top three platforms are PlayStation, Nintendo 64, and Dreamcast, all from the 5th and 6th console generations, which span from 1993 to 2005, in the midst of what some call the golden age of video games. Among modern consoles, only the Nintendo Switch ranks in the top 10.



score_platform
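A ranking like the one above can be computed with a group-by on the scored comments. The sketch below assumes a frame with hypothetical "platform" and "score" columns and uses made-up values. It also computes the gap between the first and last platform, the same quantity as the 0.86 points quoted above.

```python
import pandas as pd


def rank_platforms(df: pd.DataFrame) -> pd.Series:
    """Average sentiment score per platform, best first."""
    # "platform" and "score" are assumed column names.
    return df.groupby("platform")["score"].mean().sort_values(ascending=False)


# Made-up scores for illustration only.
scored = pd.DataFrame(
    {
        "platform": ["Platform A", "Platform A", "Platform B", "Platform B", "Platform C"],
        "score": [5, 4, 3, 4, 2],
    }
)

ranking = rank_platforms(scored)
spread = ranking.iloc[0] - ranking.iloc[-1]  # gap between first and last platform
print(ranking)
print("spread:", spread)
```

Platforms with very few comments can dominate such a ranking by chance, so it may be sensible to also report the number of comments behind each average.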

The top 3 consoles share a pattern: in each of them, more than 60% of comments fall in the highest sentiment class (5). Compared with PlayStation, Nintendo 64 and Dreamcast have a smaller share of comments with a score of 4 and a larger share with a score of 1, with less change in the middle scores.



score_ps
score_nd
score_dc
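The per-console distributions can be expressed as the share of each score within a platform. The following sketch uses `pandas.crosstab` with row normalization. The column names and the sample values are assumptions for illustration.

```python
import pandas as pd


def score_shares(df: pd.DataFrame) -> pd.DataFrame:
    """Share of each score within every platform (each row sums to 1)."""
    # "platform" and "score" are assumed column names.
    return pd.crosstab(df["platform"], df["score"], normalize="index")


# Made-up scores for illustration only.
scored = pd.DataFrame(
    {
        "platform": ["Platform A"] * 5 + ["Platform B"] * 4,
        "score": [5, 5, 5, 5, 1, 5, 5, 4, 1],
    }
)

shares = score_shares(scored)
print(shares.round(2))
```

Reading across a row gives the distribution for one platform, so the column for score 5 can be compared directly with the 60% threshold mentioned above.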

The top 10 games for each of these 3 consoles suggest that game genre is related to the sentiment score: 20 of the 30 games (two thirds) in these rankings are arcade/sports games (6 on PlayStation, 7 on Nintendo 64, and 7 on Dreamcast). This makes the genre a clear candidate for projects a company should invest in.



score_game_ps
score_game_nd
score_game_dc

Conclusion


PlayStation and Nintendo 64 have proven to be leaders in both sales and customer appreciation. Perhaps the least expected console in the top three is the Dreamcast, since it did not sell as well as the other two, yet some of its games still evoke good feelings among the public. This could be a good sign for remaking them.

These results could be used to prioritize projects, favoring sports and arcade games, which appear to carry less risk than other genres.

Large Language Models can help extract how people perceive your products after using them, so that strategic decisions are supported by one of the most important aspects of marketing: feelings.




© 2024 Bryan Morales