Text Analytics meets 2014 World Cup tweets – Part 2
In the first part of our World Cup 2014 blog series, we analyzed 30 million tweets about the biggest sporting event in the world, FIFA World Cup 2014, collected between June 6th and July 14th. We looked at some high-level associations and insights about the tournament: in a nutshell, we observed a repeating pattern of spikes in tweet volume around match times and important events. In this second part of the series, we’re going to dive deeper into the tweets and analyze their content using our very own Text Analysis API and RapidMiner to get a more in-depth view of the data.
We’re using the same datasets that were used in part 1 (tweets.csv) plus a new dataset called tweets-sentiment.csv, which contains the sentiment polarity and subjectivity results obtained using our Sentiment Analysis API.
Top hashtags and mentions
Let’s start our analysis by finding the most popular hashtags and @mentions from the tournament. To do this, we tokenize the tweets and sort the tokens by frequency:
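As a rough illustration of the tokenize-and-count step, here is a minimal Python sketch. It is not the RapidMiner process used for this post: the regular expression, the top_tags helper and the sample tweets are all made up for the example, and we assume each tweet is available as a plain string.

```python
import re
from collections import Counter

# Matches #hashtags and @mentions; the lookbehind skips things like e-mail addresses.
TAG_PATTERN = re.compile(r"(?<!\w)[#@]\w+")

def top_tags(tweets, n=10):
    """Return the n most frequent hashtags and mentions, ignoring case."""
    counts = Counter()
    for text in tweets:
        # Count each tag once per tweet so a tag repeated within
        # a single tweet does not inflate its total.
        counts.update({tag.lower() for tag in TAG_PATTERN.findall(text)})
    return counts.most_common(n)

# Made-up sample tweets, for illustration only.
sample = [
    "What a goal! #WorldCup #BRA",
    "@FIFAcom confirms the lineup #WorldCup",
    "#worldcup fever, thanks @FIFAcom",
]
print(top_tags(sample, n=2))  # [('#worldcup', 3), ('@fifacom', 2)]
```

Lower-casing the tokens merges variants such as #WorldCup and #worldcup, which is usually what you want when ranking tags by popularity.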
We’re now going to look at the polarity values (“positive” or “negative”) of these tweets to see how they differ between entities and how they change over time in response to various events.
Note: we are only analyzing English tweets for the following examples, which introduces a sampling bias. The following charts and insights are based on the opinions of the English-speaking Twitter users.
Sentiment over time
Different events concerning players or teams affect how people think and talk about them. Using polarity analysis, we can get an idea of people’s reactions to various events, which can provide valuable insight. Let’s look at two major talking points from the tournament as examples: Luis Suarez and Brazil’s shocking performance.
1. On June 24th, Uruguayan Luis Suarez was widely accused of biting Italy defender Giorgio Chiellini, which was followed by a big wave of negative comments and feedback on social media. Suarez issued an apology on June 30th, which seems to have been satisfactory for the Twitter community (take note PR people!):
2. Brazil had arguably one of its worst performances in World Cup history during the 2014 tournament. This is pretty evident when you analyze the sentiment of tweets about #BRA after every lost match or controversial win:
Before the 3rd place playoff game between Brazil and the Netherlands, people were hopeful that the catastrophic loss against Germany might bring the best out of the Brazilians. However, a few minutes into the game it was pretty clear that this was not going to be the case:
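If you want to build a similar timeline yourself, the sketch below shows one way to do it in Python. It makes several assumptions that are not taken from our datasets: each row is a dictionary with hypothetical created_at, text and polarity fields, the timestamp format is the one shown, and a tweet is considered to be “about” an entity if it contains a keyword. The sample rows are invented.

```python
from collections import defaultdict
from datetime import datetime

def daily_positive_share(rows, keyword):
    """Map each day (ISO date string) to the share of matching tweets labelled positive."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        # Naive entity matching: a case-insensitive substring test.
        if keyword.lower() not in row["text"].lower():
            continue
        # Assumed timestamp format; adjust it to match your own data.
        day = datetime.strptime(row["created_at"], "%Y-%m-%d %H:%M:%S").date().isoformat()
        totals[day] += 1
        if row["polarity"] == "positive":
            positives[day] += 1
    return {day: positives[day] / totals[day] for day in sorted(totals)}

# Invented rows, for illustration only.
rows = [
    {"created_at": "2014-06-24 18:05:00", "text": "Did Suarez really bite him?", "polarity": "negative"},
    {"created_at": "2014-06-24 18:07:00", "text": "Suarez should be banned", "polarity": "negative"},
    {"created_at": "2014-06-30 20:00:00", "text": "Fair play to Suarez for apologising", "polarity": "positive"},
    {"created_at": "2014-06-30 20:10:00", "text": "Still not convinced by suarez", "polarity": "negative"},
    {"created_at": "2014-06-30 21:00:00", "text": "Great game tonight", "polarity": "positive"},
]
print(daily_positive_share(rows, "suarez"))  # {'2014-06-24': 0.0, '2014-06-30': 0.5}
```

Plotting the resulting shares against the dates gives a curve where events such as an apology or a heavy defeat show up as jumps or dips. Grouping by hour instead of by day would be the natural choice for tracking sentiment during a single match.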
Popularity by sentiment
We can use the average polarity measures for various entities to see how positively or negatively people talk about them.
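One plausible way to compute such an average, sketched in Python, is to score each “positive” tweet as +1 and each “negative” tweet as -1 and take the mean per entity. This scoring scheme is an assumption made for the example, not necessarily how the charts in this post were produced; the field names, keyword lists and sample rows are hypothetical as well.

```python
from statistics import mean

# Assumed numeric mapping for the polarity labels.
POLARITY_SCORE = {"positive": 1.0, "negative": -1.0}

def average_polarity(rows, entities):
    """Return {entity: mean score} over tweets mentioning any of the entity's keywords."""
    scores = {name: [] for name in entities}
    for row in rows:
        score = POLARITY_SCORE.get(row["polarity"])
        if score is None:
            continue  # skip labels outside the assumed mapping
        text = row["text"].lower()
        for name, keywords in entities.items():
            # Naive substring matching; a tweet can count for several entities.
            if any(keyword.lower() in text for keyword in keywords):
                scores[name].append(score)
    # Entities without any matching tweets are left out of the result.
    return {name: mean(values) for name, values in scores.items() if values}

# Invented rows and keyword lists, for illustration only.
rows = [
    {"text": "#GER were brilliant tonight", "polarity": "positive"},
    {"text": "Germany deserved that win", "polarity": "positive"},
    {"text": "#BRA fell apart completely", "polarity": "negative"},
    {"text": "Still proud of Brazil", "polarity": "positive"},
]
entities = {"Germany": ["#ger", "germany"], "Brazil": ["#bra", "brazil"]}

result = average_polarity(rows, entities)
ranking = sorted(result.items(), key=lambda item: item[1], reverse=True)
print(ranking)  # [('Germany', 1.0), ('Brazil', 0.0)]
```

With this mapping the average ranges from -1 (all negative) to +1 (all positive). Keep in mind that entities with only a handful of tweets will produce noisy averages, so a minimum tweet count per entity is worth considering.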
Average polarity for the 16 teams that qualified for the second round:
Average polarity for the top 10 scorers as well as two noteworthy players, Tim Howard of USA and Luis Suarez of Uruguay:
Most ‘polar’ hashtags and mentions
Finally, let’s look at some of the most positive and negative hashtags and mentions:
Analyzing the sentiment of tweets gives an extraordinary view into the opinions of the public in relation to a certain topic or event. Listening to “social chatter” allows you to extract detailed insight into opinions and trends on brands, companies, events, football teams, etc., and how they change over time with, say, the launch of a product, a company announcement, a crisis event or, in the case above, a footballer biting another player.
In Suarez’s case, his “brand” took a major hit and “social chatter” about him turned pretty sour following the biting incident. However, his PR team’s involvement and his deal at Barcelona allowed him to bounce back quite quickly, as shown by the switch in polarity of tweets about him.
To learn more about Sentiment Analysis, check out our recent blog posts. If you are interested in analyzing the sentiment of text, tweets, comments or reviews, you can get free access to our Text Analysis API.