
Understanding Customer Frustrations in the Airline Industry with Aspect-based Sentiment Analysis

Every day, over 100,000 flights carry passengers to and from destinations all around the world, and it’s safe to say air travel brings out a fairly mixed bag of emotions in people. Through social media, customers now have a platform to say exactly what’s on their mind while they are traveling, creating a real-time stream of customer opinion on social networks.

If you follow this blog you’ll know that we regularly use Natural Language Processing to get insights into topical subjects ranging from the US Presidential Election to the Super Bowl ad battle. In this post, we thought it would be interesting to collect and analyze Tweets about airlines to see how passengers use Twitter as a platform to voice their opinion. We wanted to compare how often some of the better-known airlines are mentioned by travelers on Twitter, what the general sentiment of those mentions was, and how people’s sentiment varied when they were talking about different aspects of air travel.

Collecting Tweets

We chose five airlines and gathered 25,000 of the most recent Tweets mentioning them (from Friday, June 9). We chose the most recent Tweets in order to get a snapshot of what people were talking about at a single point in time.

Airlines

The airlines we chose were:

  1. American Airlines – the largest American airline
  2. Lufthansa – the largest European airline
  3. Ryanair – a low-fares giant that is always courting publicity
  4. United Airlines – an American giant that is always (inadvertently) courting publicity
  5. Aer Lingus – naturally (we’re Irish).

Analysis

We’ll cover the following analyses:

  • Volume of tweets and mentions
  • Document-Level Sentiment Analysis
  • Aspect-based Sentiment Analysis

Tools used

Sentiment Analysis

Sentiment analysis, also known as opinion mining, allows us to use computers to analyze the sentiment of a piece of text. Essentially, it gives us an idea of whether a piece of text is positive, negative or neutral.

For example, below is a chart showing the sentiment of Tweets we gathered that mentioned our target airlines.

This chart shows us a very high-level summary of people’s opinions towards each airline. You can see that the sentiment is generally more negative than positive, particularly in the case of the two US-based carriers, United and American. We can also see that negative Tweets account for a larger share of Ryanair’s Tweets than any other airline. While this gives us a good understanding of the public’s opinion about these airlines at the time we collected the Tweets, it doesn’t tell us much about what exactly people were speaking positively or negatively about.

Aspect-based Sentiment Analysis digs in deeper

So sentiment analysis can tell us what the sentiment of a piece of text is. But text produced by people usually talks about more than one thing and often has more than one sentiment. For example, someone might write that they didn’t like how a car looked but did like how quiet it was, and a document-level sentiment analysis model would just look at the entire document and add up whether the overall sentiment was mostly positive or negative.

This is where Aspect-based Sentiment Analysis comes in, as it goes one step further and analyzes the sentiment attached to each subject mentioned in a piece of text. This is especially valuable since it allows you to extract richer insights about text that might be a bit complicated.
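To make the difference concrete, here is a minimal sketch in Python. It is not how our models work: the tiny word lists (`POLARITY`, `ASPECTS`) and the sentence-by-sentence scoring are assumptions made purely for illustration. It only shows why a single document-level score can hide two opposing opinions.

```python
import re

# Toy lexicons, invented for this illustration only.
POLARITY = {"quiet": 1, "comfortable": 1, "uncomfortable": -1, "ugly": -1}
ASPECTS = {"engine": "engine", "seats": "seats", "looks": "design"}


def tokenize(sentence):
    """Lowercase the words and strip surrounding punctuation."""
    return [w.strip(".,!?").lower() for w in sentence.split()]


def label(score):
    """Map a numeric score to a polarity label."""
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"


def document_sentiment(text):
    """One label for the whole text: opposing opinions cancel out."""
    return label(sum(POLARITY.get(w, 0) for w in tokenize(text)))


def aspect_sentiment(text):
    """One label per aspect, scored from the sentence that mentions it."""
    results = {}
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = tokenize(sentence)
        score = sum(POLARITY.get(w, 0) for w in words)
        for w in words:
            if w in ASPECTS:
                results[ASPECTS[w]] = label(score)
    return results


text = "This car's engine is as quiet as hell. But the seats are so uncomfortable!"
print(document_sentiment(text))  # neutral
print(aspect_sentiment(text))    # {'engine': 'positive', 'seats': 'negative'}
```

The document-level view comes out neutral because the praise and the complaint cancel each other, while the aspect-level view keeps both opinions separate.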

Here’s an example of our Aspect-based Sentiment Analysis demo analyzing the following piece of text: “This car’s engine is as quiet as hell. But the seats are so uncomfortable!”

absa screenshot 1

It’s clear that Aspect-based Sentiment Analysis can provide more granular insight into the polarity of a piece of text but another problem you’ll come across is context. Words mean different things in different contexts – for instance quietness in a car is a good thing, but in a restaurant it usually isn’t – and computers need help understanding that. With this in mind we’ve tailored our Aspect-based Sentiment Analysis feature to recognize aspects in four industries: restaurants, cars, hotels, and airlines.

So while the example above was analyzing the car domain, below is the result of an analysis of a review of a restaurant, specifically the text “It’s as quiet as hell in this restaurant”:

absa screenshot 2

Even though the text was quite similar to the car review, the model recognized that the words expressed a different sentiment because they were mentioned in a different context.
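The idea of domain-dependent polarity can be sketched with a simple lookup. The `DOMAIN_POLARITY` table and the `word_polarity` helper below are hypothetical and far cruder than a trained model; they only illustrate that the same word can be scored differently depending on the domain it appears in.

```python
# Hypothetical per-domain lexicons: the same word can carry opposite polarity.
DOMAIN_POLARITY = {
    "cars": {"quiet": "positive", "uncomfortable": "negative"},
    "restaurants": {"quiet": "negative", "tasty": "positive"},
}


def word_polarity(word, domain):
    """Look up a word's polarity within a domain, defaulting to neutral."""
    return DOMAIN_POLARITY.get(domain, {}).get(word.lower(), "neutral")


print(word_polarity("quiet", "cars"))         # positive
print(word_polarity("quiet", "restaurants"))  # negative
```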

Aspect-based Sentiment Analysis in airlines

Now let’s see what we can find in the Tweets we collected about airlines. In the airlines domain, our endpoint recognizes 10 different aspects that people are likely to mention when talking about their experience with airlines.

absa airlines domain

Before we look at how people felt about each of these aspects, let’s take a look at which aspects they were actually talking about the most.

Noise is a big problem when you’re analyzing social media content. For instance, when we analyzed our 25,000 Tweets, we found that roughly 60% had no mention of the aspects we’ve listed above. These Tweets mainly focused on things like online competitions, company marketing material or even jokes about the airlines. When we filtered these noisy Tweets out, we were left with 9,957 Tweets which mentioned one or more aspects.
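A filtering step like this could look something like the sketch below. The record layout (a `text` field plus an `aspects` list) is an assumption for illustration and not the exact shape of our API response, and the sample Tweets are made up.

```python
# Made-up sample records; "aspects" holds whatever aspects were detected.
tweets = [
    {"text": "Two hour delay and no updates", "aspects": [{"aspect": "punctuality", "polarity": "negative"}]},
    {"text": "RT to win two free flights!", "aspects": []},
    {"text": "Cabin crew were lovely today", "aspects": [{"aspect": "staff", "polarity": "positive"}]},
    {"text": "New route announced for summer", "aspects": []},
    {"text": "lol this airline meme", "aspects": []},
]


def keep_aspect_tweets(tweets):
    """Keep only the Tweets in which at least one aspect was detected."""
    return [t for t in tweets if t.get("aspects")]


kept = keep_aspect_tweets(tweets)
noise_share = 1 - len(kept) / len(tweets)
print(f"{len(kept)} of {len(tweets)} kept, {noise_share:.0%} filtered out as noise")
```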

The chart below shows which of the 10 aspects were mentioned the most.

On one hand it might come as a surprise to see aspects like food and comfort mentioned so infrequently – when you think about people giving out about airlines you tend to think of them complaining about food or the lack of legroom. On the other hand, it’s no real surprise to see aspects like punctuality and staff mentioned so much.

You could speculate that comfort and food are pretty standard across airlines (nobody expects a Michelin-starred airline meal), but punctuality can vary, so people can be let down by this (when your flight is late it’s an unpleasant surprise, which you would be more likely to Tweet about).
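If you want to produce a ranking of aspect mentions yourself, a sketch along these lines would do it. The records are invented and the `aspects` field layout is an assumption; here each aspect is counted at most once per Tweet.

```python
from collections import Counter

# Invented sample: each record lists the aspects detected in one Tweet.
tweets = [
    {"aspects": [{"aspect": "staff", "polarity": "negative"},
                 {"aspect": "punctuality", "polarity": "negative"}]},
    {"aspects": [{"aspect": "staff", "polarity": "positive"}]},
    {"aspects": [{"aspect": "food", "polarity": "neutral"}]},
    {"aspects": [{"aspect": "punctuality", "polarity": "negative"}]},
    {"aspects": [{"aspect": "staff", "polarity": "negative"}]},
]


def aspect_mentions(tweets):
    """Count how many Tweets mention each aspect (once per Tweet)."""
    counts = Counter()
    for tweet in tweets:
        counts.update({a["aspect"] for a in tweet["aspects"]})
    return counts


for aspect, n in aspect_mentions(tweets).most_common():
    print(aspect, n)
```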

What people thought about each airline on key aspects

Now that we know what people were talking about, let’s take a look at how they felt. We’re going to look at how each airline performed on four interesting aspects:

  1. Staff – the most-mentioned aspect;
  2. Punctuality – to see which airline receives the best and worst sentiment for delays;
  3. Food – infrequently mentioned but a central part of the in-flight experience;
  4. Luggage – which airline gets the most Tweets about losing people’s luggage?
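For readers who want to build this kind of per-airline breakdown from their own data, here is a rough sketch. The field names (`airline`, `aspects`, `polarity`) and the placeholder airlines are assumptions for illustration, and the numbers are invented rather than taken from our results.

```python
from collections import Counter, defaultdict

# Invented records with placeholder airline names.
tweets = [
    {"airline": "Airline A", "aspects": [{"aspect": "staff", "polarity": "negative"}]},
    {"airline": "Airline A", "aspects": [{"aspect": "staff", "polarity": "negative"}]},
    {"airline": "Airline A", "aspects": [{"aspect": "staff", "polarity": "negative"}]},
    {"airline": "Airline A", "aspects": [{"aspect": "staff", "polarity": "positive"}]},
    {"airline": "Airline B", "aspects": [{"aspect": "staff", "polarity": "positive"}]},
    {"airline": "Airline B", "aspects": [{"aspect": "staff", "polarity": "negative"}]},
    {"airline": "Airline B", "aspects": [{"aspect": "food", "polarity": "positive"}]},
]


def polarity_shares(tweets, aspect):
    """Per airline, the share of each polarity among mentions of one aspect."""
    counts = defaultdict(Counter)
    for tweet in tweets:
        for a in tweet["aspects"]:
            if a["aspect"] == aspect:
                counts[tweet["airline"]][a["polarity"]] += 1
    shares = {}
    for airline, c in counts.items():
        total = sum(c.values())
        shares[airline] = {p: c[p] / total for p in ("positive", "neutral", "negative")}
    return shares


staff = polarity_shares(tweets, "staff")
for airline, share in staff.items():
    print(airline, share)
```

Working with shares rather than raw counts keeps airlines with very different Tweet volumes comparable.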

Staff

We saw in the mentions graph above that people mentioned staff the most when tweeting about an airline. You can see from the graph below that people are highly negative about airline staff in general, with a fairly equal level of negativity towards each airline except Lufthansa, which actually receives more positive sentiment than negative.


Punctuality

People’s second biggest concern was punctuality, and you can see below that the two US-based airlines score particularly badly on this aspect. Also, it’s worth noting that while Ryanair receives very negative sentiment in general, people complain about Ryanair’s punctuality less than any of the other airlines. This isn’t too surprising considering their exemplary punctuality record is one of their major unique selling points (USPs) as an airline and something they like to publicize.


Food

We all know airline food isn’t the best, but when we looked at the sentiment about food in the Tweets, we found that people generally weren’t that vocal about their opinions on plane food. Lufthansa receives the most positive sentiment about this aspect, with their pretty impressive culinary efforts paying off. However, it’s an entirely different story when it comes to the customer reaction towards United’s food. None of us here in the AYLIEN office have ever flown United, so from the results we got we’re all wondering what they’re feeding their passengers now.


Luggage

The last aspect that we compared across the airlines was luggage. When you take a look at the sentiment here, you can see that again Lufthansa perform quite well, but in this one Aer Lingus fares pretty badly. Maybe leave your valuables at home next time you fly with Ireland’s national carrier.

Ryanair and Lufthansa compared

So far we’ve shown just four of the 10 aspects our Aspect-based Sentiment Analysis feature analyzes in the airlines domain. To show all of them together, we decided to take two very different airlines and put them side by side to see how people’s opinions on each of them compared.

We picked Ryanair and Lufthansa so you can compare a “no frills” budget airline that focuses on short-haul flights, with a more expensive, higher-end offering and see what people Tweet about each.

First, here’s the sentiment that people showed towards every aspect in Tweets that mention Lufthansa.

Below is the same analysis of Tweets that mention Ryanair.

You can see that people express generally more positive sentiment towards Lufthansa than Ryanair. This is no real surprise since this is a comparison of a budget airline with a higher-end competitor, and you would expect people’s opinions to differ on things like food and flight experience.

But it’s interesting to note the sentiment was actually pretty similar towards the two core aspects of air travel – punctuality and value.

The most obvious outlier here is the overwhelmingly negative sentiment about entertainment on Ryanair flights, especially since there is no entertainment on Ryanair flights. This spike in negativity was due to an incident involving drunk passengers on a Ryanair flight that was covered by the media on the day we gathered our Tweets, skewing the sentiment in the Tweets we collected. These temporary fluctuations are a problem inherent in looking at snapshot-style data samples, but from a voice-of-the-customer point of view they are certainly something an airline needs to be aware of.

This is just one example of how you can use our Text Analysis API to extract meaning from content at a large scale. If you’d like to use AYLIEN to extract insights from any text you have in mind, click on the image at the end of the post to get free access to the API and start analyzing your data. With extensive documentation and how-to blogs, as well as detailed tutorials and great customer support, you’ll have all the help you need to get going in no time!





Text Analysis API - Sign up




Author



Will Gannon

Content Marketing @ AYLIEN

A Classics graduate from UCD, Will is on our Content Marketing Team here at AYLIEN. Before joining us, Will worked in research before completing a Master’s in Digital Humanities at Trinity College, where he used NLP methods to index where Latin terms appear in English Literature.