

Here at AYLIEN we spend our days creating cutting-edge NLP and Text Analysis solutions such as our Text Analysis API and News API to help developers build powerful applications and processes.

We understand, however, that not everyone has the programming knowledge required to use APIs, and this is why we created our Text Analysis Add-on for Google Sheets – to bring the power of NLP and Text Analysis to anyone who knows how to use a simple spreadsheet.

Today we want to show you how you can build an intelligent sentiment analysis tool with zero coding using our Google Sheets Add-on and a free service called IFTTT.

Here’s what you’ll need to get started;

What is IFTTT?

IFTTT stands for If This, Then That. It is a free service that enables you to automate specific tasks by triggering actions on apps when certain criteria are met. For example, “if the weather forecast predicts rain tomorrow, notify me by SMS”.

Step 1 – Connect Google Drive to IFTTT

  • Log in to your IFTTT account
  • Search for, and select, Google Drive
  • Click Connect and enter your Google login information

Step 2 – Create Applets in IFTTT

Applets are the processes you create to trigger actions based on certain criteria. It’s really straightforward. You define the trigger (the ‘If’) and then the action (the ‘That’). In our previous weather-SMS example, the ‘if’ is a rain status within a weather app, and the ‘that’ is a text message that gets sent to a specified cell phone number.

To create an applet, go to My Applets and click New Applet.

Here’s what you’ll see. Click the blue +this

Screen Shot 2016-12-02 at 14.48.41

You will then be shown a list of available apps. In this case, we want to source specific tweets, so select the Twitter app.

You will then be asked to choose a trigger. Select New tweet from search.

You can now define exactly what tweets you would like to source, based on their content. You can be quite specific with your search using Twitter’s search operators, which we’ve listed below;

Twitter search operators

To search for specific words, hashtags or languages

  • Tweets containing all words in any position (“Twitter” and “search”)  
  • Tweets containing exact phrases (“Twitter search”)
  • Tweets containing any of the words (“Twitter” or “search”)
  • Tweets excluding specific words (“Twitter” but not “search”)
  • Tweets with a specific hashtag (#twitter)
  • Tweets in a specific language (written in English)

To search for specific people or accounts

  • Tweets from a specific account (Tweeted by “@TwitterComms”)
  • Tweets sent as replies to a specific account (in reply to “@TwitterComms”)
  • Tweets that mention a specific account (Tweet includes “@TwitterComms”)

To exclude Retweets and/or links

  • To exclude Retweets (“-rt”)
  • To exclude links/URLs (“-http”) and (“-https”)
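For readers who like to prepare their search text before pasting it into IFTTT, here is a minimal Python sketch that assembles a search string from the operators listed above. The function name `build_search` and its parameters are made up for illustration; IFTTT itself only needs the final string, and how Twitter groups OR terms with exclusions is worth checking against your own results.

```python
def build_search(phrases, exclude_words=(), no_retweets=True, no_links=True):
    """Join exact phrases with OR, then append exclusion operators."""
    # Exact phrases are wrapped in double quotes and combined with OR.
    query = " OR ".join(f'"{phrase}"' for phrase in phrases)
    # A leading minus sign excludes a word.
    parts = [query] + [f"-{word}" for word in exclude_words]
    if no_retweets:
        parts.append("-rt")
    if no_links:
        parts.extend(["-http", "-https"])
    return " ".join(parts)


print(build_search(["bad santa 2 is", "bad santa 2 was"]))
# prints: "bad santa 2 is" OR "bad santa 2 was" -rt -http -https
```

The resulting string can be pasted straight into the search field of the New tweet from search trigger.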

Our first trigger

We’re going to search for tweets that mention “bad santa 2 is” or “bad santa 2 was”. Why are we searching for these terms? Well, we find that original, opinionated tweets generally use either one of these phrases. It also helps to cut out tweets that contain no opinion (neutral sentiment) such as the one below;


Our goal with this tool is to analyze the viewer reaction to “Bad Santa 2”, which means tweets such as this one aren’t particularly interesting to us in this case. However, if we wanted to assess the overall buzz on Twitter about Bad Santa 2, we might simply look for any mention at all and concentrate on the volume of tweets.

And so, here’s our first trigger.

Screen Shot 2016-12-07 at 10.53.35

Click Create Trigger when you’re happy with your search. You will then see the following;

Screen Shot 2016-12-01 at 17.35.29

Notice how the Twitter icon has been added. Now let’s choose our action. Click the blue +that

Next, search for or select Google Drive. You will then be given 4 options – select Add row to spreadsheet. This action will add each matching tweet to an individual row in Google Sheets.

Next, give the spreadsheet a name. We simply went for ‘Bad Santa 2’. Click Create Action. You will then be able to review your applet. Click Finish when you are happy with it.

Done! Tweets that match your search criteria will start appearing in an auto-generated Google Sheet within minutes. Now you can go through this process again to create a second applet. We chose another movie, Allied (“Allied was” or “Allied is”).

Here is an example of what you can expect to see accumulate in your Google Sheet;

Screen Shot 2016-12-02 at 17.59.39

Note: When you install our Google Sheets Add-on we’ll give you 1,000 credits to use for free. You then have the option to purchase additional credits should you wish to. For this example, we will stay within the free range and analyze 500 tweets for each movie. You may choose to use more or fewer, depending on your preference.

Step 3 – Clean your data

Because of the nature of Twitter, you’re probably going to find a lot of crap and spammy tweets in your spreadsheet. To minimize the number of these tweets that end up in your final data set, there are a few things we recommend you do;

Sort your tweets alphabetically

By sorting your tweets alphabetically, you can quickly scroll down through your spreadsheet and easily spot multiples of the same tweet. It’s a good idea to delete these duplicates: not only do they skew your overall results, they can often point to bot or spam activity on Twitter. To sort your tweets alphabetically, select the entire column, select Data and then Sort sheet by column B, A-Z.

AYLIEN Start Analysis

Remove retweets (if you haven’t already done so)

Alphabetically sorting your tweets will also list all retweets together (beginning with RT). You may or may not want to include retweets, but this is entirely up to you. We decided to remove all retweets because there are so many bots out there auto-retweeting and we felt that using this duplicate content isn’t exactly opinion mining.

Search and filter certain words

Think about the movie(s) you are searching for and how their titles may be used in different contexts. For example, we searched for tweets mentioning ‘Allied’, and while we used Twitter’s search operators to exclude words like forces, battle and treaty, we noticed a number of tweets about a company named ‘Allied’. By searching for their company Twitter handle, we could highlight and delete the tweets in which they were mentioned.

NB: Remove movie title from tweets

Before you move on to Step 4 and analyze your tweets, it is important to remove the movie title from each tweet, as it may affect the quality of your results. For example, our tweet-level sentiment analysis feature will read ‘Bad Santa 2…’ in a tweet and may assign negative sentiment because of the inclusion of the word bad.

To remove all mentions of your chosen movie title, simply use Edit > Find and replace in Google Sheets.
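The clean-up in Step 3 is easy to do by hand in Google Sheets, but if you export your tweets and prefer a script, the same three ideas (drop retweets, drop duplicates, strip the movie title) can be sketched in Python roughly as follows. The function name `clean_tweets` and the sample tweets are invented for illustration, and the duplicate check (ignoring case and extra spaces) is an assumption about what counts as "the same tweet".

```python
import re


def clean_tweets(tweets, title):
    """Return tweets without retweets or duplicates, with the title removed."""
    title_pattern = re.compile(re.escape(title), re.IGNORECASE)
    seen = set()
    cleaned = []
    for text in tweets:
        # Retweets conventionally begin with "RT".
        if text.strip().upper().startswith("RT "):
            continue
        # Treat tweets that differ only by case or spacing as duplicates.
        key = " ".join(text.lower().split())
        if key in seen:
            continue
        seen.add(key)
        # Remove the title, then collapse any doubled spaces left behind.
        without_title = title_pattern.sub("", text)
        cleaned.append(" ".join(without_title.split()))
    return cleaned


sample = [
    "Bad Santa 2 is hilarious",
    "RT @someone: Bad Santa 2 is hilarious",
    "bad santa 2 is  hilarious",
    "Bad Santa 2 was a letdown",
]
print(clean_tweets(sample, "Bad Santa 2"))
# prints: ['is hilarious', 'was a letdown']
```

Filtering out tweets about an unrelated company, as described above for ‘Allied’, would be one more `continue` condition on the account handle you want to exclude.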

Step 4 – Analyze your tweets

Now comes the fun part! It’s time to analyze your tweets using the AYLIEN Text Analysis Add-on. If you have not yet installed the Add-on, you can do so here.

Using our Add-on couldn’t be easier. Simply select the column containing all of your tweets, then click Add-ons > Text Analysis.

Select sentiment

To find out whether our tweets have been written in a positive, neutral or negative way, we use Sentiment Analysis.

Note: While Sentiment Analysis is a complex and fascinating field in NLP and Machine Learning research, we won’t get into it in too much detail here. Put simply, it enables you to establish the sentiment polarity (whether a piece of text is positive, negative or neutral) of large volumes of text, with ease.

Next, click the drop-down menu and select Sentiment Analysis > Analyze.

Each tweet will then be analyzed for subjectivity (whether it is written subjectively or objectively) and sentiment polarity (whether it is written in a positive, negative or neutral manner). You will also see a confidence score for both subjectivity and sentiment. This tells you how confident we are that the assigned label (positive, negative, objective, etc) is correct.

By repeating this process for our Allied tweets, we can then compare our results and find out which movie has been best received by Twitter users.

Step 5 – Compare & visualize

In total we analyzed 1,000 tweets, 500 for each movie. Through a simple count of positive, negative and neutral tweets, we received the following results;

Bad Santa 2

Positive – 170

Negative – 132

Neutral – 198

Allied

Positive – 215

Negative – 91

Neutral – 194

Now to generate a percentage score for each movie. Let’s start by excluding all neutral tweets. We can then easily figure out what percentage of the remaining tweets are positive. So, for Allied, of the remaining 306 tweets, 215 were positive, giving us a positive score of 70%.

By doing the same with Bad Santa 2, we get 56%.
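The arithmetic behind these scores is simply positive / (positive + negative), with neutral tweets left out. As a small Python sketch, using the counts reported above (the function name `positive_score` is just for illustration):

```python
from collections import Counter


def positive_score(labels):
    """Percentage of non-neutral labels that are positive, rounded."""
    counts = Counter(labels)
    opinionated = counts["positive"] + counts["negative"]
    if opinionated == 0:
        return None  # nothing but neutral tweets, so no score
    return round(100 * counts["positive"] / opinionated)


# Label counts taken from the results listed in this post.
bad_santa = ["positive"] * 170 + ["negative"] * 132 + ["neutral"] * 198
allied = ["positive"] * 215 + ["negative"] * 91 + ["neutral"] * 194

print(positive_score(bad_santa), positive_score(allied))
# prints: 56 70
```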

Allied wins!

To visualize your results, use your tweet volume data to generate some charts and graphs in Google Sheets;

piecharts

Comparing our results with Rotten Tomatoes & IMDb

It’s always interesting to compare the results of your analysis with those of others. To compare ours, we went to two major movie review sites – Rotten Tomatoes & IMDb – and we were pleasantly surprised by the similarity in our results!


The image below from Rotten Tomatoes shows both critic (left) and audience (right) score for Allied. Seeing as we analyzed tweets from a Twitter audience, we are therefore more interested in the latter. Our score of 70% comes so close to that of almost 15,000 reviewers on Rotten Tomatoes – just 1% off!

Screen Shot 2016-12-07 at 15.54.03

IMDb provide an audience-based review score of 7.2/10. Again, very close to our own result.

Screen Shot 2016-12-07 at 16.08.46

Our result for Bad Santa 2, while not as close as that of Allied, was still pretty close to Rotten Tomatoes with 56%.

Screen Shot 2016-12-07 at 15.54.24

With IMDb, however, we once again come within 1% with a score of 5.7/10.

Screen Shot 2016-12-07 at 16.09.04


We hope that this simple and fun use-case using our Google Sheets Add-on will give you an idea of just how useful, flexible and simple Text Analysis can be, without the need for any complicated code.

While we decided to focus on movie reviews in this example, there are countless other uses for you to try. Here’s a few ideas;

  • Track mentions of brands or products
  • Track event hashtags
  • Track opinions towards election candidates

Ready to get started? Click here to install our Text Analysis Add-on for Google Sheets.

Text Analysis API - Sign up



Dubbed Europe’s largest technology marketplace and Davos for geeks, the Web Summit has been going from strength to strength in recent years as more and more companies, employees, tech junkies and media personnel flock to the annual event to check out the latest innovations, startups and a star-studded lineup of speakers and exhibitors.


Having grown from a small gathering of around 500 like-minded people in Dublin, this year’s event, which was held in Lisbon for the first time, topped 50,000 attendees representing 15,000 companies from 166 countries.

With such a large gathering of techies, there was bound to be a whole lot of chatter relating to the event on Twitter. So being the data geeks that we are, and before we jetted off to Lisbon ourselves, we turned our digital ears to Twitter and listened for the duration of the event to see what we could uncover.

Our process

We collected a total of just over 80,000 tweets throughout the event by focusing our search on keywords, Twitter handles and hashtags such as ‘Web Summit’, #websummit, @websummit, etc.

We used the following tools to collect, analyze and visualize the data;

And here’s what we found;

What languages were the tweets written in?

In total, we collected tweets written in 42 different languages.

Out of our 80,000 tweets, 60,000 were written in English, representing 75% of the total volume.

The pie chart below shows all languages, excluding English. As you can see, Portuguese was the next most-used language, with just under 11% of tweets being written in the host country’s native tongue. Spanish and French tweets represented around 2.5% of total volume each.

How did tweet volumes fluctuate throughout the week?

The graph below represents hourly tweet volume fluctuations throughout the week. As you can see, there are four distinct peaks.

While we can’t list all the reasons for these spikes in volume, we did find a few recurring trends during these times, which we have added to the graph;

Let’s now take a more in-depth look at each peak.

What were the causes of these fluctuations?

By adding the average hourly sentiment polarity to this graph we can start to gather a better understanding of how people felt while writing their tweets.

Not familiar with sentiment analysis? This is a feature of text analysis and natural language processing (NLP) that is used to detect positive or negative polarity in text. In short, it tells us whether a piece of text, or a tweet in this instance, has been written in a positive, negative or neutral way. Learn more.

Interestingly, each tweet volume peak correlates with a sharp drop in sentiment. What does this tell us? People were taking to Twitter to complain!
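To make the idea of an hourly volume and sentiment graph concrete, here is a rough Python sketch of how tweets could be grouped by hour and averaged. It is not the pipeline used for the graphs in this post: the +1 / 0 / -1 scoring of labels, the function name `hourly_summary` and the sample tweets are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime

# Assumed numeric scores for each polarity label.
SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def hourly_summary(tweets):
    """Map each hour to (tweet count, average sentiment score)."""
    buckets = defaultdict(list)
    for timestamp, polarity in tweets:
        # Truncate the timestamp to the start of its hour.
        hour = timestamp.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(SCORE[polarity])
    return {
        hour: (len(scores), sum(scores) / len(scores))
        for hour, scores in sorted(buckets.items())
    }


# Invented sample data: (time of tweet, sentiment label).
sample = [
    (datetime(2016, 11, 8, 9, 5), "positive"),
    (datetime(2016, 11, 8, 9, 40), "neutral"),
    (datetime(2016, 11, 8, 10, 15), "negative"),
]
for hour, (count, average) in hourly_summary(sample).items():
    print(hour, count, average)
```

Plotting the count and the average side by side for each hour is what lets you spot a volume peak that coincides with a drop in sentiment.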

Positivity overall

Overall, average sentiment remained in the positive (green) for the entire week. That dip into negative (red) that you can see came during the early hours of Day 2 as news of the US election result broke. Can’t blame the Web Summit for that one!

We can also see distinct rises in positive sentiment around the 5pm mark each day as attendees took to Twitter to reflect on an enjoyable day.

Sentiment also remained comparatively high during the later hours of each day as the Web Summit turned to Night Summit – we’ll look at this in more detail later in the post.


Mike, Afshin, Noel & Hamed after a hectic but enjoyable day at the Web Summit

What was the overall sentiment of the tweets?

The pie chart below shows the breakdown of all 80,000 tweets, split by positive, negative and neutral sentiment.

The majority of tweets (80%) were written in a neutral manner. 14% were written with positive sentiment, with the remaining 6% written negatively.

To uncover the reasons behind both the positive and negative tweets, we extracted and analyzed mentioned keywords to see if we could spot any trends.

What were the most common keywords found in positive tweets?

We used our Entity and Concept Extraction features to uncover keywords, phrases, people and companies that were mentioned most in both positive and negative tweets.

As you can imagine, there were quite a few keywords extracted from 80,000 tweets so we trimmed it down by taking the following steps;

  • Sort by mention count
  • Take the top 100 most mentioned keywords
  • Remove obvious or unhelpful keywords (Web Summit, Lisbon, Tech, etc)
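The three trimming steps above can be sketched in a few lines of Python. This assumes you already have a flat list of extracted keywords, one entry per mention; the function name `top_keywords` and the sample mentions are invented for illustration.

```python
from collections import Counter


def top_keywords(keywords, unhelpful, limit=100):
    """Sort by mention count, keep the top `limit`, then drop unhelpful words."""
    counts = Counter(keyword.lower() for keyword in keywords)
    top = counts.most_common(limit)  # sorted by mention count, highest first
    blocked = {word.lower() for word in unhelpful}
    return [(word, n) for word, n in top if word not in blocked]


# Invented sample of extracted keyword mentions.
mentions = ["WiFi", "Lisbon", "wifi", "queue", "Web Summit", "queue", "wifi"]
print(top_keywords(mentions, unhelpful=["Web Summit", "Lisbon", "Tech"]))
# prints: [('wifi', 3), ('queue', 2)]
```

Note that the unhelpful keywords are removed after taking the top 100, matching the order of the steps above, so the final list may contain fewer than 100 entries.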

And here are our results. You can hover over individual clusters to see more information.

We can see some very positive phrases here, with great, amazing, awesome, good, love and nice featuring prominently.

The most mentioned speaker from the positive tweets was Gary Vaynerchuk (@garyvee), which makes sense considering the sharp rise in positive sentiment we saw his fans produce earlier in this post on our sentiment-over-time graph.

What were the most common keywords found in negative tweets?

We took the exact same approach to generate a list of the most mentioned keywords from tweets with negative sentiment;

For those of you that attended Web Summit, it will probably come as no surprise to see WiFi at the forefront of the negativity. While it did function throughout the event, many attendees found it unreliable and too slow, leading to many using their own data and hotspotting from their cell phones.

Mentions of queue, long, full, lines and stage are key indicators of just how upset people became while queueing for the opening ceremony at the main stage, only for many to be turned away because the venue became full.

The most mentioned speaker from negative tweets was Dave McClure (@davemcclure). The 500 Startups Founder found himself in the news after sharing his views on the US election result with an explosive on-stage outburst. It should be noted that just because Dave was the most mentioned speaker from all negative tweets, it doesn’t necessarily mean people were being negative towards him. In fact, many took to Twitter to support him;

Much of the negativity came from people simply quoting what Dave had said on stage, which naturally contained high levels of negative sentiment;

Which speakers were mentioned most?

Web Summit 2016 delivered a star-studded lineup of 663 speakers in total. What we wanted to know was, who was mentioned most on Twitter?

By combining mentions of names and Twitter handles, we generated and sorted a list of the top 25 most mentioned speakers.
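Combining names and handles amounts to counting a tweet for a speaker if it contains any of that speaker's aliases. A minimal Python sketch of that idea follows; the alias table uses the two handles mentioned in this post, while the function name `speaker_mentions` and the sample tweets are invented for illustration, and simple substring matching is an assumption.

```python
from collections import Counter

# Each speaker maps to the lowercase strings that count as a mention.
ALIASES = {
    "Gary Vaynerchuk": ["gary vaynerchuk", "@garyvee"],
    "Dave McClure": ["dave mcclure", "@davemcclure"],
}


def speaker_mentions(tweets, aliases, limit=25):
    """Count tweets mentioning each speaker by name or handle."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        for speaker, names in aliases.items():
            # Count a tweet once per speaker, even if it uses name and handle.
            if any(name in lowered for name in names):
                counts[speaker] += 1
    return counts.most_common(limit)


# Invented sample tweets.
sample = [
    "Loved the @garyvee talk",
    "Gary Vaynerchuk on stage now @garyvee",
    "@davemcclure speaking next",
]
print(speaker_mentions(sample, ALIASES))
# prints: [('Gary Vaynerchuk', 2), ('Dave McClure', 1)]
```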

Messrs Vaynerchuk and McClure once again appear prominently, with the former being the most mentioned speaker overall throughout the week. Joseph Gordon-Levitt, actor and Founder of HitRECord, came in second place, followed by Web Summit founder Paddy Cosgrave.

Which airline flew to Lisbon with the happiest customers?

With attendees visiting Lisbon from 166 countries, we thought it would be cool to see which airline brought in the happiest customers. By extracting mentions of the airlines that fly in to Lisbon, we could then analyze the sentiment of the tweets in which they were mentioned.

For most airlines, there simply wasn’t enough data available to analyze. However, we did find enough mentions of Ryanair and British Airways to be able to analyze and compare.

Here’s what we found;

Ryanair vs. British Airways

The graph below is split into three levels of sentiment – positive, neutral and negative. Ryanair is represented in blue and British Airways in red.

It’s really not hard to pick a winner here. British Airways were not only mentioned in more positive tweets, they were also mentioned in considerably fewer negative tweets.

Night Summit: which night saw the highest tweet volumes?

In total we found 593 mentions of night summit. The graph below shows tweet volumes for each day, and as you can see, November 7 was a clear winner in terms of volume.

...and which morning saw the most hangovers?!

Interestingly, we found a correlation between low tweet volumes (mentioning Night Summit, #nightsummit, etc.) and higher mentions of hangovers the following day!

59% of tweets mentioning hangover, hungover, resaca, etc, came on November 10 – the day after the lowest tweet volume day.

35% came on November 9 while just 6% came on November 8 – the day after the highest tweet volume day.

What do these stats tell us? Well, while we can’t be certain, we’re guessing that the more people partied, the less they tweeted. Probably a good idea 🙂


In today’s world, if someone wants to express their opinion on an event, brand, product, service, or anything really, they will more than likely do so on social media. There is a wealth of information published through user generated content that can be accessed in near real-time using Text Analysis and Text Mining solutions and techniques.

Wanna try it for yourself? Click the image below to sign up to our Text Analysis API with 1,000 free calls per day.

Text Analysis API - Sign up



In recent months, we have been bolstering our sentiment analysis capabilities, thanks to some fantastic research and work from our team of scientists and engineers.

Today we’re delighted to introduce you to our latest feature, Sentence-Level Sentiment Analysis.

New to Sentiment Analysis? No problem. Let’s quickly get you up to speed;

What is Sentiment Analysis?

Sentiment Analysis is used to detect positive or negative polarity in text. Also known as opinion mining, sentiment analysis is a feature of text analysis and natural language processing (NLP) research that is growing in popularity as a multitude of use-cases emerge. Here are a few examples of questions that sentiment analysis can help answer in various industries;

  • Brands – are people speaking positively or negatively when they mention my brand on social media?
  • Hospitality – what percentage of online reviews for my hotel/restaurant are positive/negative?
  • Finance – are there negative trends developing around my investments, partners or clients?
  • Politics – which candidate is receiving more positive media coverage in the past week?

We could go on and on with an endless list of examples but we’re sure you get the gist of it. Sentiment Analysis can help you understand the split in opinion from almost any body of text, website or document – an ideal way to uncover the true voice of the customer.

Types of Sentiment Analysis

Depending on your specific use-case and needs, we offer a range of sentiment analysis options;

Document Level Sentiment Analysis

Document level sentiment analysis looks at and analyzes a piece of text as a whole, providing an overall sentiment polarity for a body of text.

For example, this camera review;

Screen Shot 2016-11-22 at 17.56.07

receives the following result;

Screen Shot 2016-11-22 at 17.56.14

Want to test your own text or URLs? Check out our live demo.

Aspect-Based Sentiment Analysis (ABSA)

ABSA starts by locating sentences that relate to industry-specific aspects and then analyzes sentiment towards each individual aspect. For example, a hotel review may touch on comfort, staff, food, location, etc. ABSA can be used to uncover sentiment polarity for each aspect separately.

Here’s an example of results obtained from a hotel review we found online;

Screen Shot 2016-11-22 at 17.58.05

Note how each aspect is automatically extracted and then given a sentiment polarity score.

Click to learn more about Aspect-Based Sentiment Analysis.

Sentence-Level Sentiment Analysis (SLSA)

Our latest feature breaks down a body of text into sentences and analyzes each sentence individually, providing sentiment polarity for each.
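To illustrate the general idea of splitting text into sentences and labelling each one, here is a toy Python sketch. It is emphatically not how the AYLIEN feature works: the sentence splitter is a naive regular expression, and the tiny word lists and the function names `split_sentences` and `toy_polarity` are invented stand-ins for a real sentiment model.

```python
import re

# Toy word lists, standing in for a real sentiment model.
POSITIVE = {"great", "amazing", "friendly", "clean"}
NEGATIVE = {"slow", "dirty", "rude", "noisy"}


def split_sentences(text):
    """Naive split after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def toy_polarity(sentence):
    """Label a sentence by counting toy positive and negative words."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


review = "The staff were friendly. The room was dirty and noisy! We arrived on Tuesday."
for sentence in split_sentences(review):
    print(toy_polarity(sentence), "|", sentence)
```

Even with a stand-in classifier, the output shows why sentence-level results are useful: a single review can contain positive, negative and neutral sentences that an overall score would blur together.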

SLSA in action

Sentence-Level Sentiment Analysis is available in our Google Sheets Add-on and also through the ABSA endpoint in our Text Analysis API. Here’s a sample query to try with the Text Analysis API;

Now let’s take a look at it in action in the Sheets Add-on.

Analyze text

We imported some hotel reviews into Google Sheets and then ran an analysis using our Text Analysis Add-on. Below you will see the full review in column A, and then each sentence in a column of its own with a corresponding sentiment polarity (positive, negative or neutral), as well as a confidence score. This score reflects how confident we are that the sentiment is correct, with 1.0 representing complete confidence.

Screen Shot 2016-11-23 at 17.54.55

Analyze URLs

This new feature also enables you to analyze large volumes of URLs, as it first scrapes the main text content from each web page and then runs SLSA on each sentence individually.

In the GIF below, you can see how the content from a URL on Business Insider is first broken down into individual sentences and then assigned a positive, negative or neutral sentiment at sentence level, thus providing a granular insight into the sentiment of an article.


What’s the benefit of SLSA?

As we touched on earlier, sentiment analysis, in general, has a wide range of potential use-cases and benefits. However, Document-Level Sentiment Analysis can often miss out on uncovering granular details in text by only providing an overall sentiment score.

Sentence-Level Sentiment Analysis allows you to perform a more in-depth analysis of text by uncovering the positively, neutrally and negatively written sentences to find the root causes of the overall document-level polarity. It can assist you in locating instances of strong opinion in a body of text, providing greater insight into the true thoughts and feelings of the author.

SLSA can also be used to analyze and summarize a collection of online reviews by extracting all the individual sentences within them that are written with either positive or negative sentiment.

Ready to get started?

Our Text Analysis Add-on for Google Sheets has been developed to help people with little or no programming knowledge take advantage of our Text Analysis capabilities. If you are in any way familiar with Google Sheets or MS Excel you will be up and running in no time. We’ll even give you 1,000 free credits to play around with. Click here to download your Add-on or click the image below to get started for free with our Text Analysis API.


Text Analysis API - Sign up



The 2016 US Presidential election was one of (if not the) most controversial in the nation’s history. With the end prize being arguably the most powerful job in the world, the two candidates were always going to find themselves coming under intense media scrutiny. With more media outlets covering this election than any that have come before it, an increase in media attention and influence was a given.

But how much of an influence does the media really have on an election? Does journalistic bias sway voter opinion, or does voter opinion (such as poll results) generate journalistic bias? Does the old adage “all publicity is good publicity” ring true at election time?

“My sense is that what we have here is a feedback loop. Does media attention increase a candidate’s standing in the polls? Yes. Does a candidate’s standing in the polls increase media attention? Also yes.” -Jonathan Stray @jonathanstray

Thanks to an ever-increasing volume of media content flooding the web, paired with advances in natural language processing and text analysis capabilities, we are in a position to delve deeper into these questions than ever before, and by analyzing the final sixty days of the 2016 US Presidential election, that’s exactly what we set out to do.

So, where did we start?

We started by building a very simple search using our News API to scan thousands of monitored news sources for articles related to the election. These articles, 170,000 in total, were then indexed automatically using our text analysis capabilities in the News API.

This meant that key data points in those articles were identified and indexed to be used for further analysis:

  • Keywords
  • Entities
  • Concepts
  • Topics

With each of the articles or stories sourced comes granular metadata such as publication time, publication source, source location, journalist name and sentiment polarity of each article. Combined, these data points provided us with an opportunity to uncover and analyze trends in news stories relating to the two presidential candidates.

We started with a simple count of how many times each candidate was mentioned from our news sources in the sixty days leading up to election day, as well as the keywords that were mentioned most.


By extracting keywords from the news stories we sourced, we get a picture of the key players, topics, organizations and locations that were mentioned most. We generated the interactive chart below using the following steps;

  1. We called the News API using the query below.
  2. We called it again, but searched for “Trump NOT Clinton”
  3. Mentions of the two candidates naturally dominated in both sets of results so we removed them in order to get a better understanding of the keywords that were being used in articles written about them. We also removed some very obvious and/or repetitive words such as USA, America, White House, candidate, day, etc.

Here’s the query;

You can hover your cursor over each cluster to view details;

Most mentioned keywords in articles about Hillary Clinton

Straight away, bang in the middle of these keywords, we can see FBI and right beside it, emails.

Most mentioned keywords in articles about Donald Trump

Similar to Hillary, Trump’s main controversies appear most prominently in his keywords, with terms like women, video, sexual and assault all appearing prominently.

Most media mentions

If this election were decided by the number of times a candidate was mentioned in the media, who would win? We used the following search queries to total the number of mentions from all sources over the sixty days immediately prior to election day;

Note: We could also have performed this search with a single query, but we wanted to separate the candidates for further analysis, and in doing this, we removed overlapping stories with titles that mentioned both candidates.

Here’s what we found, visualized;

Who was mentioned more in the media? Total mentions volume:

It may come as no surprise that Trump was mentioned considerably more than Clinton during this period, but was he consistently more prominent in the news over these sixty days, or was there perhaps a major story that has skewed the overall results? By using the Time Series endpoint, we can graph the volume of stories over time.

We generated the following chart using results from the two previous queries;

How media mentions for both candidates fluctuated in the final 60 days

As you would expect, the volume of mentions for each candidate fluctuates throughout the sixty-day period, and to answer our previous question – yes, Donald Trump was consistently more prominent in terms of media mentions throughout this period. In fact, he was mentioned more than Hillary Clinton on 55 of the 60 days.
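A figure like "55 of the 60 days" comes from comparing the two daily time series day by day. As a minimal Python sketch (the function name `days_ahead` and the four-day sample counts are invented for illustration, not taken from our data):

```python
def days_ahead(series_a, series_b):
    """Count the days on which series_a is strictly higher than series_b."""
    return sum(1 for a, b in zip(series_a, series_b) if a > b)


# Invented daily mention counts for four days.
trump = [120, 90, 150, 80]
clinton = [100, 95, 110, 60]

print(days_ahead(trump, clinton), "of", len(trump), "days")
# prints: 3 of 4 days
```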

Let’s now take a look at some of the peak mention periods for each candidate to see if we can uncover the reasons for the spikes in media attention;

Donald Trump

Trump’s peak period of media attention was October 10-13, as indicated by the highest red peak in the graph above. This period represented the four highest individual days of mention volume and can be attributed to the scandal that arose from sexual assault accusations and a leaked tape showing Trump making controversial comments about groping women.

The second highest peak, October 17-20, coincides with a more positive period for Trump, as a combination of a strong final presidential debate and a growing email scandal surrounding Hillary Clinton increased his media spotlight.

Hillary Clinton

Excluding the sharp rise in mentions just before election day, Hillary’s highest volume days in terms of media mentions occurred from October 27-30 as news of the re-emergence of an FBI investigation surfaced.

So we’ve established the dates over the sixty days when each candidate was at their peak of media attention. Now we want to try to establish the sentiment polarity of the stories that were being written about each candidate throughout this period. In other words, we want to know whether stories were being written in a positive, negative or neutral way. To achieve this, we performed Sentiment Analysis.

Sentiment analysis

Sentiment Analysis is used to detect positive or negative polarity in text. Also known as opinion mining, sentiment analysis is a feature of text analysis and natural language processing (NLP) research that is growing in popularity as a multitude of use-cases emerge. Put simply, we perform Sentiment Analysis to uncover whether a piece of text is written in a positive, negative or neutral manner.

Note: The vast majority of news articles about the election will undoubtedly contain mentions of both Trump and Clinton. We therefore decided to only count stories with titles that mentioned just one candidate. We believe this significantly increases the likelihood that the article was written about that candidate. To achieve this, we generated search queries that included one candidate while excluding the other. The News API supports boolean operators, making such search queries possible.
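The effect of an "include one candidate, exclude the other" title search can be mimicked locally with a simple filter. The Python sketch below is only an illustration of the boolean logic, not a News API call: the function names and the sample titles are invented, and plain substring matching is an assumption.

```python
def title_query(include, exclude):
    """Build a boolean title search such as 'Trump NOT Clinton'."""
    return f"{include} NOT {exclude}"


def mentions_only(title, include, exclude):
    """True if the title mentions `include` but not `exclude`."""
    lowered = title.lower()
    return include.lower() in lowered and exclude.lower() not in lowered


# Invented sample titles.
titles = [
    "Trump rallies in Ohio",
    "Clinton and Trump meet in final debate",
    "Clinton campaign responds",
]

print(title_query("Trump", "Clinton"))
print([t for t in titles if mentions_only(t, "Trump", "Clinton")])
print([t for t in titles if mentions_only(t, "Clinton", "Trump")])
```

The title that mentions both candidates is dropped from both result sets, which is exactly the overlap the note above describes removing.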

First of all, we wanted to compare the overall sentiment of all stories with titles that mentioned just one candidate. Here are the two queries we used;

And here are the visualized results;

What am I seeing here? Blue represents articles written in a neutral manner, red in a negative manner and green in a positive manner. Again, you can hover over the graph to view more information.

What was the overall media sentiment towards Hillary Clinton?

What was the overall media sentiment towards Donald Trump?

Those of you who followed the election, to any degree, will probably not be surprised by these results. We don’t really need data to back up the claim that Trump ran the more controversial campaign and therefore generated more negative press.

Again, similar to how we previously graphed mention volumes over time, we also wanted to see how sentiment in the media fluctuated throughout this sixty day period. First we’ll look at Clinton’s mention volume and see if there is any correlation between mention volume and sentiment levels.

Hillary Clinton

How to read this graph: The top half (blue) represents fluctuations in the number of daily media mentions (in thousands) for Hillary Clinton. The bottom half represents fluctuations in the average sentiment polarity of the stories in which she was mentioned. Green = positive and red = negative.

You can hover your cursor over the data points to view more in-depth information.

Mentions Volume (top) vs. Sentiment (bottom) for Hillary Clinton

From looking at this graph, one thing becomes immediately clear: as volume increases, polarity decreases, and vice versa. What does this tell us? It tells us that perhaps Hillary was in the news for the wrong reasons too often – there were very few occasions when both volume and polarity increased simultaneously.

Hillary’s average sentiment remained positive for the majority of this period. However, that sharp dip into the red circa October 30 came just a week before election day. We must also point out the black line that cuts through the bottom half of the graph. This is a trend line representing average sentiment polarity and as you can see, it gets consistently closer to negative as election day approaches.
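For readers who want to reproduce a trend line like this one, a minimal sketch is an ordinary least-squares fit through the daily averages. The function name and the sample values below are assumptions for illustration; they are not our data, and the chart itself may have been fitted differently.

```python
def linear_trend(values):
    """Return (slope, intercept) of the least-squares line through
    the points (0, values[0]), (1, values[1]), ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a trend line")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up daily average polarity scores (positive values = positive sentiment).
daily_avg_polarity = [0.30, 0.25, 0.28, 0.15, 0.10, 0.05]
slope, intercept = linear_trend(daily_avg_polarity)
```

A negative slope corresponds to a line that drifts towards negative sentiment over time.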

Mentions Volume (top) vs. Sentiment (bottom) for Donald Trump

Trump’s graph paints a different picture altogether. There was not a single day when his average polarity entered into the positive (green). What’s interesting to note here, however, is how little his mention volumes affected his average polarity. While there are peaks and troughs, there were no major swings in either direction, particularly in comparison to those seen on Hillary’s graph.

These results are of course open to interpretation, but what is becoming evident is that perhaps negative stories in the media did more damage to Clinton’s campaign than they did to Trump’s. While Clinton’s average sentiment polarity remained consistently more positive, Trump’s didn’t appear to be as badly affected when controversial stories emerged. He was consistently controversial!

Trump’s lowest point, in terms of negative press, came just after the second presidential debate at the end of September. What came after this point is the crucial detail, however. Trump’s average polarity recovered and mostly improved for the remainder of the campaign. Perhaps critically, we see his highest and most positive averages of this period in the final 3 weeks leading up to election day.

Sentiment from sources

At the beginning of this post we mentioned the term media bias and questioned its effect on voter opinion. While we may not be able to prove this effect, we can certainly uncover any traces of bias from media content.

What we would like to uncover is whether certain sources (i.e. publications) write more or less favorably about either candidate.

To test this, we’ve analyzed the sentiment of articles written about both candidates from two publications: USA Today and Fox News.

USA Today


Similar to the overall sentiment (from all sources) displayed previously, the sentiment polarity of articles from USA Today shows consistently higher levels of negative sentiment towards Donald Trump. The larger-than-average percentage of neutral results indicates that USA Today took a more objective approach in its coverage of the election.

USA Today – Sentiment towards Hillary Clinton

USA Today – Sentiment towards Donald Trump

Fox News

Again, Trump dominates in relation to negative sentiment from Fox News. However, what’s interesting to note here is that Fox produced more than double the percentage of negative story titles about Hillary Clinton than USA Today did. We also found that, percentage-wise, they produced half as many positive stories about her. Also, 3.9% of Fox’s Trump coverage was positive, versus USA Today’s 2.5%.
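Percentages like these come down to counting stories per polarity label for each source. Here is a minimal sketch of that calculation, assuming each story has already been reduced to a (source, polarity) pair; the source names and labels below are placeholders, not our results.

```python
from collections import Counter, defaultdict

POLARITIES = ("positive", "neutral", "negative")


def polarity_shares(labels):
    """Percentage of stories per polarity label, rounded to one decimal place."""
    counts = Counter(labels)
    total = sum(counts[p] for p in POLARITIES)
    if total == 0:
        return {p: 0.0 for p in POLARITIES}
    return {p: round(100 * counts[p] / total, 1) for p in POLARITIES}


def shares_by_source(stories):
    """Group (source, polarity) pairs by source and compute shares for each."""
    by_source = defaultdict(list)
    for source, polarity in stories:
        by_source[source].append(polarity)
    return {source: polarity_shares(labels) for source, labels in by_source.items()}


# Placeholder data, for illustration only.
stories = [
    ("Source A", "negative"), ("Source A", "neutral"),
    ("Source A", "neutral"), ("Source A", "positive"),
    ("Source B", "negative"), ("Source B", "negative"),
]
breakdown = shares_by_source(stories)
```

Comparing two publications is then a matter of reading the same key, for example "negative", from each source’s dictionary.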

Fox News – Sentiment towards Hillary Clinton

Fox News – Sentiment towards Donald Trump

Media bias?

These figures raise the question: how are two major news publications writing about the exact same news with such varied levels of sentiment? It certainly highlights the potential influence that the media can have on voter opinion, especially when you consider how many people see each article/headline. The figures below represent social shares for a single news article:


Bear in mind, these figures don’t represent the number of people who saw the article; they represent the number of people who shared it. The actual number of people who saw this on their social feed will be a high multiple of these figures. In fact, we grabbed the average daily social shares, per story, and graphed them to compare:

Average social shares per story

Pretty even, and despite Trump being mentioned over twice as many times as Clinton during this sixty day period, he certainly didn’t outperform her when it came to social shares.


Since the 2016 US election was decided there has been a sharp focus on the role played by news and media outlets in influencing public opinion. While we’re not here to join the debate, we are here to show you how you can deep-dive into news content at scale to uncover some fascinating and useful insights that can help you source highly targeted and precise content, uncover trends and assist in decision making.

To start using our News API for free and query the world’s news content easily, click here.




However, we are always keen to speak with potential candidates for various roles here at AYLIEN. If you’re interested in joining the team, we would love to hear from you. Please email your CV to

At AYLIEN we are using recent advances in Artificial Intelligence to try to understand natural language. Part of what we do is building products such as our Text Analysis API and News API to help people extract meaning and insight from text. We are also a research lab, conducting research that we believe will make valuable contributions to the field of Artificial Intelligence, as well as driving further product development (see this post about a recent publication on aspect-based sentiment analysis by one of our research scientists for example).

We are excited to announce that we are currently accepting applications from students and researchers for funded PhD and Masters opportunities, as part of the Irish Research Council Employment Based Programme.

The Employment Based Programme (EBP) enables students to complete their PhD or Masters degree while working with us here at AYLIEN.

For students and researchers, we feel that this is a great opportunity to work in industry with a team of talented scientists and engineers, and with the resources and infrastructure to support your work.

About us

We’re an award-winning VC-backed text analysis company specialising in cutting-edge AI, deep learning and natural language processing research to offer developers and solution builders a package of APIs that bring intelligent analysis to a wide range of apps and processes, helping them make sense of large volumes of unstructured data and content.

With thousands of users worldwide and a growing customer base that includes great companies such as Sony, Complex Media, Getty Images, and McKinsey, we’re growing fast and enjoy working as part of a diverse and super smart team here at our office in Dublin, Ireland.

You can learn more about AYLIEN, who we are and what we do, by checking out our blog and two of our core offerings – our Text Analysis API and News API.

About the IRC Employment Based Programme

The Irish Research Council’s Employment Based Programme (EBP) is a unique national initiative, providing students with an opportunity to work in a co-educational environment involving a higher education institution and an employment partner.

The EBP provides a co-educational opportunity for researchers as they will be employed directly by AYLIEN, while also being full time students working on their research degree. One of the key benefits of such an arrangement is that you will be given a chance to see your academic outputs being transferred into a practical setting. This immersive aspect of the programme will enable you to work with some really bright minds who can help you generate research ideas and bring benefits to your work that may otherwise not have come to light under a traditional academic Masters or PhD route.


The Scholarship funding consists of €24,000pa towards salary and a maximum of €8,000pa for tuition, travel and equipment expenses. Depending on candidates’ level of seniority and expertise, the salary amount may be increased.

Our experience with the EBP

AYLIEN is proud to host and work with two successful programme awardees under the EBP, Sebastian Ruder and Peiman Barnaghi. Both Sebastian and Peiman have been working under the supervision of Dr. John Breslin, who is an AYLIEN advisor and lecturer at NUI Galway and the Insight Centre. We also have academic ties with University College Dublin (UCD) through Barry Smyth. Barry is a Full Professor and Digital Chair of Computer Science at UCD, and recently joined the team at AYLIEN as an advisor.

Back row, left to right: Peiman and Sebastian with Parsa Ghaffari, AYLIEN Founder & CEO

Sebastian Ruder

Throughout his research, Sebastian has developed language and domain-agnostic Deep Learning-based models for sentiment analysis and aspect-based sentiment analysis that have been published at conferences and are used in production. His main research focus is to develop efficient methods to enable models to learn from each other and to equip them with the capability to adapt to new domains and languages.

“The Employment Based Programme for me brings academia and industry together in the best possible way: It enables me to immerse myself and get to the bottom of hard problems; at the same time, I am able to collaborate with driven and inspiring individuals at AYLIEN. I find this immersion of research-oriented people like myself sitting next to people that are hands-on with diverse technical backgrounds very compelling. This stimulating and fast-paced working environment provides me with direction and focus for my research, while the ‘get stuff done’ mentality allows me to concentrate and accomplish meaningful things” – Sebastian Ruder, Research Scientist at AYLIEN

Here are some of Sebastian’s recent publications:

  • INSIGHT-1 at SemEval-2016 Task 4: Convolutional Neural Networks for Sentiment Classification and Quantification (arXiv)
  • INSIGHT-1 at SemEval-2016 Task 5: Deep Learning for Multilingual Aspect-based Sentiment Analysis (arXiv)
  • A Hierarchical Model of Reviews for Aspect-based Sentiment Analysis (arXiv)
  • Towards a continuous modeling of natural language domains (arXiv)

Peiman Barnaghi

Peiman’s research, in collaboration with the Insight Centre for Data Analytics, NUI Galway, focuses on Scalable Topic-level Sentiment Analysis on Streaming Feeds. His main focus is working on Twitter data for Sentiment Analysis using Machine Learning and Deep Learning methods for detecting polarity trends toward a topic on a large set of tweets and determining the degree of polarity.

Here are some of Peiman’s recent publications:

  • Opinion Mining and Sentiment Polarity on Twitter and Correlation between Events and Sentiment (link)
  • Text Analysis and Sentiment Polarity on FIFA World Cup 2014 Tweets (PDF)

You can read more about our experience with the EBP in the Irish Research Council’s Annual Report (pages 29 & 31)

Details & requirements

First and foremost, your thesis topic must be something you are passionate about. While prior experience with the topic is important, it is not crucial. We can work with you to establish a suitable topic that overlaps with both the supervisor’s general area of interest/research and our own research and product directions.

Suggested read: Survival Guide to a PhD by Andrej Karpathy

We are particularly interested in applicants with interests in the following areas (but are open to other suggestions):

  • Representation Learning
  • Domain Adaptation and Transfer Learning
  • Sentiment Analysis
  • Question Answering
  • Dialogue Systems
  • Entity and Relation Extraction
  • Topic Modeling
  • Document Classification
  • Taxonomy Inference
  • Document Summarization
  • Machine Translation

You have the option to complete a Masters (1 year, or 2 years if structured) or a PhD (3 years, or 4 years if structured) degree.

AYLIEN will co-fund your scholarship and provide you with professional guidance and mentoring throughout the programme. It is a prerequisite that you spend 50-70% of your time based on site with us and the remainder of the time at your higher educational institute (HEI).

The programme is open to students worldwide who hold a bachelor’s degree or higher. Ideally, you will be based within commuting distance of our office in Dublin City Centre.


It would be ideal if you have already identified or engaged with a potential supervisor at a university in Ireland. However, if not, we will help you with finding a suitable supervisor.

Important dates and deadlines

Please note: all times stated are Ireland time and are estimates based on last year’s programme. Full details will be released in December.

Call open: 6 December 2016

FAQ Deadline: 8 February 2017 (16:00)

Applicant Deadline: 15 February 2017 (16:00)

Supervisor, Employment Mentor and Referee Deadline: 22 February 2017 (16:00)

Research Office Endorsement Deadline: 1 March 2017 (16:00)

Outcome of Scheme: 26 May 2017

Scholarship Start Date: 1 October 2017

How to apply

To express your interest, please forward your CV and accompanying note with topic suggestions to



It takes less than 4 minutes for a piece of news to spread across Online, TV and Radio. In the office or on the road, Streem connects you with news monitoring from each source, mention alerts, keyword and industry tracking, and realtime audience analytics – delivered live to your Desktop or Mobile device within a minute of publication or broadcast.

Track competitors, produce reports, take news intelligence wherever you go with Australia’s fast, flexible and trusted news intelligence platform.



Media monitoring and aggregation apps are changing the way news is discovered and consumed. Competition within this space is increasing immensely as each new app and service promises to deliver a more personalized and streamlined experience than those that have come before. Ultimately, the winners and losers in this battle for market share will be decided by those who best understand the content they are sharing, and use this knowledge to provide cutting edge personalization, in-depth analytics and reader satisfaction.

The Challenges

Frustrated with both the accuracy and ROI they were seeing from an incumbent solution, Streem decided to evaluate their options in sourcing an alternative provider. They had three key points of consideration in evaluating and benchmarking various solutions: performance, cost and setup investment.

Streem’s customers require targeted, informed and flexible news alerts based on their individual interests. Therefore, what the team at Streem required was a fast, API-based service that allowed them to analyze large streams of content in as close to real-time as possible.

Dealing with vast amounts of content, Streem needed the ability to intelligently identify mentions of people, organizations, keywords and locations while categorizing content into standardized buckets. An automated workflow would allow them to scale their monitoring beyond human capabilities and deliver targeted news alerts as close to publication time as possible.

The Solution

Using the AYLIEN Text Analysis API, Streem have built an automated content analysis workflow which sources, tags and categorizes content by extracting what matters using entity/concept extraction and categorization capabilities.

Key points of information are extracted from each individual piece of content and then analyzed using Natural Language Processing (NLP) and Machine Learning techniques, providing Streem with a more accurate solution, faster time to value and an overall greater return on investment.

“The accuracy of Aylien was higher than competing providers, and the integration process was much simpler.” -Elgar Welch, Streem

Endpoints used

Streem are using our Entity and Concept Extraction endpoints to identify keywords, mentions of people, places and organizations along with any key information like monetary or percentage values in news articles and blogs, and our Classification endpoint to then categorize content into predefined buckets that suit their users’ tastes.

Let’s take a closer look at each endpoint and how Streem use them within their processes;

Entity Extraction

The Entity Extraction endpoint is used to extract named entities (people, organizations, products and locations) and values (URLs, emails, telephone numbers, currency amounts and percentages) mentioned in a body of text or web pages.

Here’s an example from our live demo. We entered the URL for an article from the Business Insider and received the following results;

As you can see from the results, mentioned entities are extracted and compiled. By extracting entities, Streem can easily understand what people, places, organizations, products, etc., are mentioned in the content they analyze, making it easy to provide relevant, targeted results to their users.

Concept Extraction

The Concept Extraction endpoint extracts named entities mentioned in a document, disambiguates and cross-links them to DBpedia and Linked Data entities, along with their semantic types (including DBpedia and schema.org types).


Classification

The Classification endpoint classifies, or categorizes, a piece of text according to your choice of taxonomy, either IPTC Subject Codes or IAB QAG.

We analyzed the URL of this TechCrunch article on Tesla Motors and received the following classification results;


Note the two columns labelled Score and Confident?. Because each result comes with a confidence score, users can define their own thresholds for which results to accept, decline or review.
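To make the idea of user-defined thresholds concrete, here is a minimal Python sketch of an accept/review/decline rule. The threshold values, the function name and the shape of the sample results are assumptions for illustration; they are not the API’s response format or recommended settings.

```python
def triage(score, accept_at=0.8, review_at=0.5):
    """Map a confidence score in [0, 1] to 'accept', 'review' or 'decline'."""
    if not 0.0 <= review_at <= accept_at <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= review_at <= accept_at <= 1")
    if score >= accept_at:
        return "accept"
    if score >= review_at:
        return "review"
    return "decline"


# Hypothetical classification results: a label plus a confidence score.
results = [
    {"label": "category A", "score": 0.93},
    {"label": "category B", "score": 0.61},
    {"label": "category C", "score": 0.22},
]
decisions = {r["label"]: triage(r["score"]) for r in results}
```

Raising `accept_at` trades coverage for precision: fewer results are accepted automatically and more are sent for manual review.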

The outcome


Streem now ingest and analyze tens of thousands of pieces of content on a daily basis in near real time. Their backend process, powered by the AYLIEN Text Analysis API, extracts key pieces of information on which their users can build tailored, flexible searches, alerts and informed monitoring capabilities around news events that matter to them.

Using AYLIEN’s state of the art solutions, the team at Streem now have more time to invest in their own product offering, delivering the best news aggregation service possible to their users.



In recent years, the monitoring of social media and news content has become a major aspect of business intelligence strategies, and with good reason too. Analyzing the voice of the customer provides extensive and meaningful insights on how to interpret and learn from consumer behavior. With over 2.3 billion active social media users out there, there’s a wealth of information being generated every second of every day, across a variety of social channels.

Direct access to consumer opinion was traditionally only available in closed and controlled environments like surveys and feedback groups, but today it’s accessible everywhere on the web; on social media, in reviews, in blogs and even news outlets. That is why it has been dubbed the modern day focus group.

Across social channels, a staggering 96% of people will talk about a brand without actually following its social media accounts. So while a company may be actively responding to direct messages and queries to its own channels, if they’re not paying attention to what is being said elsewhere, they’re missing out on a goldmine of useful and often freely available data and opportunity.

So why isn’t every company out there keeping track of every mention of their brand or products online? There’s simply too much information out there to keep track of manually. Not only does the average Internet user have 5.54 social media accounts, but the sheer volume of chatter and content generated among them is so vast that it would be impossible to keep up with.

When talking about a company or brand online, in some cases, the consumer will aim their message directly at the company Facebook page or Twitter handle, where it can be picked up and acted upon by a customer care rep. But what about the comments that aren’t written as a direct message?


Depending on the size of your industry, following, or customer base, there could be thousands, if not millions of similar messages scattered across the various social media channels and online review sites. It’s a mammoth challenge, but one that is being conquered and taken advantage of by savvy organizations out there. Enter Text Analysis and Natural Language Processing (NLP).

Recent advancements in Text Analysis and NLP are enabling companies to collect, mine and analyze user generated content and conversations, providing a level of insight and analysis at a scale that was previously not possible. If you’re not tapping into the wealth of data out there and monitoring each and every mention of your company, brand, product line or even competitors online, you’re missing out on a number of key business opportunities:

  • Crisis prevention and damage limitation
  • Research and product development
  • Customer support and retention

Crisis prevention and damage limitation

While social media is, for the most part, a public forum, many interactions between a customer and a company online will not be seen by the greater public. In many cases, direct messages to companies on social are handled swiftly and taken to private messaging, out of the public eye, where they can then be handled via email or phone call. However, it is also vital to track, compile and analyze each non-direct interaction and mention of your brand in order to spot any potentially dangerous trends that may be developing. You may, for example, begin to notice a sharp increase in the number of customers complaining about a specific aspect of your product.

What begins on social media as a customer complaint or grievance can very quickly snowball into something far more serious and wind up in mainstream news media, which is truly the last place you want to see your brand portrayed in a negative light, as its reach and potential virality know no bounds.

Let’s look at Samsung’s recent exploding battery crisis. On August 24, a report of an exploding Samsung Note appeared on Chinese social network Baidu. While it received some attention, one-off stories like this are often dismissed as exactly that, a one-off.

One week later, however, a second and similar report emerged from Korea. These reports were suddenly no longer confined to social channels, as mainstream media quickly picked up on a developing story surrounding one of the world’s leading tech companies.



While Samsung were left with no choice but to recall and cease production of the Note 7, this is a prime example of how a crisis can begin with a couple of posts on social media channels and ultimately end up as one of the biggest crises the company has ever had to face.

Although the problem lay in the production of the Note 7, what is interesting to observe is the period of 6-7 days after the initial report of an exploding phone in China. Looking at news sources from this period, there appeared to be no increase in negative publicity for Samsung. In fact, the number of stories about Samsung decreased in the days following the post on Baidu.


The volume of stories written about Samsung trebled almost overnight after reports of a second Note 7 explosion in Korea

As soon as that second explosion was reported in Korea, however, the number of stories being written about Samsung trebled almost overnight.
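A jump like this is easy to flag automatically once you have daily story counts. Below is a minimal Python sketch that marks any day whose volume is at least three times the previous day’s; the function name, the factor of 3 and the sample counts are assumptions for illustration, not the figures behind the chart.

```python
def find_spikes(daily_counts, factor=3.0):
    """Return the indexes of days whose count is at least `factor` times
    the previous day's count."""
    spikes = []
    for i in range(1, len(daily_counts)):
        previous, current = daily_counts[i - 1], daily_counts[i]
        # Skip days that follow a zero count to avoid flagging everything.
        if previous > 0 and current >= factor * previous:
            spikes.append(i)
    return spikes


# Made-up daily story counts, for illustration only.
story_counts = [40, 35, 38, 120, 130]
spike_days = find_spikes(story_counts)
```

In practice you would probably compare against a rolling average rather than a single previous day, so that one quiet day does not trigger a false alarm.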

The successful launch of a product or campaign relies heavily on the initial consumer reaction. Early negative reviews can be difficult to recover from, but by monitoring consumer reactions you are giving yourself a golden opportunity to spot problems early, resolve them in a timely manner and prevent any initial negativity from snowballing.

You can quickly get a picture of what your customers are talking about, what keywords and topics appear most frequently in their commentary and whether the overall sentiment is positive or negative. This doesn’t stop with your own customers, however. You can learn just as much by monitoring mentions of your competitors and their products online.

Product research & development

From that initial lightbulb moment to the day of product launch, many opinions will be voiced about the direction this process should take. While many will have their say and provide their input, decisions that are made based on solid research data will give the product its best chance of success, both on launch day and beyond. It can be crucial, particularly in the early stages of the process, to identify trends and perform audience segmentation to help define the scope and direction it will take.

Initial research focusing on the consumer need that is to be addressed with the new product or service can focus on a number of key areas, to help pinpoint that market niche. By monitoring the voice of the customer and their reactions to existing competitors, you can quickly develop an understanding of what they are doing well, and what they (or you!) could improve on. By monitoring their comments at scale, it’s possible to spot certain product or service aspects that are pain-points for your potential customers and react to those business insights.

A great example of this strategy in action was when L’Oreal used social listening to track the challenges people faced when dyeing their hair, what kind of tools they were using and the color effects they desired. Not only did they uncover the trends consumers were following most, they also gained a solid understanding of the issues potential customers faced – which they could solve with help from the company’s R&D department. The resulting launch of their Feria Wild Ombre product proved to be hugely successful and helped L’Oreal widen their market as it appealed to consumers who had previously not been hair-dye users.


L’Oreal’s targeted social listening campaign proved highly successful with the launch of their Feria Wild Ombre range

Monitoring for product development doesn’t stop on launch day, however. Consumer reactions and opinions going forward are just as important as they were pre-launch. You may be their hero today, but things can very quickly take a turn for the worse, so it is important to continuously track these opinions, learn from them, and ensure your product evolves accordingly.

Customer support and retention

People love sharing on social media. Whether we’ve just bought a shiny new car, adopted a pet or passed an exam, the chances are that many of us will share our joyous news online. However, should our new car suddenly break down, our resulting online complaints are likely to be seen by twice as many people as our initial positive posting. It’s a harsh reality that companies with an online presence simply have to accept. However, how they choose to monitor and manage such instances can be the crucial differentiator between keeping a customer or losing them to a competitor.



Clearly, it’s essential to keep on top of negative mentions online and provide a quick solution. We say quick because social media complainers aren’t willing to wait for 1-2 business days to get a reply. In fact, 53% of people expect a brand to respond to their Tweet in less than an hour. If you’re not listening to your customers, your competitor soon will.

It’s not all about fighting fires and resolving customer complaints on social media, however. A report from the Institute of Customer Service showed that 39% of consumers surveyed actively provide feedback to organizations online, while 31% make pre-sales enquiries. These are positive actions that companies can not only profit from, but also analyze in the same way they would negative actions. By looking at and analyzing every angle, a 360-degree view of consumer perception can be obtained, which enables a company to spot trends and establish their strengths and weaknesses.


Bringing it all together, we hope that we’ve provided you with some food for thought in relation to how important social media and news monitoring can be to the (initial and ongoing) success of an organization. From idea generation, to tracking your competitors and pleasing/retaining your customers, it can help you to make sense of large amounts of unstructured data online and uncover insights and trends that can boost decision making, influence the evolution of product development and help minimize the risk of damaging press emerging.





Risk analysis is a fundamental aspect of any business strategy. It involves identifying and assessing events and occurrences that could negatively affect an organization. The goal of risk analysis is to allow organizations to uncover and examine any risks that may affect their operation as a whole, or individual products, campaigns, projects and future plans.

How does media monitoring help with risk analysis?

Media monitoring is a major aspect of a modern risk management and analysis strategy. The sheer amount of content created online through blogs, news outlets and social media channels provides organizations with an easily accessible and publicly available source of public opinion and reactions to worldwide events as they unfold.

Put simply, media monitoring enables you to keep track of the general reaction to specific events, whether controlled or not, that might impact an organization. There is a wealth of freely available information out there, but it can be challenging to filter through the noise to find what really matters.

What are the challenges associated with media monitoring?

The main challenge that organizations face when trying to monitor content on the web is the sheer and ever-expanding volume of content being uploaded to the web each and every day. Each company will have specific reasons for mining and monitoring this content and each will have a unique variety of aspects that they are interested in looking at.

What’s required is a solution with highly specific and flexible search options that can source highly relevant content as it’s published and, from that content, provide intelligent insights, trends and actionable data.

How is media monitoring being used?

The ability to collect, index and analyze content at scale provides an efficient way to harness publicly available content in order to unearth key business insights and trends that could potentially have adverse effects on a business.

Accuracy and timeliness of information are crucial aspects of this process and with this in mind, we’ll show you some examples of how media monitoring is being used in risk analysis and how our News API can help you keep your finger on the pulse and stay aware of important developments as they happen;

1. Monitoring public opinion and identifying threats

Public opinion towards an organization or their employees, products and brands can often be the making (or breaking) of them. Reputations of organizations among the public can often deteriorate over time and it can be crucial to try and spot such trends before they become a serious issue. To achieve this, the continuous monitoring of specific media searches is required.

Our News API supports Boolean Search which allows you to build simple search queries using standard boolean operators. This means that you can build either general or more targeted queries based on your interests and requirements. As an example, let’s search the following;

Articles that are in English, contain Samsung in the title, have a negative sentiment and were published between 60 days ago and now.

Why negative sentiment? If I’m assessing risk around a certain company, I’ll certainly want to know what they have been up to in recent times, and in particular, any bad press they have been subject to. As we know, bad press usually equates to a negative public perception.
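To make the search above more concrete, here is a minimal sketch of how it might be expressed as query parameters. The endpoint URL and the parameter names are assumptions modelled on the News API’s query style, and we’ve assumed the sentiment filter applies to the title; check the current documentation before relying on any of them. The snippet only builds the request URL and does not call the API, which would also require your credentials.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the current News API documentation.
STORIES_ENDPOINT = "https://api.newsapi.aylien.com/api/v1/stories"


def build_story_query(title, language="en", polarity="negative", days_back=60):
    """Build query parameters for the search described in the text.

    All parameter names below are assumptions for illustration.
    """
    return {
        "title": title,                            # keyword that must appear in the title
        "language[]": language,                    # article language
        "sentiment.title.polarity": polarity,      # assumed: sentiment of the title
        "published_at.start": f"NOW-{days_back}DAYS",
        "published_at.end": "NOW",
    }


params = build_story_query("Samsung")
url = f"{STORIES_ENDPOINT}?{urlencode(params)}"
print(url)
```

Keeping the parameters in a small function makes it easy to rerun the same search for a different company or time window.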

News API results are returned in JSON format, making it easy for you to use the data as you please. Here are a few examples of visualizations generated from the above search:

Sentiment Polarity

The red line in the graph below represents the levels of negative polarity towards Samsung over the past 60 days.

Screen Shot 2016-10-05 at 12.22.10

It’s clear from this chart that Samsung received some pretty bad press in the month of September, judging by the sharp increase in negative polarity, versus August. Now, of course, we want to know why and whether the root cause of this negativity is going to be a concern or potential risk.

Let’s generate a word cloud from our search results to uncover the most commonly used terms from the stories published during this time period.

Screen Shot 2016-10-05 at 12.22.23

It doesn’t take long to spot some potential causes for concern with battery, batteries, recall, fire, safety and problem all evident.
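A word cloud is really just a picture of term frequencies, so the same signal can be pulled out with a simple count. The sketch below assumes the story titles have already been extracted from the JSON results; the sample titles and the stop-word list are invented purely for illustration.

```python
import re
from collections import Counter

# Small, hand-picked stop-word list (an assumption; a real one would be longer).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "on", "for", "and", "after", "over"}


def top_terms(texts, n=5):
    """Return the n most frequent non-stop-word terms across the given texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(n)


# Invented sample titles, not real search results.
sample_titles = [
    "Samsung announces recall after battery fire reports",
    "Battery safety problem prompts Samsung recall",
    "Airline warns passengers over Samsung battery fire risk",
]

for term, count in top_terms(sample_titles, n=4):
    print(term, count)
```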

Samsung’s recent battery issues were well documented in the media, but what may have slipped under the radar for many was the involvement of the Federal Aviation Administration in this event, as you can see in our word cloud above. By modifying our original search to include Federal Aviation Administration, we can dive deeper into their involvement;

Screen Shot 2016-10-05 at 12.35.25

This is a prime example of how targeted searches using the News API can help unearth unforeseen threats or concerns by monitoring public opinion around the entities and events that matter most to your organization.

2. Monitoring competitor and industry activity

Monitoring and analyzing competitor activity can equip you with a wealth of information and provide hints of strategic movements that can provide you with a competitive advantage in your quest for market dominance.

Naturally, competitor activity generates a potential threat to the success of your organization. Just look at Apple and Samsung, for example, where it seems that each action either company takes is carefully scrutinized, analyzed and compared to the other. Samsung were certainly quick to react to Apple’s ‘bendy’ iPhone 6!


While it was hard to miss stories about Bendgate in the news, not all stories receive such mainstream attention and could easily be missed if you’re not looking at the relevant channels. By monitoring for mentions of specific organizations, brands, products, people and so on, you can be alerted as soon as a matching article is published. Not only does this make it easy to keep track of your direct competition, it can also help you keep abreast of general industry goings-on and any murmurs of potential new competitors or industry concerns.

3. Crisis management

With so many factors and variables at play and infinite external influences, no industry is immune to a potential crisis. How individual organizations react to such crises can ultimately decide whether they survive to see the end of it or not. Let’s take a look at one industry in particular that has been coming under recent scrutiny for including unsustainable or environmentally-unfriendly ingredients in many of its products – the cosmetics industry.

One such ingredient is palm oil, a substance that has been linked to major issues such as deforestation, habitat degradation, climate change and animal cruelty in the countries where it is produced. As a cosmetics manufacturer who uses palm oil, as many do, the intensifying spotlight on this substance is bound to be of considerable concern.

By monitoring mentions of palm oil in the news, these manufacturers can keep up to date with the latest developments, as they happen, putting them in a strong position to react as soon as required. Below is one such example of a story that was returned while monitoring mentions of ‘Palm Oil’ in African media;

Screen Shot 2016-10-06 at 16.58.40

Further analysis can show trends in the likes of social media shares or article length breakdown, either of which could signify a growing emphasis on Palm Oil among the public and media. Looking again at the image above, you can see the number of social shares for this particular story just beneath the title.

4. Trend analysis

With access to a world of news content and intelligent insights comes the opportunity for countless analyses and comparisons of trends. As an example, let’s search by category to see if there are any noticeable differences or trends emerging from news stories in two separate countries.

The category we’ll look at is Electric Cars and the two source countries being analyzed are the UK and Australia. Below we have visual representations of the sentiment levels returned for each search, from the past 30 days.


As you can see, the vast majority of stories have been written in a neutral manner. What we’re interested in, however, is the significant difference in the levels of negative sentiment between the two countries around our chosen category.

Our results show that the Australian media are perhaps not too keen on the idea of Electric Cars, or perhaps there has been some negative publicity around the topic in recent times. On further inspection, we found that the uptake of electric cars has been extremely low in Australia compared to other countries, with manufacturers citing a lack of government assistance for this.
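The comparison above comes down to the share of stories at each sentiment polarity, per source country. Here is a minimal sketch of that calculation. The field names `country` and `polarity` are hypothetical, and the sample records are invented for illustration rather than taken from our results.

```python
from collections import Counter, defaultdict


def polarity_shares(stories):
    """Return {country: {polarity: share of that country's stories}}."""
    counts = defaultdict(Counter)
    for story in stories:
        counts[story["country"]][story["polarity"]] += 1

    shares = {}
    for country, by_polarity in counts.items():
        total = sum(by_polarity.values())
        shares[country] = {p: n / total for p, n in by_polarity.items()}
    return shares


# Invented sample data: 10 stories per country.
sample_stories = (
    [{"country": "GB", "polarity": "neutral"}] * 8
    + [{"country": "GB", "polarity": "positive"}]
    + [{"country": "GB", "polarity": "negative"}]
    + [{"country": "AU", "polarity": "neutral"}] * 6
    + [{"country": "AU", "polarity": "positive"}]
    + [{"country": "AU", "polarity": "negative"}] * 3
)

shares = polarity_shares(sample_stories)
for country in sorted(shares):
    print(country, f"{shares[country]['negative']:.0%} negative")
```

Working with shares rather than raw counts keeps the comparison fair when one country produces far more stories than the other.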

While this may seem like a straightforward comparison, when applied at scale it is this level of analysis that enables risk assessors to spot trends and ultimately improve their decision-making process. By analyzing multiple metrics side by side, interesting trends can emerge. Looking at the comparison below, again between the UK and Australia, it is evident that even in the past two months, the volume of stories relating to electric cars is increasing in Australia, but general interest still lags considerably behind the UK.

UK AUS stories

Business owners and project managers understand where, and from whom, potential threats can come, and therefore have a very defined variety of entities and elements that need to be monitored. Projects that are based in, or focused on, different geographic locations will often pose their own unique threats and challenges. A multi-region project, for example, will require multiple risk assessments as part of the overall risk analysis process.


With each project comes a new set of challenges and potential threats. The more an organization can learn about these threats the greater chance they have of reducing the level of risk involved in making certain decisions or strategic moves. Media monitoring provides risk assessors with a wealth of publicly available information, from which intelligent insights, trends and analyses can be drawn.

However specific or niche your own search requirements are, with 24/7 worldwide news monitoring backed up by cutting edge Machine Learning and Natural Language Processing technology, our News API can help you with your risk analysis needs.

Ready to get started? Try the News API free for 14 days and with our Getting Started with the News API guides below, you’ll be up and running in no time.

Getting Started with the News API Part 1: Search

Getting Started with the News API Part 2: Insights




Interest in Artificial Intelligence and Machine Learning has seen a significant boom in recent times as the techniques and technologies behind them have quickly emerged from the research labs to the mainstream and into our everyday lives.

AI is helping organizations to automate routine operational tasks that would otherwise need to be performed by employees, often at a steep time and financial cost. By automating high-volume tasks, the need for human input in many areas is being reduced, creating more efficient and cost-effective processes.

Today we’re going to take a look at why we are seeing this rapid increase in interest in the areas of AI and Machine Learning, the key trends emerging, how various industries are leveraging them, and the challenges that lie ahead in a fascinating area with seemingly unlimited potential.

What are the main reasons behind the boom?

The mathematical approaches underlying Machine Learning are not new. In fact, many date back as far as the early 1800s, which raises the question: why are we only now seeing this boom in Machine Learning and AI? The techniques behind these advancements generally require a considerable amount of both data and computational power, both of which continue to become more accessible and affordable to even the smallest of organizations. Significant recent improvements in computational capacities and an ever-expanding glut of accessible data are helping to bring AI and Machine Learning from futuristic fiction to the everyday norm. So much of what we do and touch on a daily basis, whether in work, at home, or at play, contains some form of ML or AI element, even if we are not always aware of it.

We’re seeing this boom now because technological advancements have made it possible. Not only that, organizations are seeing clear and quantifiable evidence that these advancements can help them overcome a variety of operational problems, streamline their processes and enable better decision-making.

Screen Shot 2016-09-28 at 18.36.46

Key trends in Machine Learning & AI

Increased volume of data requires more powerful methods of analysis

Analyzing the sheer volume of data that is being generated on a daily basis creates a unique challenge that requires sophisticated and cutting-edge research to help solve. As the volume and variety of data sources continue to expand, so too does the need to develop new methods of analysis, with research focussing on the development of new algorithms and ‘tricks’ to improve performance and enable greater levels of analysis.

Affordability and accessibility in the cloud

As the level of accessible data continues to grow, and the cost of storing and maintaining it continues to drop, more and more Machine Learning solutions hosting pre-trained models-as-a-service are making it easier and more affordable for organizations to take advantage. Without necessarily needing to hire Machine Learning experts, even the smallest of companies are now just an API call away from retrieving powerful and actionable insights from their data. From a development point of view, this is enabling the quick movement of application prototypes into production, which is spurring the growth of new apps and startups that are now entering and disrupting most markets and industries out there.

Every company is becoming a data company

Regardless of what an organization does or what industry they belong to, data is helping to drive value. Some will be using it to spot trends in performance or markets to help predict and prepare for future outcomes, while others will be using it to personalize their inventory, creating a better user experience and promoting an increased level of engagement with their customers.

Traditionally, organizational decisions have been made based on numerical and/or structured data, as access to relevant unstructured data was either unavailable, or simply unattainable. With the explosion of big data in recent times and the improvement in Machine Learning capabilities, huge amounts of unstructured data can now be aggregated and analyzed, enabling a deeper level of insight and analysis which leads to more informed decision-making.

How these trends are being leveraged

Machine Learning techniques are being applied in a wide range of applications to help solve a number of fascinating problems.

Contextualized data for a personalized UX

Today’s ever-connected consumer offers a myriad of opportunities to companies and providers who are willing to go that extra step in providing a personalized user experience. Contextualized experience goes beyond simple personalization, such as knowing where your user is or what they are doing at a certain point in time. Such experience has become a basic expectation – my phone knows my location, my smartwatch knows that I’m running, etc.

There is now a greater expectation among users for a deeper, almost predictive experience with their applications, and Machine Learning is certainly assisting in the quest to meet these expectations. An abundance of available data enables improved features and better machine learning models to be created, generating higher levels of performance and predictability, which ultimately leads to an improved user experience.

“Via Machine Learning, a person’s future actions can be predicted at the individual level with a high degree of confidence. No longer are you viewed as a member of a cohort. Now you are known individually by a computer so that you may be targeted surgically” – John Foreman

Internet of Things

As the rapid increase in devices and applications connected to the Internet of Things continues, the sheer volume of data being generated will continue to grow at an incredible rate. It’s simply not possible for us mere mortals to analyze and understand such quantities of data manually. Machine Learning is helping to aggregate all of this data from countless sources and touchpoints to deliver powerful insights, spot actionable trends and uncover user behavior patterns.

Software and hardware innovations

We are seeing the implementation of AI and Machine Learning capabilities in both software and hardware across pretty much every industry out there. For example;

Retail buyers are being fed live inventory updates, in many cases enabling the auto-replenishment of stock as historical data is used to predict future stock-level requirements and sales patterns.

Healthcare providers are receiving live updates from patients connected to a variety of devices and again, through Machine Learning of historical data, are predicting potential issues and making key decisions that are helping save lives.

Financial service providers are pinpointing potential instances of fraud, evaluating credit worthiness of applicants, generating sales and marketing campaigns and performing risk analysis, all with the help of Machine Learning and AI-powered software and interfaces.

Every application will soon be an intelligent application

AI and machine learning capabilities are being included in more and more platforms and software, enabling business and IT professionals to take advantage of them, even if they don’t quite know how they work. Similar to the way many of us drive a car without fully understanding what’s going on under the hood, professionals from all walks of life, regardless of their level of education or technical prowess, are more and more beginning to use applications on a daily basis that appear simple and user-friendly on the surface, but are powered in many ways by ML and AI.

Challenges and opportunities going forward

Machine Learning has very quickly progressed from research to mainstream and is helping drive a new era of innovation that still has a long and perhaps uncapped future ahead of it. Companies in today’s digital landscape need to consider how Machine Learning can serve them in creating a competitive advantage within their respective industries.

Despite the significant advancements made in recent years, we are still looking at an industry in its infancy. Here are some of the main challenges and opportunities for AI and Machine Learning going forward;

Security concerns

With such an increase in collected data and connectivity among devices and applications comes the risk of data leaks and security breaches which may lead to personal information finding its way into the wrong hands and applications facing

Access to resources and data

Previously, only a few big companies had access to the quality and size of datasets required to train production-level AI. However, we’re now seeing startups and even individual researchers coming up with clever ways of collecting training data in cost effective ways. For example, researchers are now using GTA as an environment for training self-driving cars.

The same applies to research. Previously, it was much more difficult for a startup or an individual researcher to get access to ‘Google-level’ tools and resources for conducting research within AI. Now, with the proliferation of open-source frameworks and libraries such as Torch, Theano, TensorFlow, etc., and the openness around publications and sharing the results of research, we’re seeing a more level playing field in AI research in both industry and academia.

Hype vs. reality

There is still somewhat of a disconnect between the potential impact advancements in AI will have on our world and how it’s actually being utilized in everyday life. In some cases technology providers, the media and PR teams are seen to be pushing the boundaries of the extent of what’s possible within AI and Machine Learning, and speculating what’s next. In some cases this can lead to frustration for the users of these technologies (consumer or enterprise) when these promises go unfulfilled, and that may cause a backlash at the expense of the entire AI industry.





“Complex is a New York-based media platform for youth culture which was founded as a bi-monthly magazine by fashion designer Marc Ecko. Complex reports on trends in style, pop culture, music, sports and sneakers with a focus on niche cultures such as streetwear, sneaker culture, hip-hop, and graphic art. Complex currently reaches over 120 million unique users per month across its owned and operated and partner sites, socials and YouTube channels”



Digital ad sales is big business. How big? Well, we’re about to see digital ad spending in the US surpass TV for the first time, representing 37% of all US ad spending going into 2017. That big! For publishers like Complex, engaged visitors mean greater exposure to ads and higher click rates, which in turn generate a sustainable revenue stream across their publishing network.

A large, active and engaged target audience is exactly what advertisers like to see. As a result, publishers focus their efforts on providing unique, engaging and relevant content to their readers, which helps keep them active on their site, promotes future return visits and increases brand recognition.

The Challenge

The Complex Media Network welcomes more than 120 million unique visitors each month. Ultimately, the goal is to serve each and every individual user with relevant ads based on the content they are viewing and/or based on a number of other factors such as geographic location, demographic profile, device type, time of day and many more.

Complex offer varied ad-targeting features across their network, enabling ad partners to target readers based on the previously mentioned factors and triggers. However, until recently, they had little to offer partners on contextually placed advertisements.

A contextual advertising system analyzes website text for keywords and returns advertisements to the webpage based on those keywords. For example, if a visitor is reading an article about fashion, they can be targeted with ads for related products or services, such as clothes and sneakers.
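As a rough illustration of the keyword-matching idea described above, here is a minimal sketch that scores each ad by how many of its keywords appear in an article. The ad inventory and the article text are invented, and a production system would use far richer signals than exact word overlap.

```python
import re

# Hypothetical ad inventory: ad name -> keywords it should be shown against.
AD_INVENTORY = {
    "sneaker ad": {"sneakers", "streetwear", "fashion"},
    "concert tickets ad": {"music", "album", "tour"},
}


def match_ads(article_text, inventory):
    """Return ads ordered by keyword overlap with the article, best first."""
    words = set(re.findall(r"[a-z]+", article_text.lower()))
    scored = [(len(words & keywords), ad) for ad, keywords in inventory.items()]
    # Keep only ads with at least one matching keyword.
    return [ad for score, ad in sorted(scored, reverse=True) if score > 0]


article = "The latest fashion drops: sneakers and streetwear to look out for"
print(match_ads(article, AD_INVENTORY))
```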

The need for digital ads to become contextually relevant is greater than ever before as web users move away from engaging with online advertisements. These stats and figures certainly confirm the challenge faced by online publishers and emphasize the need for more targeted ad campaigns;

  • In a study, only 2.8% of participants thought that ads on websites were relevant to them.
  • A January 2014 study found that 18 to 34 year olds were far more likely to ignore online ads, such as banners and those on social media and search engines, than they were traditional TV, radio and newspaper ads. This is a huge chunk of the Complex target market.
  • The average clickthrough rate of display ads across all formats and placements is just 0.06%
  • Users who are retargeted are 70% more likely to convert.

In particular, Complex were looking to automate video insertion in articles to help scale views across their sites. They already had an automatic video insertion widget in place. However, the information being fed into it was the result of a manual process that ultimately proved to be unreliable. They required an intelligent automation of this process.

The Solution

With up to 70,000 web pages being analyzed on a daily basis, Complex needed to automate their processes by automatically categorizing and tagging articles based on topics, keywords and mentions of specific entities. This data could then be fed into their video insertion widget.

Complex display content-relevant videos towards the end of many of their articles. These videos contain pre-roll ads that are targeted specifically to the reader. For example, I was reading an article about Frank Ocean and at the bottom of this article I was offered a video related to the singer.

If I’m reading a story about a certain person or topic, the chances are that I’ll be interested in viewing a related video. When I clicked play I was fed a pre-roll ad about mortgages from a bank here in Ireland. Yep, I’m currently house-hunting so this targeted ad was bang on the money!

Screen Shot 2016-09-21 at 13.38.38

Complex display content-relevant videos towards the end of many of their articles

Endpoints used

Complex are using our Concept Extraction endpoint to extract and disambiguate mentions of celebrities, companies, brands and locations from online content and our Classification endpoint to then categorize this content for indexing among their various publications and channels. Let’s take a closer look at each endpoint and how Complex use them to improve their processes;

Concept Extraction

The Concept Extraction endpoint extracts named entities mentioned in a document, disambiguates and cross-links them to DBpedia and Linked Data entities, along with their semantic types (including DBpedia and schema.org types). By extracting concepts, Complex could easily understand what people, places, organizations and brands are mentioned in the articles they publish and were then able to produce a rich tagging system to assist with their ad targeting.

Here’s an example from our live demo. We entered the URL for an article about rappers Lil Wayne, Birdman and Tyga and received the following results;

Screen Shot 2016-09-20 at 18.22.43


Originally, Complex were using keywords in their video insertion widget that were manually entered by editors via their CMS. However, these proved to be unreliable and insufficient, so they decided to automatically extract them using our Classification endpoint.

The Classification endpoint classifies, or categorizes, a piece of text according to your choice of taxonomy, either IPTC Subject Codes or IAB QAG. Complex classify their articles according to IAB QAG.

Using the same article, we analyzed the URL and received the following results;

Screen Shot 2016-09-20 at 18.29.42

The first category returned was Celebrity Fan/Gossip which fits the bill perfectly in this instance. Note how confidence in the other categories gradually declines. While still somewhat relevant, we declare our lack of confidence in them. By providing confidence scores, users can define their own parameters in terms of what confidence levels to accept, decline or review.
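To show how confidence scores can drive that accept, decline or review decision, here is a minimal sketch. The threshold values are arbitrary assumptions, the scores are invented, and apart from Celebrity Fan/Gossip the category labels are placeholders rather than the actual results shown above.

```python
def triage(categories, accept_at=0.7, review_at=0.3):
    """Sort (label, confidence) pairs into accept / review / decline buckets.

    The default thresholds are illustrative assumptions; tune them to your
    own tolerance for false positives.
    """
    decisions = {"accept": [], "review": [], "decline": []}
    for label, confidence in categories:
        if confidence >= accept_at:
            decisions["accept"].append(label)
        elif confidence >= review_at:
            decisions["review"].append(label)
        else:
            decisions["decline"].append(label)
    return decisions


# Invented scores for illustration only.
sample_categories = [
    ("Celebrity Fan/Gossip", 0.85),
    ("Music", 0.45),
    ("Television", 0.10),
]

decisions = triage(sample_categories)
for bucket, labels in decisions.items():
    print(bucket, labels)
```

Raising `accept_at` means fewer categories are applied automatically and more are sent for manual review, which may suit publishers who would rather miss a tag than apply a wrong one.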

“Since working with the AYLIEN Text Analysis API, we have seen great improvement in CTRs on our video widget, which translates to preroll revenue.” – Ronit Shaham, Complex

The outcome

Understanding content at this depth has enabled Complex to place pinpoint-accurate videos and creative ads throughout their content in a semantic, less intrusive way. The concepts, categories and data points extracted are used to organize this content while being fed into an intelligent contextual ad recommendation engine and video insertion widget, which has led to a significant improvement in Click Through Rates from videos embedded within the content. This increase in CTRs has naturally boosted pre-roll revenues for Complex.

In particular, Complex found the extracted keywords to be the most accurate among a number of solutions they trialled, which ultimately led them to choose the AYLIEN Text Analysis API.

Complex Summary colour (1)

As consumers of online content become more and more immune to the effects of online ads, marketers and publishers are having to find ways to connect with them on a more personal level. Through a combination of data collection, text analysis and machine learning techniques, highly-personalized and targeted ads can now be served instantly, based on the content itself and viewer demographics. This really is a win-win for all involved as the visitor sees useful material, the publisher sees higher CTRs and the advertiser receives more traffic coming in from these clicks.

Wanna learn more about semantics in advertising? Check out our blog post – Semantic Advertising and Text Analysis gives more targeted ad campaigns

