
2017 in review with the AYLIEN News API

It’s now the end of an eventful year that saw the UK begin negotiations to leave the EU, the fight of the century between a boxer and a mixed martial artist, and the discovery of alternative facts. The world’s news publishers reported all of this and the countless other events that shaped 2017, leaving a vast record of what the media was talking about right through the year.

Using Natural Language Processing, we can dive into this record to generate insights about topics that interest us. Our News API has been hard at work gathering, analyzing, and indexing over 25 million news stories in near-real time over 2017. The News API extracts and stores dozens of data points on every story, from classifying the subject matter to analyzing the sentiment, to listing the people, places, and things mentioned in every one.

This enriched content provides us with a vast dataset of structured data about what the world was talking about throughout the year, allowing us to take a quantitative look at the news trends of 2017.

Using the News API, we’re going to look at two questions on topics that dominated this year’s news coverage:

  1. What was the coverage of Donald Trump’s first year in office like?
  2. What trends affected sports coverage – consistently the most popular category – in 2017?


Trump’s first year in office

How much did the media publish?

Any review of 2017’s news has to begin with Donald Trump and his first year in office as President. We wanted to see how coverage of the US President varied over the course of the year, and which events the media covered the most. To do this, we used the Time Series endpoint to analyze the daily volume of stories that mentioned Trump in the title.

Take a look at what the News API found:


From this chart, you can see that the media are generally less interested in Trump now than they were during the first month or two of his presidency. Despite the coverage of the Charlottesville protests, the media fixation on Trump is slowly tapering off.
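The daily-volume query described above can be sketched roughly as follows. This is a minimal illustration, not the exact query we ran: the parameter names (`title`, `published_at.start`, `published_at.end`, `period`) and the shape of the returned points are assumptions, and the sample counts are placeholders rather than real results.

```python
from datetime import date


def build_time_series_params(title_keyword, start, end, period="+1DAY"):
    """Assemble query parameters for a story-count time series.

    The parameter names are assumptions, not confirmed by this article.
    """
    return {
        "title": title_keyword,
        "published_at.start": f"{start.isoformat()}T00:00:00Z",
        "published_at.end": f"{end.isoformat()}T00:00:00Z",
        "period": period,
    }


def busiest_days(points, top_n=3):
    """Return the top_n (day, count) pairs, highest story count first."""
    ranked = sorted(points, key=lambda p: p["count"], reverse=True)
    return [(p["published_at"][:10], p["count"]) for p in ranked[:top_n]]


params = build_time_series_params("Trump", date(2017, 1, 1), date(2018, 1, 1))

# Placeholder points in an assumed response shape; the counts are invented
# for illustration and are not results from the News API.
sample_points = [
    {"published_at": "2017-01-01T00:00:00Z", "count": 3},
    {"published_at": "2017-01-02T00:00:00Z", "count": 9},
    {"published_at": "2017-01-03T00:00:00Z", "count": 5},
]

print(params)
print(busiest_days(sample_points, top_n=2))
```

Ranking the days by count is one simple way to line up peaks in the series with the events that caused them.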


How did sentiment in the coverage of Trump vary over the year?

Knowing which events involving the President interested the media the most is useful, but we can also track the sentiment expressed in each one of these stories, and see how the overall sentiment polarity changed over time.

We can do this by again using the Time Series endpoint. Take a look at what the News API found:

You can see that the News API detected the most negative sentiment in stories about Trump around the time of his call with the widow of a fallen US soldier, in which he reportedly said of her husband, “he knew what he was signing up for”. The most positive sentiment was detected around the time of Trump’s speech in Riyadh, and as the NFL kneeling controversy began to expand.

You will also notice spikes in positive sentiment in stories about Trump around both his administration’s repeal of DACA and the point at which more and more NFL players joined the kneeling protests. Since both of these spikes follow shortly after the events themselves, we think the coverage is most likely about the reactions or backlash to these developments.
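One way to turn two daily series, one counting positive stories and one counting negative stories, into a single polarity curve is sketched below. This is an assumption about how such a chart could be built, not a description of our exact method: we assume each series comes from a separate Time Series query filtered by polarity, and the sample counts are placeholders.

```python
def sentiment_balance(positive, negative):
    """Per-day net polarity in [-1, 1], computed as (pos - neg) / (pos + neg).

    Each argument is a list of points shaped like
    {"published_at": "...", "count": int} (an assumed response shape).
    """
    pos_by_day = {p["published_at"][:10]: p["count"] for p in positive}
    neg_by_day = {p["published_at"][:10]: p["count"] for p in negative}

    balance = {}
    for day in sorted(set(pos_by_day) | set(neg_by_day)):
        pos = pos_by_day.get(day, 0)
        neg = neg_by_day.get(day, 0)
        total = pos + neg
        # A day with no polar stories at all is treated as neutral.
        balance[day] = (pos - neg) / total if total else 0.0
    return balance


# Placeholder counts for illustration only, not News API results.
positive_points = [
    {"published_at": "2017-01-01T00:00:00Z", "count": 30},
    {"published_at": "2017-01-02T00:00:00Z", "count": 10},
]
negative_points = [
    {"published_at": "2017-01-01T00:00:00Z", "count": 10},
    {"published_at": "2017-01-02T00:00:00Z", "count": 30},
    {"published_at": "2017-01-03T00:00:00Z", "count": 5},
]

print(sentiment_balance(positive_points, negative_points))
```

Normalizing by the day's total keeps a busy news day from looking more positive or negative simply because more stories were published.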


What other things were mentioned in stories about Trump?

So we know how both the volume of stories about Trump and their sentiment varied over time. But knowing exactly what other people, organizations, and things were mentioned in these stories across the year would let us see what all of these stories were about.

The News API extracts the entities mentioned in every story it analyzes. Using the Trends endpoint, we can search for the 100 entities that were most frequently mentioned in stories about Trump in 2017. These entities are visualized below.

Perhaps unsurprisingly, we can see that Trump coverage was dominated by his campaign’s and administration’s involvement with Russia. What is remarkable is the scale of that dominance: Russia was mentioned in more stories with ‘Trump’ in the title than the US itself.
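Ranking entities by mention count, and checking whether one entity outranks another, can be sketched as below. The input shape (a list of `value` and `count` pairs) is an assumption about what a Trends-style response contains, and the entity names and counts are placeholders, not real figures.

```python
def top_entities(trends, n=100):
    """Return the n most-mentioned (entity, count) pairs, highest first."""
    ranked = sorted(trends, key=lambda t: t["count"], reverse=True)
    return [(t["value"], t["count"]) for t in ranked[:n]]


def outranks(trends, entity_a, entity_b):
    """True if entity_a was mentioned in more stories than entity_b."""
    counts = {t["value"]: t["count"] for t in trends}
    return counts.get(entity_a, 0) > counts.get(entity_b, 0)


# Placeholder trends in an assumed response shape; names and counts are
# invented for illustration only.
sample_trends = [
    {"value": "Entity B", "count": 7},
    {"value": "Entity A", "count": 12},
    {"value": "Entity C", "count": 2},
]

print(top_entities(sample_trends, n=2))
print(outranks(sample_trends, "Entity A", "Entity B"))
```

A comparison like the one between Russia and the US above is then a single `outranks` call on the real trend data.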


What were the most-shared stories about Trump in 2017?

Seeing which stories were shared the most on social networking sites can be very interesting. It can also yield some important business insights as the more a story is shared, the more value it generates for advertisers and publishers.

We can do this with the News API by using the Stories endpoint. Since Facebook consistently garners the most shares of news stories of all the social networks, we returned the three stories about Trump with the most Facebook shares:

  1. “Trump Removes Anthony Scaramucci From Communications Director Role,” The New York Times – 1,061,494 shares.
  2. “Trump announces ban on transgender people in U.S. military,” The Washington Post – 696,341 shares.
  3. “Trump admin. to reverse ban on elephant trophies from Africa,” ABC News – 638,917 shares.
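The ranking above amounts to sorting stories by their Facebook share count and keeping the top three. Here is a minimal sketch using the three stories listed; the simplified record layout (`title`, `source`, `facebook_shares`) is hypothetical, and pulling those fields out of a real Stories response is left out.

```python
def format_most_shared(stories, top_n=3):
    """Sort stories by Facebook shares (descending) and format a ranked list."""
    ranked = sorted(stories, key=lambda s: s["facebook_shares"], reverse=True)
    return [
        f"{rank}. {s['title']} ({s['source']}): {s['facebook_shares']:,} shares"
        for rank, s in enumerate(ranked[:top_n], start=1)
    ]


# Titles and share counts are the ones quoted in this post; the record
# layout itself is a hypothetical simplification.
stories = [
    {
        "title": "Trump admin. to reverse ban on elephant trophies from Africa",
        "source": "ABC News",
        "facebook_shares": 638917,
    },
    {
        "title": "Trump Removes Anthony Scaramucci From Communications Director Role",
        "source": "The New York Times",
        "facebook_shares": 1061494,
    },
    {
        "title": "Trump announces ban on transgender people in U.S. military",
        "source": "The Washington Post",
        "facebook_shares": 696341,
    },
]

for line in format_most_shared(stories):
    print(line)
```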


2017 in Sports Coverage

Sports is the subject that the media writes the most about, by quite a bit. This is reflected in the fact that the News API gathered over five million stories about sports in 2017, more than any other single subject category.

To make sense of this content at this scale, we need to first understand the subject matter of each story. To enable us to do this, the News API classifies each story according to two taxonomies.

To analyze the most popular sports, we used the Time Series endpoint to see how the daily volume of stories about the four most popular sports varied over time. We searched stories that the News API classified as belonging to the categories Soccer, American Football, Baseball, and Basketball in the advertising industry’s IAB-QAG taxonomy. To narrow our search down a bit, we decided to look at autumn, the busiest time of year for sports.

Take a look at what the News API returned:

We can see that the biggest spike in stories was caused by Mike Pence’s out-of-the-ordinary appearance at an NFL game as the kneeling protests expanded, a game he left after players knelt during the national anthem.

Other than this, the biggest spike in stories was clearly caused by the closing of the English transfer window on the last day of August, showing the dominant presence of soccer in the world’s media outlets.
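Comparing several per-sport series and picking out the largest single-day spike can be sketched as follows. The parameter names (`categories.taxonomy`, `categories.id[]`), the taxonomy value, the category ID placeholder, and the date window are all assumptions for illustration, and the sample counts are invented rather than taken from our results.

```python
def build_category_params(category_id, start, end, period="+1DAY"):
    """Query parameters for a daily time series limited to one category.

    Parameter names and the taxonomy value are assumptions; category_id
    must be looked up in the taxonomy itself.
    """
    return {
        "categories.taxonomy": "iab-qag",
        "categories.id[]": category_id,
        "published_at.start": start,
        "published_at.end": end,
        "period": period,
    }


def biggest_spike(series_by_sport):
    """Return (sport, day, count) for the single highest daily count."""
    best = None
    for sport, points in series_by_sport.items():
        for p in points:
            if best is None or p["count"] > best[2]:
                best = (sport, p["published_at"][:10], p["count"])
    return best


# "SOCCER_CATEGORY_ID" is a hypothetical placeholder, not a real taxonomy ID,
# and the date window is illustrative.
soccer_params = build_category_params(
    "SOCCER_CATEGORY_ID", "2017-09-01T00:00:00Z", "2017-12-01T00:00:00Z"
)

# Invented counts in an assumed response shape, for illustration only.
sample_series = {
    "Soccer": [
        {"published_at": "2017-09-01T00:00:00Z", "count": 4},
        {"published_at": "2017-09-02T00:00:00Z", "count": 6},
    ],
    "American Football": [
        {"published_at": "2017-09-01T00:00:00Z", "count": 2},
        {"published_at": "2017-09-02T00:00:00Z", "count": 9},
    ],
}

print(biggest_spike(sample_series))
```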


Who and what were the media talking about?

Being able to see the spikes in the volume of sports stories around certain events is a useful resource to have, but we can use the News API to see exactly what people, places, and organizations were talked about in every one of the over 25 million stories it gathered in 2017.

To do this, we again used the Trends endpoint to find the most-mentioned entities in sports stories from 2017. Take a look at what the News API found:

You can immediately see the dominance of popular soccer clubs in the media coverage, but locations that host popular NFL and NBA teams are also featured prominently. However, soccer has a clear lead over its American competitors in terms of media attention, probably due to the global reach of soccer.


What were the most-shared sports stories on Facebook in 2017?

The Time Series endpoint showed us that the NFL kneeling protests were the most-covered sports event of 2017. Using the News API, we can also see how many times each one of the over 25 million stories was shared across social media.

Looking at the top three most-shared sports stories on Facebook, we can see that the kneeling protests were the subject of two of them. This shows us that the huge spike in story volume about these protests was a response to genuine public demand – people were sharing these stories with their friends and followers online.

  1. “Wife of ‘American Sniper’ Chris Kyle Just Issued Major Challenge to NFL – Every Player Should Read This,” Independent Journal-Review – 830,383 shares.
  2. “Vice President Mike Pence leaves Colts-49ers game after players kneel during anthem,” Fox News – 829,466 shares.
  3. “UFC: Dana White admits Mark Hunt’s UFC career could be over,” New Zealand Herald – 772,926 shares.


Use the News API for yourself

Well, that concludes our brief look back at a couple of the biggest media trends of 2017. If there are any subjects that interest you, try out our free two-week trial of the News API and see what insights you can extract. With the easy-to-use SDKs and extensive documentation, you can make your first query in minutes.





Will Gannon

Marketing @ AYLIEN

A Classics graduate from UCD, Will handles Inbound Marketing here at AYLIEN. Before joining us, Will completed a Master’s in Digital Humanities at Trinity College, where he used NLP methods to index where Latin terms appear in English Literature.