Product

Introduction

At AYLIEN we pride ourselves on creating robust, scalable, developer-friendly tools. We do our best to make it as easy as possible for our users to get up and running with our APIs. In this post we’re going to talk you through the process of making calls to our News API.

Our News API allows you to search and source news content from around the web in realtime. We use Machine Learning and Natural Language Processing to monitor, source and index news content at scale, providing an enriched and flexible news data source.

While making calls to the API is relatively simple, the possibilities for what you can do with it are endless, and the number of endpoints and parameters available can feel a little overwhelming to new users. For that reason we have decided to put together a series of blog posts aimed at guiding our users through some basic capabilities of the API.

This post will serve as a gentle introduction to the basic search capabilities of the API. Part 2 will then focus on how to analyze and draw meaningful insights from your stories.

First things first – familiarize yourself with the API

We recommend that you spend some time creating example queries using our demo/query builder. Once you’ve familiarized yourself with the concept of the API, we’d strongly recommend spending some time in the interactive documentation section of our website to learn more about the various endpoints and parameters available.

When you’ve got your head around how the API works, you can begin making some basic calls and collecting stories.

Making calls:

We’ve created SDKs for some of the most popular programming languages which make using the API super easy. Throughout this blog we’ve also included some code snippets for you to copy and build on.


1. Basic Search

Firstly, let’s look at how we can start monitoring news content and collecting stories. For the purpose of this guide we’ll assume the role of a news agency that wants to gather news stories and insights around the presidential elections in the US.

Our API supports Boolean Search which allows you to build simple search queries using standard boolean operators. This means that you can build general or more targeted queries based on your interests and requirements.

The easiest way to start getting useful data from the API is to start collecting stories based around a search that interests you. As an example, let’s create a search that retrieves stories relevant to the US elections based on simple criteria.

Let’s build a query to retrieve stories that mention Donald Trump and Hillary Clinton that were published in the last 10 days and were written in Spanish.

The `text` parameter allows us to search by keyword. The `language` parameter lets you choose the language the stories are written in, and the `publishedAt` parameter allows you to set a timeframe for when the articles you wish to retrieve were published.

Note: Make sure you replace the APP_ID and APP_KEY placeholders with your own API credentials. If you haven’t got an API account you can sign up here.

Combined: “Donald Trump” AND “Hillary Clinton” in Spanish from the past 10 days
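As a rough sketch of how this combined query might be assembled in Python, the snippet below only builds the request URL and headers; it does not send anything. The endpoint URL, the header names, the `language[]` and `published_at.start`/`published_at.end` spellings, and the `NOW-10DAYS` date value are assumptions about the REST interface, so check them against the interactive documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint and header names; confirm them in the interactive documentation.
STORIES_URL = "https://api.newsapi.aylien.com/api/v1/stories"
APP_ID = "APP_ID"    # placeholder: replace with your own credentials
APP_KEY = "APP_KEY"  # placeholder: replace with your own credentials


def build_basic_search():
    """Return (url, headers) for the Trump AND Clinton query in Spanish."""
    params = [
        # Boolean search: both names must appear in the story.
        ("text", '"Donald Trump" AND "Hillary Clinton"'),
        # Only stories written in Spanish.
        ("language[]", "es"),
        # Assumed date-math syntax for "the last 10 days".
        ("published_at.start", "NOW-10DAYS"),
        ("published_at.end", "NOW"),
    ]
    headers = {
        "X-AYLIEN-NewsAPI-Application-ID": APP_ID,
        "X-AYLIEN-NewsAPI-Application-Key": APP_KEY,
    }
    return STORIES_URL + "?" + urlencode(params), headers


url, headers = build_basic_search()
print(url)
```

Passing `url` and `headers` to an HTTP client, or expressing the same parameters through one of the SDKs, would then perform the actual call.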

 


2. Targeted search

So we’ve looked at how you can use keywords and some other basic parameters like the ‘published at’ time to build some useful queries. Now let’s look at how you can use some other parameters and endpoints to build some more targeted searches.

As with any election, opinions often tend to vary by geographic location. This is evident each time a US presidential election takes place, as individual states are deemed ‘Red’ or ‘Blue’ depending on their Republican or Democratic leaning. As we know, however, it is often the ‘swing’ states that, well, swing the outcome one way or the other! These states and the news around them can be of particular interest to analysts, writers and the candidates themselves.

With this in mind, we’ll now show you how to narrow your search to focus on particular entities (places, people, companies, etc), from within specific categories and news sources. The parameters we will use here are:

  • `entities.body.links.dbpedia`
  • `categories.taxonomy`
  • `categories.id`
  • `source.name`

For this example, we will search for mentions of two entities, Ohio and Florida, which are traditionally swing states. We also want to ensure we are only collecting relevant stories, so we will search for articles that are classified in the Law, Government and Politics category. Finally, we will restrict our search to three popular news sources: CNN, CBS News and Reuters.
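The targeted search described above could be assembled along the following lines. This is a sketch only: the endpoint, the array-style `[]` suffixes, the taxonomy name `iab-qag` and the category code `IAB11` for Law, Government and Politics are assumptions, as is the way the API combines repeated values for the same parameter, so verify each of them in the documentation.

```python
from urllib.parse import urlencode, parse_qsl

# Assumed endpoint; confirm it in the interactive documentation.
STORIES_URL = "https://api.newsapi.aylien.com/api/v1/stories"


def build_swing_state_search():
    """Return the request URL for the Ohio/Florida politics query."""
    params = [
        # Entities mentioned in the story body, identified by DBpedia link.
        ("entities.body.links.dbpedia[]", "http://dbpedia.org/resource/Ohio"),
        ("entities.body.links.dbpedia[]", "http://dbpedia.org/resource/Florida"),
        # Assumed taxonomy name and category code for Law, Government and Politics.
        ("categories.taxonomy", "iab-qag"),
        ("categories.id[]", "IAB11"),
        # Restrict results to the three named sources.
        ("source.name[]", "CNN"),
        ("source.name[]", "CBS News"),
        ("source.name[]", "Reuters"),
    ]
    return STORIES_URL + "?" + urlencode(params)


url = build_swing_state_search()
# Print the decoded parameters so the query is easy to read.
for name, value in parse_qsl(url.split("?", 1)[1]):
    print(name, "=", value)
```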

Our next search will focus on finding stories with a particular sentiment polarity from a particular geographic region. While our previous entities search focused on stories about specific locations (Ohio & Florida), this search will focus on stories from outlets in a specific location.

We’re going to search for negative stories, mentioning Trump, from sources based in Mexico.

As before, we will use the `text` parameter to indicate ‘Trump’ as our keyword. You could also use `title` if you wished to search for mentions of Trump in the article title only. The `sentiment.title.polarity` parameter is set to ‘negative’ and `source.locations.country` to ‘MX’.
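A minimal sketch of this sentiment and location query follows. The helper name `build_sentiment_search`, the endpoint and the `[]` suffix on the country parameter are assumptions made for illustration; the snippet only constructs the URL and sends nothing.

```python
from urllib.parse import urlencode

STORIES_URL = "https://api.newsapi.aylien.com/api/v1/stories"  # assumed endpoint


def build_sentiment_search(keyword, polarity, country, title_only=False):
    """Return the request URL for a keyword + sentiment + source-country query."""
    params = [
        # Search article titles only, or the full text of the story.
        ("title" if title_only else "text", keyword),
        ("sentiment.title.polarity", polarity),
        # Assumed array-style spelling; the value is a two-letter country code.
        ("source.locations.country[]", country),
    ]
    return STORIES_URL + "?" + urlencode(params)


print(build_sentiment_search("Trump", "negative", "MX"))
```

Calling the helper with `title_only=True` swaps `text` for `title`, which mirrors the title-only variant mentioned above.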


What’s next?

There you have it. We hope that this gentle introduction to using our News API has helped you get started with the basics, making calls and creating specific search queries. Of course, the API is more than just a news aggregator and we have merely scratched the surface of what this awesome tool is capable of today. You will see the true power and benefits of the API as you begin to explore some of the more advanced capabilities it offers from an analysis and insights point of view.

Check back next week for Part 2 where we will be showing you how easy it is to use the API to look for insights, trends and correlations in news content.

 





Data Science

Introduction

The dust has truly settled on what was one of the biggest sporting occasions of the year, the 2016 European Championships. The worldwide interest in the Euro 2016 soccer tournament was particularly evident across social media platforms, with Twitter, Facebook and even Instagram seeing record numbers of tournament-related interactions over the 4-week period.

As you may have seen before, here at AYLIEN we like to monitor and gather social media and news content around particular events in search of interesting insights using our Text Mining capabilities through our APIs.

Previous posts: Super Bowl 50 according to Twitter and Text Analytics meets 2014 World Cup.

So what did we do this time?

We collected a total of 27 million tweets over the course of the tournament with the purpose of mining these tweets to look for interesting correlations and insights. Using the Twitter Search API, we built searches around official hashtags and handles for both the tournament itself and the teams involved. Following some simple preprocessing of the data, such as removing retweets and tweets containing links to narrow our focus and eliminate some noise, we moved our data to a big MySQL database which made it a lot easier to work with.
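The preprocessing step described above could look something like the following sketch. It assumes each tweet is a dictionary shaped like the Twitter API's tweet object (`text`, `retweeted_status`, `entities.urls`); the helper names and the sample tweets are made up for illustration and are not taken from our pipeline.

```python
import re

URL_PATTERN = re.compile(r"https?://\S+")


def is_retweet(tweet):
    """Treat a tweet as a retweet if the API marks it as one or it starts with 'RT @'."""
    return "retweeted_status" in tweet or tweet["text"].startswith("RT @")


def has_link(tweet):
    """Detect links either from the tweet's URL entities or from the raw text."""
    urls = tweet.get("entities", {}).get("urls", [])
    return bool(urls) or URL_PATTERN.search(tweet["text"]) is not None


def preprocess(tweets):
    """Keep only original tweets that contain no links."""
    return [t for t in tweets if not is_retweet(t) and not has_link(t)]


# Hypothetical sample data: one original tweet, one retweet, one tweet with a link.
sample = [
    {"text": "What a goal by Payet! #EURO2016"},
    {"text": "RT @someone: What a goal by Payet! #EURO2016"},
    {"text": "Match report: https://example.com/report #EURO2016"},
]
print(preprocess(sample))
```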

The first piece of analysis we did was to run all the tweets through our Language Detection endpoint to split them up by language. You could also use the language predictions provided by Twitter to save some time. The second piece of analysis we did was to analyze the sentiment of all of the English tweets, which amounted to about 17 million in total. The final task involved extracting mentions of Entities in these tweets, paying particular attention to mentions of the countries playing at the tournament.
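If you take the shortcut of using Twitter's own language predictions, the split by language can be sketched as below. It assumes each tweet carries a `lang` field, falling back to `und` (undetermined) when it is missing; the function name and sample data are illustrative only.

```python
from collections import Counter


def language_shares(tweets):
    """Return {language code: percentage of tweets}, rounded to one decimal place."""
    counts = Counter(t.get("lang", "und") for t in tweets)
    total = sum(counts.values())
    return {lang: round(100 * n / total, 1) for lang, n in counts.most_common()}


# Hypothetical sample: 5 English, 2 French, 1 Spanish and 1 tweet with no language.
sample = [{"lang": "en"}] * 5 + [{"lang": "fr"}] * 2 + [{"lang": "es"}] + [{}]
print(language_shares(sample))
```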

We decided to dive deeper into 4 areas of interest:

  • Volume of tweets and language;
  • Teams of particular interest (Portugal, France, Iceland and England);
  • The Final game (Portugal v France);
  • And of course, Cristiano Ronaldo (yes, he gets his own section!)

 

Tools used:

Twitter Search API;

AYLIEN Text Analysis API;

AYLIEN News API;

Tableau;

 

Volume of tweets

As was to be expected, the majority of social chatter around the tournament was focused in Europe. Other areas of note included The US and Australia but perhaps most surprising was the high concentration of tweets from soccer fans in Indonesia. This was also reflected in the tweets-by-language analysis we ran, which we’ll discuss later.

Not surprisingly, the most mentioned team was the host nation and tournament runners-up, France. In second place were the champions, Portugal, and in third place were England, who were up there for all the wrong reasons, which we’ll dive into a little later in the post.

While the vast majority of tweets, regardless of their geographic origin, were in English, let’s take a look at the language breakdown.

 

Tweets by language

Tweets in English accounted for over 62% of all tweets collected. 15% of tweets were written in French and about 11% in Spanish. Other languages to feature included Portuguese, German and Italian, but the biggest surprise was the volume of tweets written in Indonesian, which was also highlighted in our geographic analysis.

Looking at the volume of tweets by language highlights some interesting insights around public interest and following throughout the tournament, revealing a clear connection between fan following and interest in the tournament as a whole.

For tweets written in French and Portuguese you can get a clear understanding of how far a team progressed in the tournament by looking at the volume of tweets written in their native language throughout the tournament. The spikes in the visualizations represent each game and the trend line shows the evident rise or fall in following.

The diminishing voice of a team’s fans is most evident in the lead-up to that team’s departure, where the volume of tweets in its native language falls away.

 

Team Focus

As we mentioned, we decided to pick 4 teams to focus our analysis on – Portugal, France, Iceland and England. We chose teams that were either linked to major events or talking points in the tournament or had performed particularly well.

Each graph in the story below shows the volume of tweets mentioning that team, the tweet polarity (whether it’s positive or negative) and also the rolling average polarity throughout the tournament.
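To illustrate what a rolling average polarity means, here is a small sketch. The numeric encoding of the labels (+1, 0, -1) and the choice of a window measured in number of tweets are assumptions made for the example; a window measured in time would work the same way in principle.

```python
from collections import deque

# Assumed numeric encoding of the polarity labels.
SCORES = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}


def rolling_polarity(labels, window):
    """Yield the mean polarity score over the last `window` tweets."""
    recent = deque(maxlen=window)  # old scores drop out automatically
    for label in labels:
        recent.append(SCORES[label])
        yield sum(recent) / len(recent)


# Hypothetical sequence of tweet polarities around a bad result.
labels = ["positive", "positive", "negative", "negative", "negative"]
print(list(rolling_polarity(labels, window=3)))
```

The averaged series starts at 1.0 and ends at -1.0, which is the kind of swing the trend lines in the graphs make visible.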

We’ve chosen interesting talking points for each team and highlighted when they occurred and their effect on fan reaction in each graph.

Tip: Click the linked talking points for news stories gathered with our News API.

England:

England were a terrible disappointment at the Euro 2016 tournament. The star-studded team of the Premier League’s top players failed to impress and were knocked out of the tournament by a much weaker team (on paper) from Iceland. Their tournament was also heavily overshadowed by the behavior of their fans, and the departure of their manager Hodgson only highlighted the scale of the issues the English FA had to deal with.

Talking points:

 

Iceland:

A country with a population of 323,000 and about 100 professional footballers provided us with the feel-good story of the tournament. Iceland, who were really only expected to show up, did a whole lot more by coming second in their group, drawing with the eventual tournament winners and toppling one of the tournament favourites in the round of 16.

Talking points:

 

France:

The tournament favorites, France, easily progressed to the knockout stages, where they dealt with a far less experienced Icelandic side and impressively put 2 past the current world champions, Germany. The team, which had the tournament’s top scorer, was truly on form and looked like they had the tournament in the bag.

 

Portugal:

Although they eventually reigned supreme, Portugal had anything but an impressive tournament. Having failed to win 6 of their 7 games in the regulation 90 minutes, they relied on snatching wins during extra time and on holding their nerve in the lottery of penalty shootouts. They even had a couple of close calls with two far weaker teams in Iceland and Austria. The form of their main man, Cristiano Ronaldo, was at the heart of both their successes and failings throughout the tournament, as the Portuguese talisman, carrying an injury throughout, could only show us glimpses of his best.

Talking points:

 

The Final

The final of Euro 2016 attracted as many as 300 million viewers across the world. What was expected to be a high-tempo showdown between a goal-hungry, in-form French team and a well-drilled Portuguese team who hadn’t lost a game in the tournament turned out to be a whole lot less.

Portugal’s Ronaldo and France’s Griezmann were facing off for the title of Euro 2016 top goalscorer but it was for other reasons that Ronaldo took the limelight and some would argue the sting out of the game as a whole.

Talking points:

  • Ronaldo is fouled by Payet in the 12th minute
  • Ronaldo is forced to leave the field injured after 26 minutes
  • France miss a number of close chances
  • The game enters extra time and is looking like it will go to penalties
  • Eder scores and France look to be defeated

Ronaldo

Cristiano dominated social chatter and news throughout the tournament. Usually it’s his goal tally alone which puts him in the spotlight but during Euro 2016 he was the talk of the tournament for a variety of other reasons, and we’re not even referring to the moth incident!

 

Talking points:


  • Ronaldo shows his true colors and passion for the team on the sidelines
  • Like a true goal scorer, Ronaldo never misses an opportunity…to take his top off


Conclusion

Simple case studies like this highlight the wealth of information hidden in social chatter. Brands and organizations who care about the voice of their customer have no choice in today’s world but to try and leverage social media conversations in order to stay on top of what it is their customers like or dislike about them and their competitors. If you’d like to hear more about using AYLIEN for social listening, drop us a line at hello@aylien.com. We’d love to hear from you.

 





General

Interest in Natural Language Processing (NLP) has risen rapidly in recent years; nowadays, NLP forms a key component of the roadmap of almost every major tech company. All share the goals of making advanced NLP capabilities accessible to developers and bringing them into the hands of consumers. Simultaneously, most existing areas in NLP are under active research, with new ideas being developed and tested at a breakneck pace. However, discussion of the potential and applications of such ideas is usually restricted to small interest groups (e.g. collaborating research teams) and infrequent venues (e.g. NLP conferences).

We at AYLIEN are acutely aware of this dynamic: we seek to equip developers and businesses with the NLP tools they need to improve their businesses. At the same time, we are conducting cutting-edge research and continuously looking for ways to leverage this research to improve our services and help our clients.

In order to provide a regular, common forum for students, researchers, and industry professionals to discuss state-of-the-art NLP research and cutting-edge industry applications, we are thrilled to announce the NLP Dublin meetup. We hope that this group will facilitate the exchange of ideas within the Irish NLP community and bring people into contact with ideas and applications they might otherwise not have heard about. Every event will feature presentations about interesting areas of NLP, Q&A sessions, and ample time for discussions and networking.

The first meetup will take place on August 3. If you are interested in speaking at or sponsoring future meetups, please contact sebastian@aylien.com.

 


Product, Research

It is a strong indicator of today’s globalized world and rapidly growing access to Internet platforms that we have users from over 188 countries and 500 cities using our Text Analysis and News APIs. Our users need to be able to understand and analyze what’s being said out there about them, their products, their services or their competitors, regardless of the locality and the language used.

Social media content on platforms like Twitter, Facebook and Instagram can provide unrivalled insights into customer opinion and experience to brands and organizations. However, as shown by the following stats, users post content in a multitude of languages on these platforms:

  • Only about 39% of tweets posted are in English;
  • Facebook recently reported that about 50% of its users speak a language other than English;
  • Regional platforms such as Sina Weibo and WeChat, where most of the content is written in the local language, are on the rise;
  • 70% of active Instagram users are based outside the US.

A look at online review platforms such as Yelp and TripAdvisor, as well as various news outlets and blogs, reveals similar patterns regarding the variety of language used.

Therefore, no matter if you are a social media analyst, or a hotel owner trying to gauge customer satisfaction, or a hedge fund analyst trying to analyze a foreign market, you need to be able to understand textual content in a multitude of languages.

The Challenge with Multilingual Text Analysis

Scaling Natural Language Processing (NLP) and Natural Language Understanding (NLU) applications – which form the basis of our Text Analysis and News APIs – to multiple human languages has traditionally proven to be difficult, mainly due to the language-dependent nature of preprocessing and feature engineering techniques employed in traditional approaches.

However, Deep Learning-based NLP methods, which have gained tremendous attention and popularity over the last couple of years, have proven to bring a great amount of invariance to NLP processes and pipelines, including invariance to the language used in a document or utterance.

 

 

At AYLIEN we have been following the rise and the evolution of Deep Learning-based NLP closely, and our research team have been leveraging Deep Learning to tackle a multitude of interesting and novel problems in Representation Learning, Sentiment Analysis, Named Entity Recognition, Entity Linking and Generative Document Models, with multiple publications to date.

Additionally, using technologies such as TensorFlow, Docker and Kubernetes, as well as software engineering best practices, our engineering team surfaces this research in our products by making our proprietary models performant and scalable, enabling us to serve millions of requests every day.

Multilingual Sentiment Analysis with AYLIEN

Today we’re excited to announce an early result of these efforts with the launch of the first version of our Deep Learning-based Sentiment Analysis models for short sentences which are now available for English, Spanish and German.

Let’s explore a couple of examples and see these new capabilities in action:

Examples:

A Spanish tweet:

“Vamos!! Se ganó, valio la pena levantarse temprano, bueno el futbol todo lo vale :D”

Results:

A German tweet:

“Lange wird es mein armes Handy nicht mehr machen 🙁 Nach 5 Jahren muss ich mein Samsung Galaxy S 2 wohl bald aufgeben”

Results:

Try it out for yourself on our demo, or grab a free API key and an SDK to leverage these new models in your application.
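As a hedged sketch of what a call for the Spanish example might look like without an SDK, the snippet below builds (but does not send) a request using only the Python standard library. The endpoint URL, the header names and the `mode` and `language` parameter names are assumptions, so check the API documentation for the exact spellings.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint and header names; confirm them in the API documentation.
SENTIMENT_URL = "https://api.aylien.com/api/v1/sentiment"


def build_sentiment_request(text, language, app_id="APP_ID", app_key="APP_KEY"):
    """Build (but do not send) a POST request for short-text sentiment."""
    body = urlencode({
        "text": text,
        "mode": "tweet",       # assumed mode name for short texts
        "language": language,  # e.g. "en", "es" or "de"
    }).encode("utf-8")
    headers = {
        "X-AYLIEN-TextAPI-Application-ID": app_id,    # placeholder credentials
        "X-AYLIEN-TextAPI-Application-Key": app_key,
    }
    return Request(SENTIMENT_URL, data=body, headers=headers, method="POST")


request = build_sentiment_request(
    "Vamos!! Se ganó, valio la pena levantarse temprano, bueno el futbol todo lo vale :D",
    language="es",
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it once real credentials are in place.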

How it Works

Our new models leverage the power of word embeddings, transfer learning and Convolutional Neural Networks to provide a simple, yet powerful end-to-end Sentiment Analysis pipeline which is largely language agnostic.

Additionally, in contrast to more traditional machine learning models, this new model allows us to learn representations from large amounts of unlabeled data. This is particularly valuable for languages such as German where manually annotated data is scarce or expensive to generate, as it enables us to train sentiment models that leverage small amounts of annotated data in a language to great effect.
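To make the embed, convolve and pool idea concrete, here is a toy forward pass in plain Python. It is not our production model: the embeddings, filter and output weights are hand-picked illustrative values rather than trained ones, there is a single filter, and the transfer learning step is not shown.

```python
import math

# Toy 2-dimensional word embeddings; real models learn these from unlabeled text.
EMBEDDINGS = {"great": [1.0, 0.2], "game": [0.1, 0.1], "awful": [-1.0, 0.3]}
UNKNOWN = [0.0, 0.0]

# One convolutional filter spanning 2 consecutive words (2 words x 2 dimensions).
FILTER = [[0.5, 0.0], [0.5, 0.0]]
FILTER_BIAS = 0.0
# Output layer mapping the single pooled feature to (negative, positive) logits.
OUT_WEIGHTS = [-2.0, 2.0]
OUT_BIAS = [0.5, -0.5]


def conv_feature_map(vectors):
    """Slide the filter over each pair of adjacent word vectors, then apply ReLU."""
    width = len(FILTER)
    feature_map = []
    for start in range(len(vectors) - width + 1):
        window = vectors[start:start + width]
        total = FILTER_BIAS + sum(
            w * x for f_row, v_row in zip(FILTER, window) for w, x in zip(f_row, v_row)
        )
        feature_map.append(max(0.0, total))
    return feature_map


def predict(tokens):
    """Embed -> convolve -> max-over-time pool -> softmax over two classes."""
    vectors = [EMBEDDINGS.get(t, UNKNOWN) for t in tokens]
    # Max-over-time pooling keeps the strongest filter response in the sentence.
    pooled = max(conv_feature_map(vectors), default=0.0)
    logits = [w * pooled + b for w, b in zip(OUT_WEIGHTS, OUT_BIAS)]
    exps = [math.exp(z) for z in logits]
    return {"negative": exps[0] / sum(exps), "positive": exps[1] / sum(exps)}


print(predict(["great", "game"]))
print(predict(["awful", "game"]))
```

Because nothing in this pipeline depends on language-specific preprocessing beyond the embedding lookup, swapping in embeddings for another language leaves the rest unchanged, which is the sense in which such a design is largely language agnostic.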

 

Source: Training Deep Convolutional Neural Network for Twitter Sentiment Classification by Severyn et al.

 

Next steps

Over the next couple of months, we will be continuing to work on improving these models as well as rolling out support for even more languages. Your feedback can be extremely helpful in shaping our roadmap, so if you have any thoughts, ideas or questions please feel free to reach out to us at hello@aylien.com.

We are also excited about the new research that we’ve been doing on cross-lingual embeddings, which should make the process of multilingual Sentiment Analysis even easier.

 



