Getting started with the News API Part 1: Search

Welcome to the AYLIEN News API!

 

Our News API uses state-of-the-art Machine Learning and Natural Language Processing to allow you to search and source news content from around the web in real-time, providing you with an enriched and flexible news data source.

 

By the end of this short blog post you’ll be up and running with the Python SDK and familiar enough with the documentation to be able to pick and choose the parameters and features that will get you the results you need (we also have SDKs in six languages besides Python, which you can check out here).

 

Not an SDK fan?

If you want to start accessing the News API via HTTP requests instead of the SDKs, you should check out the query builder in the demo, which is an easy way to generate a URL you can use to access the API with a GET or POST request.
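As a rough illustration of what such a URL looks like, here is a minimal sketch that assembles one by hand. The base URL and the dotted and bracketed parameter spellings (`published_at.start`, `language[]`) are assumptions made for this example, so treat the query builder as the authority on the exact names. Your Application ID and API key would be sent with the request as the `X-AYLIEN-NewsAPI-Application-ID` and `X-AYLIEN-NewsAPI-Application-Key` headers.

```python
from urllib.parse import urlencode

# Assumed base URL for the Stories endpoint; confirm it in the query builder.
BASE_URL = 'https://api.newsapi.aylien.com/api/v1/stories'


def build_stories_url(params):
    """Return a GET URL for the given query parameters.

    List values are expanded into repeated keys (doseq=True).
    """
    return BASE_URL + '?' + urlencode(params, doseq=True)


# Parameter spellings below are assumptions for illustration.
params = {
    'title': 'Trump',
    'language[]': ['en'],
    'published_at.start': 'NOW-7DAYS',
    'published_at.end': 'NOW',
}

url = build_stories_url(params)
print(url)
```

The sketch only builds the URL and makes no network request, so you can inspect the result before sending it with the HTTP client of your choice.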

 

Want to skip ahead to the more advanced stuff?

The interactive documentation section of the News API docs lists every parameter for every endpoint so you can build a granular query.

 

Ready to get started?

You can download the Python SDK from our GitHub repository, or by using pip with “pip install aylien_news_api”.

With the SDK downloaded, let’s start making some calls!

1. Basic Search: keywords, dates, sorting

The Stories endpoint is what you’ll use when you’re accessing stories from the News API.

 

Here’s a template script that will give you ten stories, sorted by popularity on Facebook, that mention the keyword “Trump” in the title, were published in the last seven days, and are in English.

 

All you need to do to run this script is replace the placeholders with your Application ID and API key (if you haven’t got a News API account you can sign up here).


import aylien_news_api
from aylien_news_api.rest import ApiException

## Configure API key authorization: app_id
aylien_news_api.configuration.api_key['X-AYLIEN-NewsAPI-Application-ID'] = 'YOUR_APP_ID'
## Configure API key authorization: app_key
aylien_news_api.configuration.api_key['X-AYLIEN-NewsAPI-Application-Key'] = 'YOUR_APP_KEY'
## create an instance of the API class
api_instance = aylien_news_api.DefaultApi()

## List our parameters as search operators
opts = {
    'title': 'Trump',
    'sort_by': 'social_shares_count.facebook',
    'language': ['en'],
    'published_at_start': 'NOW-7DAYS',
    'published_at_end': 'NOW'
}

try:
    ## Make a call to the News API for stories that meet the criteria of the search operators
    api_response = api_instance.list_stories(**opts)
    print(api_response)

except ApiException as e:
    print('Exception when calling DefaultApi->list_stories: %s\n' % e)

You can change around the search operators to see what other stories you can source. Our API supports Boolean Search, which allows you to easily build more general or more targeted queries based on your interests and requirements.

 

For example, add this parameter to narrow the previous search to include only stories that also mention either of the keywords “Putin” or “Jinping” in the body of the story:


'body': 'Putin OR Jinping'

Searching for keywords in the title ensures that you are returned stories that are by and large written about that keyword. To broaden your search to stories that just mention your keyword, supply that term to the “body” parameter instead of “title”.
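If you build queries like this often, a small helper can assemble the Boolean string for you. The sketch below is only an illustration: `any_of` is a hypothetical helper name, not part of the SDK, and it simply joins terms with the OR operator shown above.

```python
def any_of(*terms):
    """Join keywords with OR so that a story matching any one of them is returned."""
    return ' OR '.join(terms)


# "Trump" must appear in the title; either keyword may appear in the body.
opts = {
    'title': 'Trump',
    'body': any_of('Putin', 'Jinping'),
}

print(opts['body'])  # Putin OR Jinping
```

The resulting dictionary can be passed to list_stories in the same way as in the template script.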

You can also specify the number of stories you want returned with the per_page parameter. You can return up to 100 stories per page; if you want to access more than that, you can page through the results with the cursor parameter.


'per_page': 100
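To go past 100 stories, you request one page after another and pass along the cursor each time. Here is a minimal sketch of that loop. It assumes that the first request uses a cursor of '*' and that each response exposes a next_page_cursor attribute alongside stories; both are assumptions to check against the documentation. `fetch_all` is a hypothetical helper, not an SDK function.

```python
def fetch_all(fetch_page, opts, max_stories=300):
    """Collect up to max_stories stories by following the cursor.

    fetch_page is any callable that accepts the search operators as keyword
    arguments and returns an object with `stories` and `next_page_cursor`
    attributes (assumed response shape).
    """
    collected = []
    # Assumed: '*' asks for the first page of results.
    params = dict(opts, cursor='*', per_page=100)

    while len(collected) < max_stories:
        response = fetch_page(**params)
        if not response.stories:
            break  # an empty page means there is nothing left to fetch
        collected.extend(response.stories)

        next_cursor = response.next_page_cursor
        if not next_cursor or next_cursor == params['cursor']:
            break  # the cursor did not advance, so stop
        params['cursor'] = next_cursor

    return collected[:max_stories]


# Usage with the SDK objects from the template script:
# stories = fetch_all(api_instance.list_stories, opts)
```

Passing the fetch function in as an argument keeps the loop easy to try out with a stand-in function before you spend real API calls on it.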

2. Targeted search: categories, entities, sentiment

So we’ve looked at how you can use keywords and some other basic parameters like the ‘published at’ parameters to build a query to access stories. Now we’re going to add in some parameters that use some really simple but powerful features that the News API offers.

 

The categories parameter allows you to search for stories about any of more than 900 subject categories. These come from the two taxonomies that the News API uses to label each story (you can search them here).

 

We can specify the category that interests us with two lines to add to our parameters, the first one specifying the taxonomy we intend to use (IPTC or IAB-QAG), and the second one specifying the subject category (or categories) that we want stories about.

 

The entities parameter allows you to search for stories that mention particular entities. Entities differ from keywords in that they have been analyzed and disambiguated by the News API.

 

For example, searching for the ‘apple’ keyword will return mentions of both the fruit and the technology company, so the results will include more and more irrelevant stories as we scale up our searches. Searching for the entity “http://dbpedia.org/resource/Apple_Inc.” will return only stories that talk about the company.

 

The sentiment parameter allows us to return only stories that talk in a positive, negative, or neutral tone. We can also specify stories that are subjective or objective.

 

Here is a script that uses these parameters to search for stories that are negative in tone, are classified as being about “disasters” in the IPTC taxonomy, and mention any of the entities Russia, Ford, or the Pacific Ocean.

Instead of printing the entire JSON object, we’re just going to print each story’s title and link:


import aylien_news_api
from aylien_news_api.rest import ApiException

aylien_news_api.configuration.api_key['X-AYLIEN-NewsAPI-Application-ID'] = 'YOUR_APP_ID'
aylien_news_api.configuration.api_key['X-AYLIEN-NewsAPI-Application-Key'] = 'YOUR_APP_KEY'
api_instance = aylien_news_api.DefaultApi()

opts = {
    'sort_by': 'recency',
    'language': ['en'],
    ## declare what taxonomy you want to use
    'categories_taxonomy': 'iptc-subjectcode',
    ## provide the id for the "disasters" category
    'categories_id': ['03015000'],
    ## list the entities that you are interested in
    'entities_body_links_dbpedia': [
        'http://dbpedia.org/resource/Russia',
        'http://dbpedia.org/resource/Ford_Motor_Company',
        'http://dbpedia.org/resource/Pacific_Ocean'
    ],
    ## define the sentiment
    'sentiment_body_polarity': 'negative',
}

try:
    api_response = api_instance.list_stories(**opts)
    for story in api_response.stories:
        print(story.title, story.links.permalink)

except ApiException as e:
    print('Exception when calling DefaultApi->list_stories: %s\n' % e)

If you want to search for different entities, the DBpedia URI is usually the last segment of the entity’s Wikipedia URL appended to “http://dbpedia.org/resource/”. You can also see the full list of categories that the News API classifies stories by here.
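That rule of thumb is easy to turn into a few lines of Python. The sketch below uses a hypothetical helper, `dbpedia_uri`, which is not part of the SDK. It only rewrites the URL and does not check that the resulting DBpedia resource actually exists.

```python
from urllib.parse import urlparse

DBPEDIA_PREFIX = 'http://dbpedia.org/resource/'


def dbpedia_uri(wikipedia_url):
    """Build a DBpedia URI from the last path segment of a Wikipedia URL."""
    path = urlparse(wikipedia_url).path
    title = path.rstrip('/').rsplit('/', 1)[-1]
    return DBPEDIA_PREFIX + title


print(dbpedia_uri('https://en.wikipedia.org/wiki/Apple_Inc.'))
# http://dbpedia.org/resource/Apple_Inc.
```

Since the rule only usually holds, it is worth opening the generated URI in a browser before relying on it in a query.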

 

3. Localized search: source locations and popularity

So by this stage you’ve already made queries using basic parameters like keywords and dates as well as parameters that leverage NLP like topic category and entities that are mentioned.

 

Using the News API, we can also search for stories by leveraging information about the source that publishes the story. Of the tens of thousands of sources that the News API monitors, we can select those that are based in or write about specific countries or cities, and we can also filter sources by how popular the publisher is.

 

The News API determines how popular a site is by finding its Alexa ranking. This is a system that ranks website domains by how much traffic they generate. For example, as the most popular website in the world, Google has an Alexa rank of 1, whereas a small news site might have a ranking of between 500,000 and 1,000,000.

 

Specifying sources that rank highly on Alexa (that is, with a low rank number, say between 1 and 10,000) will eliminate unpopular news websites that don’t generate a lot of traffic. You can also use the national Alexa ranking to specify sources that are popular in a specific country, which is a really useful way to find sources that are popular in a particular location.
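As a sketch of how these filters might be expressed as search operators, the hypothetical helper below builds a rank range and, optionally, a country for the national ranking. The parameter names source_rankings_alexa_rank_max and source_rankings_alexa_country are assumptions modeled on the rank_min parameter, so check the interactive documentation for the exact spelling.

```python
def popular_sources(max_rank, country=None):
    """Return search operators limiting sources to Alexa ranks 1 to max_rank."""
    opts = {
        'source_rankings_alexa_rank_min': 1,
        # Assumed parameter name, by analogy with rank_min.
        'source_rankings_alexa_rank_max': max_rank,
    }
    if country is not None:
        # Assumed parameter name: rank within this country, not globally.
        opts['source_rankings_alexa_country'] = [country]
    return opts


# Sources ranked in the global top 10,000
print(popular_sources(10000))
```

You can merge the result into your other search operators, for example with dict(opts, **popular_sources(10000)).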

 

The following script adds in parameters to specify stories that were published by sources in the United States that are among the 1,000 most popular websites in the world:


import aylien_news_api
from aylien_news_api.rest import ApiException

aylien_news_api.configuration.api_key['X-AYLIEN-NewsAPI-Application-ID'] = 'YOUR_APP_ID'
aylien_news_api.configuration.api_key['X-AYLIEN-NewsAPI-Application-Key'] = 'YOUR_APP_KEY'
api_instance = aylien_news_api.DefaultApi()

opts = {
    'title': 'Trump',
    'language': ['en'],
    'published_at_start': 'NOW-14DAYS',
    ## keep only sources ranked 1,000 or better (a lower number means more popular)
    'source_rankings_alexa_rank_max': 1000,
    'source_locations_country': ['US'],
}

try:
    api_response = api_instance.list_stories(**opts)
    for story in api_response.stories:
        print(story.title, story.source)

except ApiException as e:
    print('Exception when calling DefaultApi->list_stories: %s\n' % e)

 

What’s next?

These three scripts should give you enough understanding of the News API for you to make some granular queries about whatever interests you. From this point, we recommend looking into the Interactive Documentation for a full list of the query parameters you can leverage, the SDK guides, or our other blog posts that show the News API in action.

 

If you haven’t signed up to the News API yet, you can start a free, 14-day trial and get access to news content enriched by Natural Language Processing.




