News API Feature Update: Image Metadata and New Social Filters
With our News API, our goal is to make the world’s news content as easy to query as a database. We also use machine learning to process, normalize and analyze this content, giving you access to rich, high-quality metadata and powerful filtering capabilities that help you find the needle in the haystack.
To this end, we have just launched two new features: filtering stories by their image metadata, and setting range queries on social media share counts. Both features are also available in our News API SDKs, and you can read more about them below.
Image metadata filters
News content published online is increasingly multimodal, to the point that it is rare to find an article or a blog post that doesn’t include an image or a video. Our News API stats show that 83% of all the articles in our index contain at least one image.
Therefore, it is important to be able to search and filter stories not just based on their textual content, but also based on their images.
To facilitate this, we now analyze each extracted image of each news article to capture its size (width and height), format and content length. Additionally, we have introduced 7 new parameters for filtering stories based on these attributes:
- media.images.width.min: minimum image width (in pixels)
- media.images.width.max: maximum image width (in pixels)
- media.images.height.min: minimum image height (in pixels)
- media.images.height.max: maximum image height (in pixels)
- media.images.content_length.min: minimum image content size (in bytes)
- media.images.content_length.max: maximum image content size (in bytes)
- media.images.format: image format (possible values are: JPEG, PNG, GIF, SVG, ICO, TIFF, CUR, WEBP and BMP).
As an example, these parameters can be used to retrieve stories about golf that include an image in JPEG or PNG format larger than 80 KB, by filtering on media.images.format and setting media.images.content_length.min to the size in bytes.
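A minimal sketch of how such a request URL could be assembled with Python's standard library. The base URL is a placeholder, and both the `text` keyword parameter and the use of a repeated `media.images.format` key for multiple formats are assumptions made for illustration; only the `media.images.*` parameter names come from the list above. Check the API documentation for the actual endpoint, authentication and multi-value syntax.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; replace with the stories endpoint from the API docs.
BASE_URL = "https://api.example.com/stories"


def build_image_filter_url(keyword, formats, min_content_length):
    """Return a stories URL filtered by image format and minimum image size."""
    params = {
        "text": keyword,  # assumed name of the keyword-search parameter
        "media.images.format": list(formats),  # assumed: repeated once per format
        "media.images.content_length.min": min_content_length,  # bytes
    }
    # doseq=True expands the list into one key=value pair per format.
    return f"{BASE_URL}?{urlencode(params, doseq=True)}"


# 80 KB is taken here as 80,000 bytes; use 80 * 1024 if binary units are meant.
url = build_image_filter_url("Golf", ["JPEG", "PNG"], 80_000)
print(url)
```

The function only builds the URL, so it can be passed to whichever HTTP client or SDK you already use.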
[Image: an example image returned by the search query above]
Social range filters
One of the most popular features of our News API is its ability to sort stories based on how many times they have been shared on social media. However, if you use this to retrieve popular stories over a long period of time, you will sometimes notice that a few extremely popular stories (those that have been shared hundreds of thousands of times) dominate the top of the results, making it hard to reach the long tail of interesting and popular stories.
To address this, we have introduced the following 8 new parameters that allow you to set range (i.e. minimum and maximum) filters on social media share counts:
- social_shares_count.facebook.min: minimum number of Facebook shares
- social_shares_count.facebook.max: maximum number of Facebook shares
- social_shares_count.google_plus.min: minimum number of Google+ shares
- social_shares_count.google_plus.max: maximum number of Google+ shares
- social_shares_count.linkedin.min: minimum number of LinkedIn shares
- social_shares_count.linkedin.max: maximum number of LinkedIn shares
- social_shares_count.reddit.min: minimum number of Reddit shares
- social_shares_count.reddit.max: maximum number of Reddit shares
For example, to retrieve all stories that mention Donald Trump and have been shared between 50 and 500 times on Facebook, set social_shares_count.facebook.min to 50 and social_shares_count.facebook.max to 500.
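As a rough sketch, the range filter for this example could be built as follows. The base URL is a placeholder and the `text` keyword parameter is an assumption; the `social_shares_count.*` names are the ones listed above. The helper `build_share_range_url` is hypothetical and not part of any SDK.

```python
from urllib.parse import urlencode

# Hypothetical placeholder; replace with the stories endpoint from the API docs.
BASE_URL = "https://api.example.com/stories"


def build_share_range_url(keyword, network, min_shares, max_shares):
    """Return a stories URL restricted to a share-count range on one network."""
    if min_shares > max_shares:
        raise ValueError("min_shares must not exceed max_shares")
    prefix = f"social_shares_count.{network}"
    params = {
        "text": keyword,  # assumed name of the keyword-search parameter
        f"{prefix}.min": min_shares,
        f"{prefix}.max": max_shares,
    }
    return f"{BASE_URL}?{urlencode(params)}"


share_url = build_share_range_url("Donald Trump", "facebook", 50, 500)
print(share_url)
```

Checking that the minimum does not exceed the maximum before sending the request avoids a round trip for a range that can never match.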
These filters are now available across all our News API SDKs. We hope that you find these new updates useful, and we would love to hear any feedback you may have.
To start using our News API for free and query the world’s news content easily, click here.