

We discussed what semantic search is in our previous post, “What is Semantic Search?”

In this blog, we’re going to show how you can put semantic search at the forefront of your SEO strategy and stay in line with Google’s recent push towards contextual, intent-based search results.

We’ve spoken about updates like Panda and Hummingbird and highlighted how important semantics is in modern SEO strategies. We also explained how search engines are looking past exact keyword matching on pages and providing more value to end users through more conceptual and contextual results.

While the focus has moved away from exact keyword matching, keywords are still a pivotal part of SEO and content strategies. The concept of a keyword has changed somewhat, though: search strings are now more conversational, they tend to be of the long-tail variety and they are often rich in context.

So, how can we optimize for Semantic Search?

  1. Understand searcher intent
  2. Create quality content that answers questions and delights readers
  3. Build authority around topics

Traditionally, keyword research involved building a list or database of relevant keywords that we hoped to rank for. Keywords were often graded by difficulty score, click-through rate and search volume, and the research was about finding candidates in this list to create content around and gather some organic traffic through exact matching.

While this method of keyword research is still relevant, the landscape has changed. A search phrase like “Machine Learning” can have multiple meanings and varied contexts:

  • Machine Learning guide
  • What is Machine Learning?
  • Machine Learning in the enterprise
  • Machine Learning Algorithms

And that’s just focusing on search terms that contain the string “Machine Learning”. What about terms like “support vector machines”, which are very much related to Machine Learning but don’t explicitly contain the search term? These are important terms that we also want to rank for.

Semantic search means these types of terms, long-tail keywords and related search terms, are becoming more and more powerful and important in a modern SEO strategy.

With semantic search in mind, it’s important to build more meaningful keyword lists or databases that are rich in context and take the searcher’s intent into account.

How do we build this semantically rich keyword list?

The first thing you need to do is start thinking in terms of personas. Try to figure out what your target user, who is interested in “Machine Learning”, is searching for on Google, because they’re not always going to search for “Machine Learning”. We’re not just gathering a long list of keywords; we want contextually rich keywords that are relevant to our topic.

The second step is to identify your core topics and concepts. Think of these as the root of all the keywords in your list or database. Move away from “Machine Learning” as a keyword and think of it as a topic.

What we are trying to do is figure out which keywords, phrases and search terms are relevant to our topic, and where else we should go to find related concepts, such as other high-performing content relevant to our topic.


Tapping into popular content to source target keywords and concepts related to your business is an excellent way to start building your semantic keyword list. In this case, I’ve used a tool called Buzzsumo to curate relevant, high-performing content related to Machine Learning.


Now that we have our content, how do we go about mining it for relevant concepts?

Automated Text Analysis and Natural Language Processing can provide tremendous insight when it comes to building keyword lists. They can be used to simply extract keywords and entities and build a simple keyword list based on occurrences, or you can go a level deeper by automatically extracting concepts and topics from the same content.
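
To make the simpler of those two approaches concrete, here is a minimal sketch of a keyword list built purely from occurrences. It is plain JavaScript and does not call the AYLIEN API; the `topKeywords` helper, the tiny stop-word list and the sample text are all assumptions made up for this illustration.

```javascript
// Hypothetical helper: counts word occurrences and returns the most frequent terms.
// The short stop-word list is an assumption made purely for this sketch.
const STOP_WORDS = new Set(['a', 'an', 'and', 'the', 'is', 'are', 'of', 'to', 'in', 'for', 'on', 'with']);

function topKeywords(text, limit) {
  const counts = new Map();
  // Lowercase the text and pull out runs of letters (apostrophes allowed).
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  for (const word of words) {
    // Skip stop words and very short tokens.
    if (STOP_WORDS.has(word) || word.length < 3) continue;
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  // Sort by count (descending), then alphabetically so the order is stable.
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit)
    .map(([word, count]) => ({ word, count }));
}

const sample = 'Machine learning algorithms learn from data. ' +
  'Deep learning is a branch of machine learning.';
console.log(topKeywords(sample, 3));
```

A list like this only tells you which strings appear most often, which is exactly why extracting concepts and topics is the more useful step for semantic keyword research.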

In the example below, we exported the top 30 articles and blogs from Buzzsumo and analyzed them in a spreadsheet with the AYLIEN Text Analysis add-on, extracting the concepts mentioned in order to start building a semantically rich keyword list.



The same could be done to extract keywords from articles using the Entity Extraction endpoint, but in this case we’re more focused on concepts and topics so we chose 5 of the most prominent concepts from which to build our semantic list of keywords.

Semantically Related Keywords

The idea here is to gather a list of keywords that may be very relevant to our overall topic but may not explicitly mention it. Think about the “support vector machines” example again.

We can do this by using AYLIEN’s Related Phrases endpoint. This endpoint gives you a list of words or phrases that are semantically related to your input. You can generate this list by analyzing your initial list of keywords or, even better, by using a handful of the most relevant concepts you grabbed from the curated content links, as we’ve done below.



Our goal at the beginning was to enrich our keyword research process with more context-focused and semantically related keywords on which to build a content or SEO strategy.

Using Text Analysis techniques and focusing on topics and concepts, we’ve generated a list of semantically related keywords that we know are highly relevant to our overall topic, “Machine Learning”, and that are based on concepts rather than keyword matching. With Google now focused on understanding searches and looking beyond keyword matching, being aware of your target topics, concepts and related keywords in your SEO and content strategy is a sure-fire way of holding your positions or moving up the ladder.

You can access the APIs mentioned through our spreadsheet add-on or directly via our Text Analysis API.


Text Analysis API - Sign up



This is the first in a two-part series on “The importance of Semantics in search”. In this blog, we’ll focus mainly on what Semantic Search is and why it’s so important to be aware of today.

How is search changing?

We’ve become increasingly reliant on search engines and we certainly demand a lot of them. We’re searching from multiple devices, we’re using speech recognition apps, we search from different locations and we expect accurate results that answer our query. We’re not just keyword searching anymore, we’re essentially asking questions. We expect search engines to understand our searches and come back with accurate results that answer our question.

SERPs have seen a definitive move by users away from keyword-focused searches towards more contextual searches based around entities and concepts, not just a string of words.

So which came first? The chicken or the egg?

Google in particular has had user experience and user understanding at the forefront of its mission since the beginning. It has invested heavily in providing more accurate, context-aware search, especially within the last 2 or 3 years, with algorithm updates like Penguin and Hummingbird. Google recognised how its users were conducting searches and how their usage had changed, and it adapted accordingly.


First came Penguin, an algorithm update aimed at cutting out “black hat SEO tactics” like keyword stuffing and low-value or spammy backlinks. Penguin meant that it was no longer OK to concentrate solely on keyword-focused marketing strategies and volume link building.

Penguin meant SEOs and marketers had to conduct more context-based keyword research, focusing on long-tail searches or phrases, and they needed to write content that answered questions, building sites for humans, not robots. This was one of the first indicators that Google was moving towards a more context-based search experience.


Hummingbird differed from Penguin in that it was essentially an algorithm rewrite. With this semantics-focused update, Google told the search world that it was now focusing its efforts on better understanding its users and their searches, moving further away from the presence of keywords in results and towards concepts.

How do they understand searches? Semantic search

Semantic Search is a data searching technique in which a search engine aims not only to find keywords but to determine the intent and contextual meaning of the words a person is using for search. (Techopedia)

Put simply, Semantic Search seeks to improve search accuracy by understanding the intent and context of the searcher in order to provide more relevant results. The intent is what the user is looking for, and the context is everything else that surrounds the search, such as location and device.

Semantics is now at the heart of Google’s search technology. Google relies heavily on semantic understanding, for example by searching for entities and concepts, and uses Natural Language Processing techniques like Word Sense Disambiguation in order to truly understand a user’s search.

We’ll look at a simple example to illustrate our point.


Let’s say I search for “jaguar” in Google. I get the results displayed below.



My search, in this case, was pretty minimal, with very little context to it, but Google worked hard to take a stab at what I meant and displayed results about Jaguar cars, including details of my nearest dealer, without displaying any results about the big cat. This was most likely based on my location and previous search history.

If, however, I add further context to my search, with a search term like “jaguar cubs”, Google can understand it better.




Because I provided more information, Google could easily understand my intent and even disambiguate my use of “jaguar” to mean the animal and not the car.

So while the value of keywords is decreasing and the job of an SEO is becoming more difficult, Semantic Search is most definitely improving user experience and search engine performance. It’s also rewarding those SEOs and marketers who create good quality content that answers search queries or questions.

Understanding how Semantic Search works is pivotal in leveraging it for better search performance. Our next edition will focus on how you can use NLP and Text Analysis to improve your chances of playing nice with Google and staying high in your rankings.




Our most recent feature addition to the API, Related Phrases, automatically provides you with a list of semantically similar phrases and words based on an input phrase or word provided by the user. Put simply, given a phrase made up of as many words as you like (unigram, bigram or n-gram), our API will return a list of words or phrases that are similar or related.

So how does it work?

Below is an example of Related Phrases in action, using “iPhone” as the example phrase. The screenshot from our demo below shows the returned results, along with a distance score. The distance score indicates how similar the context of the suggested words/phrases is to one that the source words/phrases might appear in, and the higher the distance score the more closely related they are.

In other words we build a profile/signature for each word based on its surrounding words in a corpus (English Wikipedia for instance). The profile we build indicates, among other things, how likely each word is to appear with any/all other words in the corpus.
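
The profile idea can be sketched in a few lines of JavaScript. This is a toy illustration only: it assumes a fixed context window and cosine similarity as the comparison, and the function names, the window size and the three-sentence corpus are made up for the example. We are not claiming the production endpoint is implemented this way.

```javascript
// Build a co-occurrence profile for every word: how often each other word
// appears within `windowSize` positions of it. (The window size is an assumption.)
function buildProfiles(sentences, windowSize) {
  const profiles = new Map();
  for (const sentence of sentences) {
    const words = sentence.toLowerCase().split(/\s+/).filter(Boolean);
    words.forEach((word, i) => {
      if (!profiles.has(word)) profiles.set(word, new Map());
      const profile = profiles.get(word);
      const start = Math.max(0, i - windowSize);
      const end = Math.min(words.length - 1, i + windowSize);
      for (let j = start; j <= end; j++) {
        if (j === i) continue; // a word is not its own context
        profile.set(words[j], (profile.get(words[j]) || 0) + 1);
      }
    });
  }
  return profiles;
}

// Cosine similarity between two sparse count vectors stored as Maps.
// Higher values mean the two words appear in more similar contexts.
function cosine(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (const [key, value] of a) {
    normA += value * value;
    if (b.has(key)) dot += value * b.get(key);
  }
  for (const value of b.values()) normB += value * value;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Made-up miniature corpus standing in for something like English Wikipedia.
const corpus = [
  'my new iphone has a great camera',
  'my new android has a great camera',
  'the wild jaguar has sharp teeth'
];
const profiles = buildProfiles(corpus, 2);
console.log('iphone vs android:', cosine(profiles.get('iphone'), profiles.get('android')));
console.log('iphone vs jaguar:', cosine(profiles.get('iphone'), profiles.get('jaguar')));
```

In this tiny corpus “iphone” and “android” share all of their context words, so they score higher than “iphone” and “jaguar”, which only share one. With a real corpus the profiles are far larger, but the comparison works on the same principle.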




So we’ve looked at a simple example using “iPhone” as our input, but we’ve also built a cool little sample app in our Sandbox which illustrates a neat little idea we had, of how the feature could be used.

The idea for the sample app was to take a story or even a news headline and change it up automatically without affecting its meaning, with the possibility of sometimes improving its wording. We do this by extracting Keywords and Entities from a sample string, then swapping out one of those keywords or entities and replacing it with a related word or phrase.

We managed to get some pretty interesting results, which we’ve displayed below.

Note: We’ve included the code below for you to copy and paste or even add to build something cooler. Just remember to swap out your App ID and App Key which you would have received when you registered with AYLIEN. If you haven’t registered, you can do so here.

Code Snippet:

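// Third-party modules: the AYLIEN SDK, Underscore, an HTTP client and an XML parser.
var AYLIENTextAPI = require('aylien_textapi'),
  _ = require('underscore'),
  request = require('request'),
  parseString = require('xml2js').parseString;

var textapi = new AYLIENTextAPI({
  application_id: 'YourApplicationId',
  application_key: 'YourApplicationKey'
});

// Escape regex metacharacters so the keyword is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// The feed URL is empty in this snippet; supply the RSS feed you want to use.
request('', function (error, response, body) {
  if (!error && response.statusCode == 200) {
    parseString(body, function(error, feed) {
      // Pick a random item from the feed and use its description as the story.
      var story = _.sample(feed.rss['channel'][0].item).description[0] || undefined;
      if (story) {
        textapi.entities(story, function(error, entities) {
          if (error) return console.error(error);
          // Take the first keyword returned for the story; stop if there is none.
          var keyword = entities.entities.keyword[0];
          if (!keyword) return;
          textapi.related(keyword, function(error, related) {
            if (error) return console.error(error);
            // Choose one of the top five related phrases at random.
            var random_phrase = _.sample(related.related.slice(0,5));
            console.log("- Story: " + story);
            console.log("- Selected keyword: " + keyword);
            console.log("- Related phrases: " + _(related.related.slice(0,5)).pluck('phrase').join(', '));
            console.log("- New Story: " + story.replace(new RegExp(escapeRegExp(keyword),"g"), '*'+random_phrase.phrase+'*'));
          });
        });
      }
    });
  }
});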

As you can see from the code, we’re grabbing a story from the BBC top stories feed and running the following analyses on it to determine which words to replace.


  • Entity Extraction
  • Keyword Extraction


Then we’re choosing the most prominent keyword and passing it through the Related Phrases endpoint to generate a list of related words and phrases. Choosing one of these at random, we swap it into the original “story” to essentially rewrite part of that article snippet, as illustrated in the example results below.

Sample App Results:

- Story: Climate change risks should be assessed in the same way as threats to national security according to a new report.
- Selected keyword: threats
- Related phrases: threat, threatening, potential threats, serious threats, threaten
- New Story: Climate change risks should be assessed in the same way as *serious threats* to national security according to a new report.
- Story: A human rights group publishes a video it says contradicts the account of an Israeli army officer who shot dead a Palestinian teenager a week ago.
- Selected keyword: Israeli
- Related phrases: israel, idf, israelis, israel defense forces, palestinian
- New Story: A human rights group publishes a video it says contradicts the account of an *israel defense forces* army officer who shot dead a Palestinian teenager a week ago.
- Story: A city council is opposing a request to hand over an area of land for the site of a new free Sikh school in Derby.
- Selected keyword: land
- Related phrases: lands, large tracts, large tract, property, acres
- New Story: A city council is opposing a request to hand over an area of *property* for the site of a new free Sikh school in Derby.

Other Use Cases

Cool, huh? Even cooler than that, we wondered whether it would be possible to rewrite an entire passage or article by replacing keywords and entities with other related words and phrases, without losing meaning and readability. We haven’t got around to building that yet, but there’s free swag on offer for anyone who can do it! ;)
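
As a starting point for anyone tempted by the swag, here is a rough sketch of just the replacement step for several keywords at once. It is not a finished rewriter: the `rewrite` helper is hypothetical, the replacement map is hard-coded (in practice it would come from Related Phrases results), and the word “proposal” is our own made-up stand-in rather than real API output. It also assumes the keywords are plain words, since it relies on word boundaries.

```javascript
// Escape regex metacharacters so keywords are matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: replaces every whole-word occurrence of each keyword
// with its chosen related phrase, in a single pass over the text.
function rewrite(text, replacements) {
  // Longest keywords first, so a longer phrase wins over a shorter one it contains.
  const keys = Object.keys(replacements).sort((a, b) => b.length - a.length);
  if (keys.length === 0) return text;
  const pattern = new RegExp('\\b(' + keys.map(escapeRegExp).join('|') + ')\\b', 'g');
  // One pass means text that has already been replaced is never replaced again.
  return text.replace(pattern, (match) => replacements[match]);
}

const story = 'A city council is opposing a request to hand over an area of land ' +
  'for the site of a new free Sikh school in Derby.';
// Assumed replacement map: "property" appears in the sample results,
// "proposal" is invented for this illustration.
const rewritten = rewrite(story, { land: 'property', request: 'proposal' });
console.log(rewritten);
```

The hard part, of course, is choosing replacements that keep the passage readable, which this sketch does not attempt.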

There are some pretty cool and quirky ideas out there for how this endpoint could be used. One of the most useful ideas we’ve had is to use the Related Phrases endpoint as part of the keyword research process in SEO/search marketing. It proves very useful in generating long-tail phrases and semantically related search terms, which are becoming more and more important in Google’s eyes as it moves away from keyword spotting. Keep an eye out for a blog on that very use case in the coming days.

Related Phrases is now live in production and can be used by all of our API subscribers, both free and paid. If you want to try it out right now check out our demo and sandbox or read about it in our docs.




This is the second edition of our NLP terms explained blog posts. The first edition deals with some simple terms and NLP tasks, while this edition gets a little bit more complicated. Again, we’ve just chosen some common terms at random and tried to break them down in simple English to make them a bit easier to understand.

Part of Speech tagging (POS tagging)

Sometimes referred to as grammatical tagging or word-category disambiguation, part of speech tagging refers to the process of determining the part of speech for each word in a given sentence based on the definition of that word and its context. Many words, especially common ones, can serve as multiple parts of speech. For example, “book” can be a noun (“the book on the table”) or verb (“to book a flight”).
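
To see how context can settle the tag for an ambiguous word like “book”, here is a deliberately tiny, rule-based sketch. The lexicon, the tag names and the two context rules are assumptions invented for this example; real taggers are typically statistical models trained on annotated text rather than hand-written rules like these.

```javascript
// Toy lexicon: each word maps to the parts of speech it can take.
// Both the lexicon and the tag names are assumptions for this sketch.
const LEXICON = new Map([
  ['the', ['DET']], ['a', ['DET']], ['to', ['PART']], ['on', ['ADP']],
  ['table', ['NOUN']], ['flight', ['NOUN']],
  ['book', ['NOUN', 'VERB']] // ambiguous: needs context to decide
]);

function tagSentence(sentence) {
  const words = sentence.toLowerCase().split(/\s+/).filter(Boolean);
  const tags = [];
  words.forEach((word, i) => {
    const candidates = LEXICON.get(word) || ['NOUN']; // unknown words default to NOUN
    let tag = candidates[0];
    if (candidates.length > 1) {
      const previous = tags[i - 1];
      // Context rules: after "to" prefer a verb, after a determiner prefer a noun.
      if (previous === 'PART' && candidates.includes('VERB')) tag = 'VERB';
      else if (previous === 'DET' && candidates.includes('NOUN')) tag = 'NOUN';
    }
    tags.push(tag);
  });
  return words.map((word, i) => word + '/' + tags[i]);
}

console.log(tagSentence('the book on the table').join(' '));
console.log(tagSentence('to book a flight').join(' '));
```

The same word gets a different tag in each sentence purely because of the word that precedes it, which is the essence of the task.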


Parsing

Parsing is a major task of NLP. It’s focused on determining the grammatical analysis, or parse tree, of a given sentence. There are two forms of parse tree: constituency-based and dependency-based.

Semantic Role Labeling

This is an important step towards making sense of the meaning of a sentence. It focuses on detecting the semantic arguments associated with a verb or verbs in a sentence and classifying those arguments into specific roles.

Machine Translation

A sub-field of computational linguistics, MT investigates the use of software to translate text or speech from one language to another.

Statistical Machine Translation

SMT is one of a few different approaches to Machine Translation. A common task in NLP, it relies on statistical methods based on bilingual corpora such as the Canadian Hansard corpus. Other approaches to Machine Translation include Rule-Based Translation and Example-Based Translation.

Bayesian Classification

Bayesian classification is a classification method based on Bayes’ Theorem and is commonly used in Machine Learning and Natural Language Processing to classify text and documents. You can read more about it in Naive Bayes for Dummies.
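
For a feel of how this works on text, here is a minimal sketch of a Naive Bayes text classifier. It assumes the multinomial variant with add-one (Laplace) smoothing, which is one common choice among several, and the function names and the four training sentences are made up for the illustration.

```javascript
// Split text into lowercase words.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z]+/g) || [];
}

// Count documents and words per label.
function train(examples) {
  const model = { labels: new Map(), vocab: new Set(), totalDocs: 0 };
  for (const { text, label } of examples) {
    if (!model.labels.has(label)) {
      model.labels.set(label, { docs: 0, words: new Map(), total: 0 });
    }
    const stats = model.labels.get(label);
    stats.docs += 1;
    model.totalDocs += 1;
    for (const word of tokenize(text)) {
      model.vocab.add(word);
      stats.words.set(word, (stats.words.get(word) || 0) + 1);
      stats.total += 1;
    }
  }
  return model;
}

// Pick the label with the highest log P(label) + sum of log P(word | label).
function classify(model, text) {
  const words = tokenize(text);
  let bestLabel = null;
  let bestScore = -Infinity;
  for (const [label, stats] of model.labels) {
    // Work in log space to avoid multiplying many tiny probabilities.
    let score = Math.log(stats.docs / model.totalDocs);
    for (const word of words) {
      const count = stats.words.get(word) || 0;
      // Add-one smoothing so unseen words do not zero out the whole score.
      score += Math.log((count + 1) / (stats.total + model.vocab.size));
    }
    if (score > bestScore) { bestScore = score; bestLabel = label; }
  }
  return bestLabel;
}

// Made-up training data for the sketch.
const model = train([
  { text: 'the team won the match', label: 'sport' },
  { text: 'the striker scored a late goal', label: 'sport' },
  { text: 'the new phone has a faster chip', label: 'tech' },
  { text: 'the update fixes a software bug', label: 'tech' }
]);
console.log(classify(model, 'a late goal won the match'));
console.log(classify(model, 'software update for the phone'));
```

The “naive” part is the assumption that words are independent of each other given the label, which is rarely true of real language but often works well enough in practice.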

Hidden Markov Model (HMM)

In order to understand an HMM we need to define a Markov Model. This is used to model randomly changing systems where it is assumed that future states only depend on the present state and not on the sequence of events that happened before it.

An HMM is a Markov model where the system being modeled is assumed to have unobserved or hidden states. There are a number of common algorithms used for hidden Markov models. The Viterbi algorithm, for example, computes the most likely sequence of hidden states for a given sequence of observations, while the forward algorithm computes the probability of the sequence of observations. Both are often used in NLP applications.

In hidden Markov models, the state is not directly visible, but output, dependent on the state, is visible. Each state has a probability distribution over the possible output tokens. Therefore the sequence of tokens generated by an HMM gives some information about the sequence of states.
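
Both algorithms are short enough to sketch. The example below uses a made-up two-state model (a patient who is either “Healthy” or has a “Fever”, observed only through how they feel); the states, observations and every probability are assumptions chosen for illustration. The sketch works with plain probabilities and assumes a non-empty observation sequence, so longer sequences would need log probabilities to avoid underflow.

```javascript
// Toy HMM: all states, observations and probabilities are illustrative assumptions.
const hmm = {
  states: ['Healthy', 'Fever'],
  start: { Healthy: 0.6, Fever: 0.4 },
  transition: {
    Healthy: { Healthy: 0.7, Fever: 0.3 },
    Fever: { Healthy: 0.4, Fever: 0.6 }
  },
  emission: {
    Healthy: { normal: 0.5, cold: 0.4, dizzy: 0.1 },
    Fever: { normal: 0.1, cold: 0.3, dizzy: 0.6 }
  }
};

// Viterbi: the most likely sequence of hidden states for the observations.
function viterbi(model, observations) {
  // best[t][s] is the probability of the best path that ends in state s at time t.
  const best = [{}];
  const back = [{}];
  for (const s of model.states) {
    best[0][s] = model.start[s] * model.emission[s][observations[0]];
    back[0][s] = null;
  }
  for (let t = 1; t < observations.length; t++) {
    best[t] = {};
    back[t] = {};
    for (const s of model.states) {
      let bestPrev = model.states[0];
      let bestProb = -1;
      for (const p of model.states) {
        const prob = best[t - 1][p] * model.transition[p][s];
        if (prob > bestProb) { bestProb = prob; bestPrev = p; }
      }
      best[t][s] = bestProb * model.emission[s][observations[t]];
      back[t][s] = bestPrev; // remember where the best path came from
    }
  }
  // Start from the most probable final state and follow the back-pointers.
  const last = observations.length - 1;
  let state = model.states.reduce((a, b) => (best[last][a] >= best[last][b] ? a : b));
  const path = [state];
  for (let t = last; t > 0; t--) {
    state = back[t][state];
    path.unshift(state);
  }
  return path;
}

// Forward algorithm: the total probability of the observation sequence.
function forward(model, observations) {
  let alpha = {};
  for (const s of model.states) {
    alpha[s] = model.start[s] * model.emission[s][observations[0]];
  }
  for (let t = 1; t < observations.length; t++) {
    const next = {};
    for (const s of model.states) {
      let sum = 0;
      for (const p of model.states) sum += alpha[p] * model.transition[p][s];
      next[s] = sum * model.emission[s][observations[t]];
    }
    alpha = next;
  }
  return model.states.reduce((total, s) => total + alpha[s], 0);
}

const observed = ['normal', 'cold', 'dizzy'];
console.log(viterbi(hmm, observed).join(' -> '));
console.log(forward(hmm, observed));
```

The two functions differ by a single operation: Viterbi keeps the maximum over previous states, while the forward algorithm sums over them.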

Conditional Random Fields (CRFs)

CRFs are a class of statistical modeling methods that are often applied in pattern recognition and machine learning, where they are used for structured prediction. Ordinary classifiers predict labels for a sample without taking neighboring samples into account; a CRF model, however, takes context into account. CRFs are commonly used in NLP (e.g. in Named Entity Extraction) and more recently in image recognition.

Affinity Propagation (AP)

AP is a clustering algorithm commonly used in Data Mining. Unlike other clustering algorithms such as k-means, AP does not require the number of clusters to be estimated before running the algorithm. A semi-supervised version of AP is commonly used in NLP.

Relationship extraction

Given a chunk of words or a piece of text, relationship extraction is the task of determining the relationships between the named entities it mentions.




As you may already know, we recently joined up with Blockspring to provide access to our Text Analysis API service in their functions library. To showcase what Blockspring and AYLIEN Text Analysis API can do we’ll be sharing some interesting API mashups we’ve built, that we think you’ll find useful.

For this edition we’re going to show you how to build an Automated Content Curation Tool inside a spreadsheet with little or no programming experience.

What you’ll need for this mashup:

Step 1. Get set up

Open a blank spreadsheet, start your Blockspring module and sign in using your Blockspring credentials. (It’s located in the add-ons section of the toolbar.)

Step 2. Gather your articles

In the Blockspring console (sidebar) there will be a list of suggested functions and an option to browse or search functions. The first function we’re going to use is the “Get recent news about a topic” function. This function utilises the Google News API to pull in recent articles based on a query you suggest.

Once you’ve chosen this function, select the “Insert into a new sheet” button. This will open a new sheet loaded with a sample function. You’ll notice that the function has an example query pre-loaded; replace the sample query with your own search query and hit enter.

In this example we’re going to monitor news about Machine Learning, Deep Learning and Text Analysis, using the following query:

“machine learning” OR “deep learning” OR “Text Analysis”

We’re also going to change the number of articles we want to pull from the Google News API. When you load the function it automatically pulls in 5 articles. We’ve changed it to 30 to try and get some more coverage.

So now we have our articles listed in our spreadsheet, but it would take some time to read every one, or to scan through each URL, in order to decide what is relevant, what we might share or what might be about competitors that we should be aware of.

Step 3. Summarize the content

To summarize the articles, we’re going to analyze each URL with the AYLIEN API. Choose the “Summarize URL with AYLIEN” function in your Blockspring console and hit insert into selected cell.

The function will have an example URL in it; you need to replace this with the target cell you want to analyze. In our case it was B13. Here’s what your new function should look like:

=BLOCKSPRING("summarize-url-with-aylien", "url", b13, "max_sentences", 3)

Notice we’ve also changed the number of sentences you want the article summarized to. We’ve decided on 3 sentences but that’s totally up to you.

Once you’ve hit enter, the function will run and populate the results automatically. You can reuse the same function for the rest of the articles by dragging the function down through the remaining cells.

Step 4. Get Hashtag suggestions for each article

When sharing content, we’ve found it very useful to include 2 or 3 hashtags to get maximum exposure. AYLIEN automatically suggests these for you when you run the Hashtag Suggestion function.

As before, choose “Suggest Hashtags for URL with Aylien” in the Blockspring console and select insert into selected cell. Again, the function will be loaded with a sample URL; replace this with your target cell and hit enter.

You’ll notice that the suggested hashtags are populated vertically through the cells below, which kind of messes up our spreadsheet. Use the “transpose” function to populate them horizontally. Your new formula should look like this:

=TRANSPOSE((BLOCKSPRING("suggest-hashtags-for-url-with-aylien", "url", B13)))

As we’ve done before drag this down through the rest of the cells to get suggestions for every article.

With a little bit of cosmetic work you can clean your sheet and make it look a little prettier. Try wrapping the text for your Summaries and Titles and add some headings to a frozen row to make it a bit easier on the eye.

Every time you open the spreadsheet it will update automatically and curate useful and shareable content for you based on topics you want covered. We find it particularly useful to check it each morning to stay on top of important content. You can also utilize more of the AYLIEN functions like Entity Extraction and Concept Extraction to look for mentions of companies and people for example.

Isn’t that freakin’ cool?

We’d love to hear about some of the mashups you’re building with AYLIEN on Blockspring. Let us know if you’ve hacked together something useful recently. We might even feature it on our blog!
