
Introduction

Our founder, Parsa Ghaffari, gave a talk recently on Natural Language Processing and Sentiment Analysis at the Science Gallery in Dublin. As part of the talk, he put together a nice little example of how you can transform your Google Spreadsheet into a powerful Text Analysis and Data Mining tool.

In this case, he took a simple example of analyzing restaurant reviews from a popular review site, but the same could be done for hotels, products, service offerings and so on.

He wanted to show how easy it can be for data geeks, and even the less technical marketers among us, to start analyzing text and gathering business insight from the reams of textual data online today.

So what are we going to do?

All within a Google Spreadsheet we’re going to show you how to:

    • Extract reviews for restaurants from TripAdvisor
    • Analyze the sentiment of these reviews
    • Extract Keywords (positive and negative)
    • Correlate the Sentiment with the onsite TripAdvisor Ratings

 

Step 1. Extract the Reviews:

Create a new Google Sheet and add the URLs of the web pages that house the reviews you want to analyze.

Open a new sheet within the same document. This is where we’ll extract the reviews from the web page and get them ready for analysis.

There is a really simple formula you can use to extract the reviews automatically:

=IMPORTXML(cell_containing_tripadvisor_link, "//div[@class='below_the_fold']//p[@class='partial_entry']/text()")

The =IMPORTXML() formula takes a link and an XPath query and extracts pieces of information from the page behind the link. You can think of the XPath query as an address that points to where the content lies on the page.

Using this formula you can add all of the reviews extracted from the web page to your target column in your analysis sheet.

We’ve also used the same function to extract the TripAdvisor ratings from the web pages. These are star ratings out of 5 that are entered by the reviewer.

=IMPORTXML(cell_containing_tripadvisor_link, "//div[@id='REVIEWS']//img[contains(@class,'rating_s_fill')]/@alt")

 

Step 2. Analyze the Sentiment and extract keywords:

To start analyzing whether the reviews are positive or negative, you first need to start up your Text Analysis add-on. If it’s your first time using it, check out our tutorial here.

Once you have the add-on started, you can start analyzing the sentiment of the reviews in the first column. To do this we’re going to use an AYLIEN Text Analysis specific function:

=SentimentPolarity(cell_containing_review)

This will analyze the text in the target cell you choose and display the result: positive, negative or neutral. To run the same analysis on all the reviews, simply drag the formula down the column as far as you need.

With the polarity results, using a pivot table, we created a basic visualization, shown below, which gives a high-level insight into what percentage of customer reviews are positive or negative.
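For readers who prefer to see the arithmetic behind that pivot table, here is a minimal Node.js sketch of the same tally. The list of polarity labels is made-up sample data and the `polarityShare` helper is a hypothetical name, not part of the add-on.

```javascript
// Hypothetical polarity labels, one per review, as they would appear in the sentiment column.
const polarities = ["positive", "negative", "positive", "neutral", "positive"];

// Count how often each label occurs and convert the counts to percentages.
function polarityShare(labels) {
  const counts = {};
  for (const label of labels) {
    counts[label] = (counts[label] || 0) + 1;
  }
  const shares = {};
  for (const [label, count] of Object.entries(counts)) {
    shares[label] = (count * 100) / labels.length;
  }
  return shares;
}

console.log(polarityShare(polarities));
```

With the sample labels above, three out of five reviews are positive, so the positive share comes out at 60 percent.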

 

Step 3. Keywords:

We use another function from the Text Analysis add-on, similar to the sentiment one, to extract keywords from the text of the reviews:

=Keywords(cell_containing_review)

This will extract and display the main keywords from the text of each review. Like the last function, drag it down through the column to run the analysis on all the reviews.

The second thing we’re going to do with keywords, which is optional, is to attempt to understand which terms or words are used most often in positive or negative reviews. This should help us on our way to understanding what the good and bad aspects are for each review. Is it the food, the service or something else?

In order to extract the keywords with the highest association to each sentiment (positive or negative), we concatenated all the extracted keywords for each sentiment and then separated the individual keywords using a mix of the =JOIN() and =SPLIT() functions.

To display the results we then created a pivot table to find out how frequently a particular keyword has appeared in positive or negative reviews.
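The join, split and pivot steps can be sketched outside the spreadsheet as well. The Node.js snippet below is only an illustration: the rows are invented sample data, the keywords are assumed to be comma-separated, and `keywordCounts` is a hypothetical helper name.

```javascript
// Hypothetical rows: the polarity of each review and its comma-separated keywords.
const rows = [
  { polarity: "positive", keywords: "food, service, staff" },
  { polarity: "negative", keywords: "service, wait" },
  { polarity: "positive", keywords: "food, view" },
];

// Split each keyword list and count how often a keyword appears per polarity.
function keywordCounts(rows) {
  const counts = {};
  for (const { polarity, keywords } of rows) {
    counts[polarity] = counts[polarity] || {};
    for (const raw of keywords.split(",")) {
      const keyword = raw.trim().toLowerCase();
      if (keyword === "") continue; // skip empty entries left by stray commas
      counts[polarity][keyword] = (counts[polarity][keyword] || 0) + 1;
    }
  }
  return counts;
}

console.log(keywordCounts(rows));
```

Keywords are trimmed and lower-cased before counting so that “Food” and “food” end up in the same bucket, which is a design choice of this sketch rather than something the spreadsheet does for you.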

 

Step 4. Correlation:

We have given numerical values to both the sentiment score (Negative -1, Positive 1) and the star ratings (1-5) to make it possible to see if there is any correlation between the two.

To do this, create a new sheet called Correlation and pull in the values from columns D and F and lay them out as shown below.

To run a correlation analysis, choose the cells containing all of your data and in a new cell use the Correlation function below to get the Correlation Coefficient:

=CORREL(A2:A20, B2:B20)

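A positive correlation coefficient shows that the two series move in the same direction, and the closer it is to 1, the stronger the relationship. In our case the two series were moderately correlated, which is also evident when the two series are plotted.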
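If you are curious what =CORREL() computes, it is the Pearson correlation coefficient. The Node.js sketch below works through the calculation on two short, made-up series (sentiment scores of -1 or 1 and star ratings of 1 to 5); the `pearson` helper and the sample values are assumptions for illustration only.

```javascript
// Pearson correlation coefficient of two equally long numeric series.
function pearson(xs, ys) {
  if (xs.length !== ys.length || xs.length < 2) {
    throw new Error("Both series need the same length (at least 2 values).");
  }
  const mean = (values) => values.reduce((sum, v) => sum + v, 0) / values.length;
  const meanX = mean(xs);
  const meanY = mean(ys);
  let covariance = 0;
  let varianceX = 0;
  let varianceY = 0;
  for (let i = 0; i < xs.length; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    covariance += dx * dy;
    varianceX += dx * dx;
    varianceY += dy * dy;
  }
  return covariance / Math.sqrt(varianceX * varianceY);
}

// Hypothetical sample data: sentiment scores and the matching star ratings.
const sentiment = [1, 1, -1, 1, -1, -1];
const stars = [5, 4, 2, 3, 1, 4];
console.log(pearson(sentiment, stars).toFixed(2));
```

Note that if one of the series is constant (for example, every review is positive), the denominator is zero and the result is not a number, so the coefficient is only meaningful when both columns vary.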

Those are just a couple of examples of how you can transform your Google Spreadsheet into a powerful Data Mining and Text Analysis tool to extract insights that could be invaluable to a business owner or a brand.

 


Content marketing is a major part of any digital marketing strategy. Most of us are making a concerted effort to regularly create high-quality content that our target market values. Content marketing isn’t just about content creation; it’s also about staying on top of trends, analyzing what works and what doesn’t, and watching your competition closely to understand what they write about, what’s working for them and so on. Competitors’ blogs can be an excellent source of inspiration on what to do or even what not to do.

This blog will take you through how we keep a close eye on some of our major competitors.

We’ve hacked together a nice little workflow in a Google Spreadsheet that monitors and analyzes what type of content our competitors are putting out there. First off, we’re going to show you how we automatically extract blog URLs from a competitor’s site and populate them in a spreadsheet, then we’re going to show you what we do with them to extract some insight.

Setup

For this hack, you’re going to need the following:

  • 10 minutes
  • Google Spreadsheets
  • AYLIEN Text Analysis Add-on (Free to download)
  • Competitors to spy on

Step 1.

Create a Google Spreadsheet with two sheets. We’re going to use Sheet 1 to extract our URLs and Sheet 2 to analyze them.

So what exactly do we mean by analyze? We’re going to do the following:

  • Summarize their blog posts
  • Extract Concepts present in them
  • Generate Keywords from their content
  • Identify themes for each post

Step 2.

Add your competitor’s blog URL to the spreadsheet. You have two options for pulling the data into your spreadsheet.

Option 1. You can scrape the page with the following XML function, which works best for competitors with a central hub of blog content:

=IMPORTXML(cell containing content hub URL, "//a/@href")

Option 2. You can use their RSS feed from their homepage.

=IMPORTFEED(SPLIT(Feeds(cell containing homepage URL), ", "))

For this blog, we’re going to use the XML method, as one of our main competitors has all of their blog content nicely listed on the one page. This method will pull down every link on the page you specify and list them in a spreadsheet as shown below.

 

image

 

Some of the links will be interesting to you and some won’t. Just decide what you want to keep and discard the rest. In our case, we kept anything that was on the blog.competitor subdomain as the rest of the links are just standard web pages.
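If the list of links is long, the keep-or-discard step can also be automated. The following Node.js sketch keeps only the links on one host; the links and the host name blog.competitor.example are placeholders, and `keepHost` is a hypothetical helper name.

```javascript
// Hypothetical links as they might come back from the "//a/@href" query.
const links = [
  "http://blog.competitor.example/2015/03/some-post",
  "http://www.competitor.example/pricing",
  "/about",
  "http://blog.competitor.example/2015/02/another-post",
];

// Keep only absolute links whose host name matches the blog subdomain.
function keepHost(links, host) {
  return links.filter((link) => {
    try {
      return new URL(link).hostname === host;
    } catch (err) {
      return false; // relative or malformed links cannot be parsed on their own
    }
  });
}

console.log(keepHost(links, "blog.competitor.example"));
```

Relative links such as "/about" are dropped here because they carry no host name; if your competitor’s page uses relative blog links, you would need to resolve them against the page URL first.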

Step 3.

Copy all of the desired links or blog URLs into your second spreadsheet and set up the sheet as shown below.

 

image

 

Lay out the sheet however you want. We just labeled a few columns for each particular analysis we planned on doing.

Analyses

Now the fun starts… Open your Text Analysis add-on (download it here if you haven’t already). You’ll see it appear as a sidebar on the right-hand side of your spreadsheet. If you need help using the add-on, check out our tutorial page.

The first thing we’re gonna do is summarize all of the blog posts. To do this, you need to highlight all the cells you want to analyze. Choose the summarization feature in your TA add-on, choose the number of sentences you want and hit go.


Summaries

The summaries are a nice way of transforming the blogs into consumable chunks. You get to decide how long you want the summaries to be.

 

image

 

In the image above you can see the first blog post summarized into 3 sentences.

Note: When you summarize all your links, your sheet might look a little messy. Clean it up by wrapping the text in the summary column.

Concepts

Extracting Concepts is just as easy as summarizing the blog posts. Again, highlight the cells you want to analyze, choose concepts in the analysis type in your sidebar and hit analyze. You can decide whether you want the results generated in new columns or separated by a comma. It’s totally up to you.

Quick Tip:

If you want to do some more advanced analysis on your results, like creating graphs, for example, it might be easier to separate the results into new columns.

Keywords

Extracting Keywords is a little different. To grab a list of keywords from the blog, we’re going to use an AYLIEN spreadsheet formula (you can see all formulas here in our cheat sheet). The formula we’re gonna use is really simple.

Formula:

=keywords(cell containing URL)

Type the formula into a desired cell and hit enter to extract the keywords from each blog post.

 

image

 

You’ll be left with a sheet that looks like the one below. It lists the URL for quick access to the blog post. It has a brief summary of each post to make it easier to read. Concepts help to track what they’re writing about and the Keywords give you an understanding of what long and short tail keywords they could be ranking for.

 

image

 

Because the data is in a spreadsheet it’s really easy to slice and dice it however you want. You can also get a high-level view of themes and topics throughout a blog by running concept extraction on the keywords you extracted. This gives a clean quick-look reference of the general topics your competitor covers on their blog.

At AYLIEN, we’ve found this hack really useful for keeping a close eye on competitors. We use it for monitoring competitor keyword strategies, carrying out our own keyword brainstorming and competitor intelligence as a whole.










Just this week, we added Image Tagging capabilities to our API. The result of a strategic partnership with Imagga, our Image Tagging endpoint is the first step in providing a hybrid Text and Image analysis API.

So how does it work?

Our Image Tagging endpoint uses advanced image recognition and deep learning technology to recognize and identify objects in an image. From a dataset of over 6,000 objects, it then suggests candidate tags for that image along with confidence scores.

The Image Tagging feature automates image annotation, categorization and tagging, a task that is often time-consuming and laborious. It’s designed to reduce the heavy lifting involved in dealing with images.

The quickest way to get up and running with this endpoint is to grab an SDK of your choice and check out our detailed documentation.

Analyzing and Tagging images

For this blog post, we’re going to run through a couple of simple examples to showcase the image tagging endpoint using Node.js.

As a simple example, let’s analyze the following image and see what type of tags the API suggests for it. It’s pretty clear to the human eye that it’s a house, but what exactly will the API actually recognize in the image? Let’s find out!

image

Image: http://suburbanfinance.com/wp-content/uploads/2013/04/streetinfo.jpg

Code:


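// Set up the client once; replace the placeholders with your own credentials.
var AYLIENTextAPI = require('aylien_textapi');
var textapi = new AYLIENTextAPI({
    application_id: 'YOUR_APP_ID',
    application_key: 'YOUR_APP_KEY'
});

// Request tags for the image and print each tag with its confidence score.
textapi.imageTags(
    'http://suburbanfinance.com/wp-content/uploads/2013/04/streetinfo.jpg',
    function(err, response, ratelimits) {
        if (err !== null) {
            console.log("Error: " + err);
        } else {
            response.tags.forEach(function(t) {
                console.log("Tag : ", t.tag + " , " + "Confidence : ", t.confidence);
            });
        }
    });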

Results:


Tag :  mansion , Confidence :  0.47509114790445056
Tag :  driveway , Confidence :  0.4552631524330092
Tag :  house , Confidence :  0.4304875785176193
Tag :  architecture , Confidence :  0.33942988236938837
Tag :  building , Confidence :  0.3094331548632239
Tag :  home , Confidence :  0.2697299038194921
Tag :  estate , Confidence :  0.23534657602532882
Tag :  dwelling , Confidence :  0.23327752776762856
Tag :  structure , Confidence :  0.22539033224250088
Tag :  residential , Confidence :  0.2150613973041477
Tag :  residence , Confidence :  0.19566665898999108
Tag :  housing , Confidence :  0.19091773340752613...

The results are returned as a list of candidate tags with a confidence score for each. The confidence score tells you how sure the API is that the tag it suggested is correct. It fared pretty well with this image and returned some pretty accurate results.
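In practice you will probably only want to keep the tags the API is reasonably sure about. The sketch below filters a few of the tags from the results above by confidence; the 0.3 cut-off and the `confidentTags` helper are arbitrary choices made for this illustration, not something the API prescribes.

```javascript
// A few of the tags from the results above, in the shape the callback receives them.
const tags = [
  { tag: "mansion", confidence: 0.47509114790445056 },
  { tag: "house", confidence: 0.4304875785176193 },
  { tag: "home", confidence: 0.2697299038194921 },
  { tag: "housing", confidence: 0.19091773340752613 },
];

// Keep the tags at or above the threshold, most confident first.
function confidentTags(tags, minConfidence) {
  return tags
    .filter((t) => t.confidence >= minConfidence)
    .sort((a, b) => b.confidence - a.confidence)
    .map((t) => t.tag);
}

console.log(confidentTags(tags, 0.3));
```

The same helper could be called on `response.tags` inside the callback shown earlier; the right threshold depends on how many false positives your use case can tolerate.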

The last image was a little easy, so let’s try something a little different this time. Let’s say we want to analyze the image below.

image

Image: http://www.tencate.com/amer/Images/American%20football29-15343.jpg

Code:


textapi.imageTags(
    'http://www.tencate.com/amer/Images/American%20football29-15343.jpg',
    function(err, response, ratelimits) {
        if (err !== null) {
            console.log("Error: " + err);
        } else {
            response.tags.forEach(function(t) {
                console.log("Tag : ", t.tag + " , " + "Confidence : ", t.confidence);
            });
        }
    });

Results:

Tag :  football , Confidence :  1
Tag :  stadium , Confidence :  0.24106947076803517
Tag :  back , Confidence :  0.17014749275097063
Tag :  structure , Confidence :  0.12840672955565677
Tag :  game , Confidence :  0.1186865418441251
Tag :  sport , Confidence :  0.10223847715591786
Tag :  field , Confidence :  0.09827027407477171
Tag :  team , Confidence :  0.09808334899832503
Tag :  ball , Confidence :  0.09400727111771828
Tag :  people , Confidence :  0.09279897917842066...

Again, the API returned a number of accurate and relevant tags. This time it recognized it was a picture related to Football and Sport and returned a variety of results, based on what it saw.

The Image Tagging feature is not just able to spot objects; it can also recognize humans and certain aspects of a photo of a person. Let’s see what it recognizes in this photo of pop singer Beyonce.

image

Image: https://ronehiphopnc2.files.wordpress.com/2013/05/beyonce-knowles-closeup-1024x576.jpg

Code:


textapi.imageTags(
    'https://ronehiphopnc2.files.wordpress.com/2013/05/beyonce-knowles-closeup-1024x576.jpg',
    function(err, response, ratelimits) {
        if (err !== null) {
            console.log("Error: " + err);
        } else {
            response.tags.forEach(function(t) {
                console.log("Tag : ", t.tag + " , " + "Confidence : ", t.confidence);
            });
        }
    });

Results:


Tag :  attractive , Confidence :  0.3641480251093146
Tag :  portrait , Confidence :  0.3570435098374641
Tag :  adult , Confidence :  0.3419036072980034
Tag :  model , Confidence :  0.3381432174429411
Tag :  person , Confidence :  0.32852624615621456
Tag :  pretty , Confidence :  0.3228676609396025
Tag :  complexion , Confidence :  0.3164959079957736
Tag :  face , Confidence :  0.313449951118162...

The results when it comes to an image like the one above are a little bit more advanced or intelligent. The API is confident that the image of Beyonce is an attractive portrait of a model, which isn’t far off.

So, that’s how easy it is to start analyzing images with AYLIEN. Keep an eye out for our next blog on common, and not so common, use cases for hybrid text and image analysis.


We’re very excited to announce that we’ve teamed up with the guys from Imagga to bring image analysis capabilities to our package of APIs. In a mutually beneficial partnership, we’ll be making our first step in the hybrid Text and Image analysis space, with an image tagging endpoint being made available through our existing service.

 

image

 

The partnership, a match made in heaven, will add image analysis capabilities to our existing Natural Language Processing and Machine Learning APIs, creating an all-in-one content analysis suite.

The addition of image analysis capabilities was a natural next step for us in enhancing our offering. As Parsa, our founder, put it: “In today’s media, text and images are two sides of the same coin and must be analyzed and understood in tandem to get a holistic view of what’s going on in the world. There’s a strong demand among our customers for image analysis capabilities, and after evaluating a number of potential partners Imagga came out on top. We’re quite similar as companies in many ways and share the same values and goals. This is just the first of many more hybrid solutions to come.”

The image analysis endpoint utilizes machine learning technology, image recognition and deep learning algorithms to automatically identify up to 6,000 distinct objects, concepts, facial expressions and colours in images.

Our interactive demo has a number of example images which showcase the Image analysis feature.

 

image

 

Once an image is analyzed the API will automatically return a list of candidate tags for that image, as well as a confidence measure, indicating how sure the service is of the particular tags it suggests.

 

image

 

Paying customers of our Text Analysis service will get immediate access to the image processing features. The feature addition gives AYLIEN users far greater capabilities in how they analyse and understand content at scale, whether they’re dealing with images, text or both.

The image tagging feature is fully documented in our API docs and has been added to our various SDKs. We’ll also be following up later this week with a “how to” blog on analysing images through the API.


Introduction

Are you finding it hard to identify press targets for your latest release or product update?

Targeting relevant journalists that might be interested in covering your company on their blog or news outlet can be pretty difficult. Where do you even start? Buy a list? Manually trawl through tech blogs and news sites? Tracking down journalists is one thing, tracking down journalists that are interested in what you do is a different story altogether.

Faced with this same problem we thought long and hard about how we should source our press targets and all the time it was staring us in the face.

Who are they? Where are they?

Think about it! What do the journalists you want to write about your company already write about? Yep. Your competitors!

Most tech companies with an online presence have a “press” or “in the news” section on their website. It’s usually a page where they collate links to press coverage of their company. A gold mine for relevant journalists, if you will.

We’re going to show you how you can, in a matter of minutes, trawl your competitors’ sites for press coverage and extract the names of the journalists. We’ll create a list of relevant journos based on who has been writing about our competitors. The idea is pretty simple: list all of our competitors, gather their press coverage, extract the authors automatically from those links and start hunting for emails.

Here’s what you’ll need:

Step 1. Gather the links

To get the links into a spreadsheet we’re going to use a pretty standard XML Function:

=IMPORTXML(url_cell, "//a/@href")

This will pull all of the links on that specific page into our spreadsheet. Some of them you will want to keep and some you can forget about.

Grab the ones that are relevant, all the press coverage ones, and paste them into your second sheet. Once we have all the links gathered in one sheet, we’re gonna start analyzing the links and extracting what we’re after, in this case, authors’ names.

Step 2. Analyze the links

Analyzing the links is really easy. You’ll need to get a free copy of the AYLIEN Text Analysis add-on from the add-on store, if you haven’t already done so. If you’re new to using the add-on this tutorial guide will help.

Once you’ve got that running, you’ll see it appear on the right-hand side of your spreadsheet, and you can start analyzing the links and extracting author names.

Use the function:

=Author(url_cell)

This will extract the author’s name from the web page. The author function is one of many AYLIEN specific functions available with the add-on.

Drag the function down as you would with any Google Sheets function to populate authors for the rest of the URLs.

There you have it, the names of a couple of hundred journalists, pulled from one page and laid out nicely for you in a spreadsheet.

Step 3. Find their emails (the tricky part)

The hardest and most laborious part of this hack, and a bit of a pain in the behind, is getting the email addresses for these PR targets. For a while, we used Stack Lead (acquired by LinkedIn) and more recently we tried Hey Press, but sometimes you just have to put in the hard yards and jump on Google.

A nice little trick we use to speed up the search process, though, is the HYPERLINK function:

=HYPERLINK("http://google.com/search?q=" & author_cell & " " & REGEXEXTRACT(url_cell, "^https?://([^/?#]+)(?:[/?#]|$)") & " email")

image

 

This brings you straight to a new tab in your browser that searches for your target’s name, the domain and the word “email” in the one search. You’ll usually find an email address for them nestled somewhere in those search results. Otherwise, you’ll find a Twitter handle or LinkedIn account, which is also a good place to start.
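To make it clearer what the formula assembles, here is a rough Node.js equivalent. The author name and article URL are invented, and `searchUrl` is a hypothetical helper; unlike the spreadsheet formula, this sketch URL-encodes the query so that spaces and special characters in a name survive.

```javascript
// Build a Google search link from an author name and the article they wrote.
function searchUrl(author, articleUrl) {
  const domain = new URL(articleUrl).hostname; // e.g. "news.example.com"
  const query = `${author} ${domain} email`;
  return "http://google.com/search?q=" + encodeURIComponent(query);
}

// Hypothetical author and article URL.
console.log(searchUrl("Jane Doe", "http://news.example.com/2015/04/some-story"));
// prints "http://google.com/search?q=Jane%20Doe%20news.example.com%20email"
```

Parsing the article URL with the built-in URL class plays the same role as the REGEXEXTRACT pattern in the formula: it pulls out the site’s host name so the search is narrowed to the outlet the journalist writes for.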

Happy Hunting!

 

image


This is the sixth in our series of blogs on getting started with AYLIEN’s various SDKs.

If you’re new to AYLIEN and you don’t have an account yet, you can go directly to the Getting Started page on the website which will take you through the signup process. We have a free plan to get started with that allows you to make up to 1,000 calls per day for free.

Downloading and Installing the Java SDK

All of our SDK repositories are hosted on GitHub. The Java SDK for the Text Analysis API is published to Maven Central, so simply add the dependency to your POM:


<dependency>
  <groupId>com.aylien.textapi</groupId>
  <artifactId>client</artifactId>
  <version>0.1.0</version>
</dependency>

Once you’ve installed the SDK you’re ready to start coding. For the remainder of this blog we’ll walk you through making calls and show the output you should receive in each case. Taking simple examples we’ll showcase some of the API’s endpoints like Language detection, Sentiment Analysis and hashtag suggestion.

Configuring the SDK with your AYLIEN credentials

Once you’ve received your AYLIEN APP_ID and APP_KEY from the signup process and have downloaded the SDK you can begin making calls with the following imports and configuration code.


import com.aylien.textapi.TextAPIClient;
import com.aylien.textapi.parameters.*;
import com.aylien.textapi.responses.*;

TextAPIClient client = new TextAPIClient(
        "YourApplicationId", "YourApplicationKey");

When calling the various API endpoints you can specify a piece of text directly for analysis or you can pass a url linking to the text or article you wish to analyze.

Language Detection

First off, let’s take a look at the language detection endpoint. As a simple example we’re going to detect the language of the following sentence: ‘What language is this sentence written in?’

To do this, you can call the endpoint using the following piece of code.


String text = "What language is this sentence written in?";
LanguageParams languageParams = new LanguageParams(text, null);
Language language = client.language(languageParams);
System.out.printf("\nText : %s", language.getText());
System.out.printf("\nLanguage : %s", language.getLanguage());
System.out.printf("\nConfidence %f", language.getConfidence());

You should receive an output very similar to the one shown below. This shows that the language detected was English and that the confidence that it was detected correctly (a number between 0 and 1) is very close to 1, which means you can be pretty sure it is correct.

Language Detection Results


Text : What language is this sentence written in?
Language : en
Confidence 0.999997

Sentiment Analysis

Next, we’ll look at analyzing the sentence “John is a very good football player!” to determine its sentiment, i.e. whether it’s positive, neutral or negative. The Sentiment Analysis endpoint will also determine if the text is subjective or objective. You can call the endpoint with the following piece of code:


text = "John is a very good football player!";
SentimentParams sentimentParams = new SentimentParams(text, null, null);
Sentiment sentiment = client.sentiment(sentimentParams);
System.out.printf("\nText : %s", sentiment.getText());
System.out.printf("\nSentiment Polarity   : %s", sentiment.getPolarity());
System.out.printf("\nPolarity Confidence  : %f", sentiment.getPolarityConfidence());
System.out.printf("\nSubjectivity : %s", sentiment.getSubjectivity());
System.out.printf("\nSubjectivity Confidence: %f", sentiment.getSubjectivityConfidence());

You should receive an output similar to the one shown below which indicates that the sentence is objective and is positive, both with a high degree of confidence.

Sentiment Analysis Results


Text : John is a very good football player!
Sentiment Polarity   : positive
Polarity Confidence  : 0.999999
Subjectivity : objective
Subjectivity Confidence: 0.989682

Hashtag Suggestion

Finally, we’ll look at analyzing a BBC article to extract hashtag suggestions for it with the following code.


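// `url` must already point at the BBC article; the original post does not show its definition.
HashTagsParams hashtagsParams = new HashTagsParams(null, url, null);
HashTags hashtags = client.hashtags(hashtagsParams);
System.out.print("Hashtags : \n");
System.out.print(hashtags + "\n");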

You should receive the output shown below.

Hashtag Suggestion Results


Hashtags :
#Planet #JohannesKepler #Kepler #Birmingham #Earth #Astronomy #TheAstrophysicalJournal #Warwick #Venus #Orbit #Mercury #SolarSystem #Resonance #TerrestrialPlanet #Lightyear #Imagine

If Java’s not your preferred language, then check out our other SDKs for Node.js, Go, PHP, Python, Ruby and .NET (C#). For more information regarding the APIs, go to the documentation section of our website.
