
Text Analysis and Ruby: Getting Started with AYLIEN Text Analysis API and Ruby

The 4th edition of our “Getting up and running with AYLIEN Text Analysis API” blog series will focus on working with the API using Ruby. Previously we published code snippets and getting started guides for node.js, Python and Java.

As in our previous posts, we’re going to perform some basic Text Analysis tasks: detecting what language a piece of text is written in, analyzing the sentiment of a piece of text, classifying an article, and finally generating some hashtags for a URL. The aim is to showcase how easy it is to get started in your chosen language.

We’re first going to look at the code in action. We’ll then go through the code, section by section, to investigate each of the endpoints used in the code snippet.

We are going to do the following:

  • Detect what language the following text is written in: “What language is this sentence written in?”
  • Analyze the sentiment of the following statement: “John is a very good football player!”
  • Generate a classification (IPTC Code and label) for the URL: “http://www.bbc.com/news/science-environment-30177534”
  • Generate hashtags for the URL: “http://www.bbc.com/news/science-environment-30177534”

Note: The getting started page on our website has a range of code snippets in other programming languages for you to try.

Overview of the code in action

The complete code snippet is given at the end of this blog for you to copy and paste into a text editor of your choice. Before running the code, make sure you replace the YOUR_APP_ID and YOUR_APP_KEY placeholders with your own application ID and application key. You should have been sent these when you signed up as an API user. Make your way to our sign up page if you haven’t already signed up.
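If you would rather not paste your credentials into the file, one option is to read them from environment variables instead. The sketch below is only an illustration: the helper name load_credentials and the variable names AYLIEN_APP_ID and AYLIEN_APP_KEY are our own assumptions, not something the API requires.

```ruby
# Hypothetical helper: reads the application ID and key from environment
# variables, falling back to the placeholders used in this post.
# AYLIEN_APP_ID and AYLIEN_APP_KEY are assumed names, not an API convention.
def load_credentials(env = ENV)
  app_id  = env.fetch("AYLIEN_APP_ID", "YOUR_APP_ID")
  app_key = env.fetch("AYLIEN_APP_KEY", "YOUR_APP_KEY")

  # Warn early if the placeholders were never replaced.
  if app_id == "YOUR_APP_ID" || app_key == "YOUR_APP_KEY"
    warn "Credentials still look like placeholders; API calls will fail."
  end

  { id: app_id, key: app_key }
end
```

You could then assign the returned values to the APPLICATION_ID and APPLICATION_KEY constants at the top of the snippet.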

Save the file as TextAPISample.rb and then open command prompt. Navigate to the folder where you saved the code snippet and run the code by typing “ruby TextAPISample.rb”.

Note: You will need to have Ruby installed to run this example. You can download it here if you haven’t already done so.

Once you run it, you should receive the following output:


C:\src\ruby>ruby textapisample.rb

Text:   What language is this sentence written in?
Language: en (0.9999934427379825)

Text:   John is a very good football player!
Sentiment: positive (0.9999988272764874)

Hashtags: ["#SeaIce", "#WoodsHoleOceanographicInstitution", "#BBCNews", "#WHOI",
 "#RaceAndEthnicityInTheUnitedStatesCensus", "#Transponder", "#SynopticScaleMete
orology", "#WilkesLand", "#WaterColumn", "#PolarIcePacks", "#Twitter", "#Hanuman
tSingh", "#BritishAntarcticSurvey", "#AutonomousUnderwaterVehicle", "#UK", "#BBC
Online", "#NatureGeoscience", "#PackIce", "#Sonar", "#Arctic", "#Antarctica", "#
UnitedKingdom", "#Scratching", "#Ecosystem", "#AustraliaGroup"]


Classification: [{"label"=>"natural resources - oceans", "code"=>"06006007", "co
nfidence"=>1.0}]

...

In this case we have detected that the first piece of text is written in English and that the sentiment, or polarity, of the second statement is positive. We have also generated hashtags for the URL and classified the content.

The detail above shows the code running in its entirety, but to highlight each feature/endpoint we will go through the code snippet, section by section.

Language Detection

Using the Language Detection endpoint you can analyze a piece of text or a URL and determine what language it is written in. In the code used in this blog, the “parameters” variable controls whether the call is made by specifying the text directly or by passing a URL.


parameters = {"text" => "What language is this sentence written in?"}
language = call_api("language", parameters)

In this case we have specified that it should analyze the following text “What language is this sentence written in?” and as you can see from the output below, it determined that the text is written in English and it gave a 0.999993 confidence score that the language was detected correctly. Note: For all of the endpoints, the API returns the text which was analysed for reference and we have included it in the results in each case.

Result:


Text:   What language is this sentence written in?
Language: en (0.9999934427379825)
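Since the parameters hash decides whether the endpoint receives raw text or a URL, a small helper can pick the right key for you. This is a minimal sketch; build_parameters is a hypothetical name of our own, and it simply treats anything that parses as an http or https URL as a URL and everything else as text.

```ruby
require 'uri'

# Hypothetical helper: returns {"url" => input} when the input parses as an
# http(s) URL with a host, otherwise {"text" => input}.
def build_parameters(input)
  uri = URI.parse(input)
  if uri.is_a?(URI::HTTP) && uri.host
    { "url" => input }
  else
    { "text" => input }
  end
rescue URI::InvalidURIError
  # Ordinary sentences (with spaces, for example) are not valid URIs.
  { "text" => input }
end
```

The result can be passed straight to call_api, for example call_api("language", build_parameters(some_input)).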

Sentiment Analysis

Similarly, the Sentiment Analysis endpoint takes a piece of text or a URL and analyzes it to determine whether it is positive, negative or even neutral.


parameters = {"text" => "John is a very good football player!"}
sentiment = call_api("sentiment", parameters)

In this case, we have specified that it should analyze the text “John is a very good football player!”. The API has determined that the sentiment of the piece of text is positive, and we can be pretty sure it’s correct based on the returned confidence score of 0.999998.

Result:


Text:   John is a very good football player!
Sentiment: positive (0.9999988272764874)
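In practice you may want to act on the confidence score rather than just print it. The sketch below reads the "polarity" and "polarity_confidence" fields used in this post and flags low-confidence results. The describe_sentiment name and the 0.8 threshold are arbitrary assumptions for illustration.

```ruby
# Hypothetical helper: summarises a sentiment response hash.
# The threshold of 0.8 is an arbitrary assumption, not an API recommendation.
def describe_sentiment(response, threshold = 0.8)
  polarity   = response["polarity"]
  confidence = response["polarity_confidence"].to_f
  certainty  = confidence >= threshold ? "confident" : "uncertain"
  "#{polarity} (#{(confidence * 100).round(1)}%, #{certainty})"
end

# Sample hash copied from the result above; a real one comes from call_api.
sample = { "polarity" => "positive", "polarity_confidence" => 0.9999988272764874 }
puts describe_sentiment(sample)
```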

Hashtag Suggestions

The Hashtag Suggestion endpoint analyses a URL and generates a list of hashtag suggestions, which can be used to ensure that your content or URLs are optimally shared on social media:


parameters = {"url" => "http://www.bbc.com/news/science-environment-30177534"}
hashtags = call_api("hashtags", parameters)

For hashtag suggestions, we have used an article about measuring the thickness of the sea ice in the Antarctic, published on the BBC News website: http://www.bbc.com/news/science-environment-30177534. The hashtag suggestion endpoint first extracts the text from the URL (which is returned for reference by the call, and the start of which is shown below), then analyses that text and generates hashtag suggestions.

Result:


Hashtags: ["#SeaIce", "#WoodsHoleOceanographicInstitution", "#BBCNews", "#WHOI",
 "#RaceAndEthnicityInTheUnitedStatesCensus", "#Transponder", "#SynopticScaleMete
orology", "#WilkesLand", "#WaterColumn", "#PolarIcePacks", "#Twitter", "#Hanuman
tSingh", "#BritishAntarcticSurvey", "#AutonomousUnderwaterVehicle", "#UK", "#BBC
Online", "#NatureGeoscience", "#PackIce", "#Sonar", "#Arctic", "#Antarctica", "#
UnitedKingdom", "#Scratching", "#Ecosystem", "#AustraliaGroup"]

Text of website article pointed to by the url http://www.bbc.com/news/science-environment-30177534
Antarctic sub gauges sea ice thickness
A novel autonomous sub has acquired the first detailed, high-resolution 3D maps of Antarctic sea ice...
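A list of 25 hashtags is usually more than a single social media post can hold. As a rough sketch, the helper below keeps hashtags in the order they were returned until a character budget is used up. The hashtags_for_post name and the default budget of 60 characters are our own assumptions, and we are not assuming the list is sorted by relevance.

```ruby
# Hypothetical helper: joins hashtags with spaces, stopping before the
# combined string would exceed max_chars. The default of 60 is arbitrary.
def hashtags_for_post(hashtags, max_chars = 60)
  selected = []
  hashtags.each do |tag|
    candidate = (selected + [tag]).join(" ")
    break if candidate.length > max_chars
    selected << tag
  end
  selected.join(" ")
end

# First few suggestions copied from the result above.
sample = ["#SeaIce", "#WoodsHoleOceanographicInstitution", "#BBCNews", "#WHOI"]
puts hashtags_for_post(sample)
```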

Article Classification

The classification endpoint automatically assigns or tags an article or piece of text to one or more categories making it easier to manage and sort. The classification endpoint is based on IPTC International Subject News Codes and can identify up to 500 categories.


parameters = {"url" => "http://www.bbc.com/news/science-environment-30177534"}
classify = call_api("classify", parameters)

When we pass the URL pointing to the BBC News story, we receive the results shown below. As you can see, it has labelled the article as “natural resources - oceans” with a corresponding IPTC code of 06006007 and a confidence of 1.

Result:


Classification: [{"label"=>"natural resources - oceans", "code"=>"06006007", "confidence"=>1.0}]
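Because the endpoint can return more than one category, it can be handy to filter the list by confidence before using it. The sketch below works on the "categories" array shown in the result above. The confident_categories name and the 0.5 cut-off are assumptions for illustration only.

```ruby
# Hypothetical helper: keeps categories at or above min_confidence and
# formats each one as "code: label". The 0.5 default is arbitrary.
def confident_categories(response, min_confidence = 0.5)
  (response["categories"] || [])
    .select { |category| category["confidence"].to_f >= min_confidence }
    .map { |category| "#{category["code"]}: #{category["label"]}" }
end

# Sample hash built from the result above; a real one comes from call_api.
sample = {
  "categories" => [
    { "label" => "natural resources - oceans", "code" => "06006007", "confidence" => 1.0 }
  ]
}
puts confident_categories(sample)
```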

For more getting started guides and code snippets to help you get up and running with our API, visit our Getting Started page on our website. If you haven’t already done so you can get free access to our API on our sign up page.


The Complete Code Snippet


require 'net/http'
require 'uri'
require 'json'

# Replace these placeholders with your own application ID and key.
APPLICATION_ID = 'YOUR_APP_ID'
APPLICATION_KEY = 'YOUR_APP_KEY'

def call_api(endpoint, parameters)
  url = URI.parse("https://api.aylien.com/api/v1/#{endpoint}")
  headers = {
      "Accept"                           =>   "application/json",
      "X-AYLIEN-TextAPI-Application-ID"  =>   APPLICATION_ID,
      "X-AYLIEN-TextAPI-Application-Key" =>   APPLICATION_KEY
  }

  http = Net::HTTP.new(url.host, url.port)
  http.use_ssl = true
  request = Net::HTTP::Post.new(url.request_uri)
  request.initialize_http_header(headers)
  request.set_form_data(parameters)

  response = http.request(request)

  JSON.parse response.body
end

parameters = {"text" => "John is a very good football player!"}
sentiment = call_api("sentiment", parameters)

parameters = {"text" => "What language is this sentence written in?"}
language = call_api("language", parameters)

parameters = {"url" => "http://www.bbc.com/news/science-environment-30177534"}
hashtags = call_api("hashtags", parameters)
classify = call_api("classify", parameters)


puts "\n"
puts "Text:   #{language["text"]}"
puts "Language: #{language["lang"]} (#{language["confidence"]})"
puts "\n"
puts "Text:   #{sentiment["text"]}"
puts "Sentiment: #{sentiment["polarity"]} (#{sentiment["polarity_confidence"]})"
puts "\n"
puts "Hashtags: #{hashtags["hashtags"]}"
puts "\n"
puts "\n"
puts "Classification: #{classify["categories"]}"
puts "\n"
puts "Text of website article pointed to by the url http://www.bbc.com/news/science-environment-30177534"
puts "\n"
puts " #{hashtags["text"]}"
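The call_api method above parses the response body whatever the HTTP status, so an invalid key or a malformed request may only show up later as a confusing error. Below is a minimal sketch of a stricter parsing step. ApiError and parse_api_body are hypothetical names of our own, and we make no assumption about the exact format of the API's error bodies.

```ruby
require 'json'

# Hypothetical error class for failed or unreadable API responses.
class ApiError < StandardError; end

# Hypothetical helper: raises unless the status is 2xx, then parses the JSON.
# status_code may be a String (as Net::HTTP returns it) or an Integer.
def parse_api_body(status_code, body)
  unless status_code.to_i.between?(200, 299)
    raise ApiError, "API request failed with HTTP #{status_code}: #{body}"
  end
  JSON.parse(body)
rescue JSON::ParserError => e
  raise ApiError, "Response was not valid JSON: #{e.message}"
end
```

Inside call_api, the final line could then become parse_api_body(response.code, response.body).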









Author



Mike Waldron

Head of Marketing & Sales @ AYLIEN

A legal convert with a masters degree from Smurfit Business School, Mike runs our Sales and Marketing at AYLIEN. Mike gathered his Sales and Marketing experience with technology companies in Sydney and Dublin before getting the startup itch and joining the team at AYLIEN.

Twitter: @MikeWallly