2017 looks set to be a big year for us here on Ormond Quay – with AYLIEN in hyper-growth mode, we’ve added six new team members in the first four months of the year, and that looks set to continue. After this period of quick growth, we thought we’d take stock and introduce you to the newest recruits.

Say hello to our newest recruits!

Mahdi

Mahdi: NLP Research Engineer

Mahdi became an open-source contributor at age 16, working on Firefox Developer Tools and other projects you can find on his GitHub. At 18, he was hired as a full-stack developer to work on browser extensions and mobile apps. Mahdi just started at AYLIEN as a Natural Language Processing Research Engineer focusing on Deep Learning, while also working as a full-stack developer on our web apps. He blogs about programming (and life in general) on theread.me.

Mahdi is a serious outdoorsman who can be found hiking in the hills and practicing Primitive Living. He also loves learning languages and reading, which provides him with the raw material to fill our Slack loading messages with some supremely inspirational quotes!

Demian

Demian: NLP Research Intern

Demian comes from Braunschweig in Central Germany and completed a degree in Computational Linguistics at the University of Heidelberg. As part of his degree he studied NLP and Artificial Intelligence in information extraction, and he is already familiar with Dublin from an Erasmus year spent at Trinity College. Demian previously worked in the Forensic Department of PwC in Germany, and here at AYLIEN he is going to research document summarization and event extraction for our News API.

Besides being a proficient coder, Demian is an avid painter and reader, and can be found running in Dublin’s parks.

Sylver

Sylver: Data Management Intern

Growing up between Dublin and Seattle, Sylver swapped one rainy town with a thriving tech scene for another. She is currently studying Legal Practice and Procedures, and before starting with us here at AYLIEN she was an editor of everything from novels to academic papers. Here at AYLIEN Sylver works on maintaining and managing our datasets and models.

A previous owner of 10 snakes (at the same time), Sylver spends her spare time caring for exotic pets, and is interested in reading, alternative modelling, and fitness.

Hosein

Hosein: Web Designer

Hosein is a creative designer with three years’ experience in UI design and front-end development, having previously worked with other startups and tech companies. A newcomer to NLP, Hosein is designing the AYLIEN website and web apps, and is also developing our front-ends.

While he’s away from his laptop, Hosein is usually out taking photographs and finding out more about cameras and photography.

Erfan

Erfan: NLP Research Engineer

Erfan holds a Bachelor’s Degree in Software Engineering. He has been researching computer vision for three years and you can read about his research on his blog. For his thesis, he used Deep Neural Nets to study the joint embedding of image and text, and at AYLIEN he is going to research and work on using memory-augmented neural nets, focusing on question-answering.

Will

Will: Content Marketing Intern

From the comparatively less exotic background of Dublin, Will is a Classics graduate who completed a Master’s in Digital Humanities at Trinity College, where he was introduced to NLP when he tried to write some code to index where authors use Latin words across English Literature. At AYLIEN, he is joining the Sales, Marketing, and Customer Success team to bolster our content creation and distribution efforts, and is even writing this exact sentence at this very moment in time.

Outside of AYLIEN, Will is an avid reader and learner of languages, and when he’s outside, he can be found running or hiking.

Come work with us!

So that sums up our new recruits – a pretty diverse group who all gravitated towards languages and programming. If you think you’d like to join us, take a look at aylien.com/jobs, email us at jobs@aylien.com, or call in for a fresh cup of coffee. We’re always interested in talking to anyone working on or studying NLP, Computational Linguistics or Machine Learning.





Introduction

In this post, AYLIEN NLP Research Intern, Mahdi, talks us through a quick experiment he performed on the back of reading an interesting paper on evolution strategies, by Tim Salimans, Jonathan Ho, Xi Chen and Ilya Sutskever.

Having recently read Evolution Strategies as a Scalable Alternative to Reinforcement Learning, Mahdi wanted to run an experiment of his own using Evolution Strategies. Flappy Bird has always been among Mahdi’s favorites when it comes to game experiments, and since it is a simple yet challenging game, he decided to use it to put theory into practice.

Training Process

The model is trained using Evolution Strategies, which in simple terms works like this:

  1. Create a random, initial brain for the bird (this is the neural network, with 300 neurons in our case)
  2. At every epoch, create a batch of modifications to the bird’s brain (also called “mutations”)
  3. Play the game using each modified brain and calculate the final reward
  4. Update the brain by pushing it towards the mutated brains, proportionate to their relative success in the batch (the more reward a brain has been able to collect during a game, the more it contributes to the update)
  5. Repeat steps 2-4 until a local maximum for rewards is reached.
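The steps above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the code from Mahdi's repository: the function name, the hyperparameter values, the use of standardized rewards to express "relative success", and the toy reward function (which stands in for playing a game of Flappy Bird) are all assumptions made for the example.

```python
import numpy as np

def evolution_strategies(reward_fn, n_params, epochs=300, population=50,
                         sigma=0.1, learning_rate=0.01, seed=0):
    """Maximise reward_fn(params) with a basic Evolution Strategies loop.

    All default values here are illustrative assumptions, not the
    settings used in the original experiment.
    """
    rng = np.random.default_rng(seed)
    # Step 1: a random initial "brain", stored as a flat parameter vector.
    params = rng.standard_normal(n_params)
    for _ in range(epochs):
        # Step 2: a batch of random mutations of the current brain.
        noise = rng.standard_normal((population, n_params))
        # Step 3: evaluate every mutated brain and record its reward.
        rewards = np.array([reward_fn(params + sigma * n) for n in noise])
        spread = rewards.std()
        if spread == 0:
            continue  # all mutations scored the same, so there is nothing to learn
        # Step 4: weight each mutation by its standardized reward and push
        # the brain towards the more successful mutations.
        advantages = (rewards - rewards.mean()) / spread
        step = (noise.T @ advantages) / (population * sigma)
        params = params + learning_rate * step
    return params

# Toy stand-in for "play the game": reward is highest when params == target.
target = np.array([0.5, 0.1, -0.3])
best = evolution_strategies(lambda p: -np.sum((p - target) ** 2), n_params=3)
```

In the real experiment, reward_fn would run one game with the given parameters loaded into the bird's neural network and return the final score.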

At the beginning of training, the bird usually either drops too low or jumps too high and hits one of the boundary walls, therefore losing immediately with a score of zero. Scores of zero in training would mean there is no measure of success to distinguish between brains, so Mahdi set a small 0.1 score for every frame the bird stays alive. This way the bird learns to avoid dying at the first attempt. He then set a score of 10 for passing each wall, so the bird tries not only to stay alive, but to pass as many walls as possible.
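The scoring scheme can be written as a small function. The 0.1 and 10 values come from the description above; the function and constant names are hypothetical, and the repository may well compute the score incrementally during play rather than at the end of a game.

```python
ALIVE_REWARD = 0.1   # score per frame the bird stays alive (value from the post)
WALL_REWARD = 10.0   # score per wall passed (value from the post)

def episode_reward(frames_alive, walls_passed):
    """Total score for one game under the shaping scheme described above."""
    return ALIVE_REWARD * frames_alive + WALL_REWARD * walls_passed
```

Because every frame earns a little, two birds that both fail to pass a wall can still be ranked by how long each one survived.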

Training is quite fast because no backpropagation is needed, and it is light on memory because, unlike policy gradient methods, there is no need to record actions.

The model learns to play pretty well after 3000 epochs. It is not flawless, however, and occasionally loses in difficult cases, such as when there is a large height difference between two consecutive wall entrances.

Here is a demonstration of the model after 3000 epochs (~5 minutes on an Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz):


Use the controls to set speed level or to restart

Web version

For ease of access, Mahdi has created a web version of the experiment which can be accessed here.

Try it yourself

Note: You need python3 and pip for installing and running the code.

First, download or clone the repository:

git clone https://github.com/mdibaiee/flappy-es.git


Next, install dependencies (you may want to create a virtualenv):

pip install -r requirements

The pretrained parameters are in a file named load.npy and will be loaded when you run train.py or demo.py.

train.py will train the model, saving the parameters to saves/<TIMESTAMP>/save-<ITERATION>.

demo.py shows the game in a GTK window so you can see how the AI actually plays.


play.py lets you play the game yourself: press space to jump and, once you have lost, press enter to play again.

Notes

It seems that training past a certain point leads to a reduction in performance. Learning rate decay might help with this. Mahdi’s interpretation is that once the model has found a local maximum for accumulated reward and is receiving high rewards, the updates become large and pull the model too far in different directions, so it starts to oscillate.
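One simple form of learning rate decay is exponential decay, sketched below. This is only an illustration of the idea: the function name and any values passed to it are hypothetical, and the post does not say which schedule, if any, was tried.

```python
def decayed_learning_rate(initial_rate, decay, epoch):
    """Exponential decay: the rate shrinks by a constant factor every epoch.

    decay=1.0 keeps the rate constant; values just below 1.0 shrink it slowly.
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in the interval (0, 1]")
    return initial_rate * decay ** epoch
```

Using a smaller rate in later epochs makes each update smaller, which is one way to damp the kind of oscillation described above.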

To try it yourself, use the long.npy file: back up load.npy, rename long.npy to load.npy, and run demo.py. You will see the bird failing more often than not, even though long.npy was trained for only 100 more epochs than load.npy.
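If you prefer to script the swap, something like the following would do it. The helper name and the load.npy.bak backup name are hypothetical, and the sketch copies long.npy over load.npy instead of renaming it, so that long.npy itself stays in place.

```python
import shutil
from pathlib import Path

def use_long_parameters(repo_dir="."):
    """Back up load.npy, then replace it with a copy of long.npy."""
    repo = Path(repo_dir)
    load = repo / "load.npy"
    long_run = repo / "long.npy"
    backup = repo / "load.npy.bak"  # hypothetical backup file name
    if not long_run.exists():
        raise FileNotFoundError(long_run)
    # Only create the backup once, so a second call cannot overwrite it
    # with the long-run parameters.
    if load.exists() and not backup.exists():
        shutil.copy2(load, backup)
    shutil.copy2(long_run, load)
    return backup
```

To go back to the original parameters, copy the backup file over load.npy again.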


Introduction

Welcome to the second installment in our series of monthly posts where we’ll be showcasing our News API by looking back at online news stories, articles and blog posts to uncover emerging insights and trends from topical categories.

For our February review, we looked at three IAB categories: Arts & Entertainment, Science and Politics.

For March, we’ve decided to narrow our focus a little further by looking at IAB subcategories to give you an idea of just how specific and granular you can be when sourcing and analyzing content through the News API. With this in mind, we’ve gone with the following three subcategories:

  1. Cell phones (subcategory of Tech & Computing)
  2. Boxing (subcategory of Sports)
  3. Stocks (subcategory of Personal Finance)

and for each subcategory we have performed the following analyses:

  • Publication volumes over time
  • Top stories
  • Most mentioned topics
  • Most shared stories on social media

Try it yourself

We’ve included code snippets for each of the analyses above so you can follow along or modify to create your own search queries.

If you haven’t already signed up to our News API you can do so here with a free 14-day News API trial.

1. Cell phones

The graph below shows publication volumes in the Cell phones subcategory throughout the month of March 2017.

Note: All visualizations are interactive. Simply hover your cursor over each to explore the various data points and information.

Volume of stories published: Cell phones

From the graph above we can see a number of spikes indicating sharp rises in publication volumes. Let’s take a look at the top three:

Top stories

The three stories that contributed to the biggest spikes in news publication volumes:

  1. Samsung release their latest flagship phone, the Galaxy S8.
  2. The UK introduces a loss-of-license punishment for new drivers caught using their cell phones while driving.
  3. HTC reveal a limited edition version of their U Ultra smartphone.

It will perhaps come as no surprise to see one of the world’s top smartphone manufacturers, Samsung, getting the most media attention with the launch of their latest flagship model. In comparison, rivals HTC failed to generate the same level of hype around their latest model. However, by releasing a teaser about a surprise product release on March 15 they still managed to generate two of the top four publication volume spikes within the cell phone category in March.

Try it yourself – here’s the query we used for volume by category
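As a rough sketch of what a volume-by-category query might look like, the snippet below builds a request URL for daily story counts in the Cell phones subcategory during March 2017. The endpoint URL, the parameter names, and the IAB identifier IAB19-6 are assumptions based on our understanding of the News API time series endpoint and the IAB QAG taxonomy, so check them against the current documentation before relying on them. The snippet only builds the URL; it sends nothing and leaves out authentication.

```python
from urllib.parse import urlencode

# Assumed endpoint; verify against the News API documentation.
TIME_SERIES_URL = "https://api.newsapi.aylien.com/api/v1/time_series"

def volume_query_url(category_id, start, end, period="+1DAY"):
    """Build a URL asking for story counts per period within one IAB category.

    The parameter names below are assumptions, not a confirmed API contract.
    """
    params = [
        ("categories.taxonomy", "iab-qag"),  # assumed taxonomy identifier
        ("categories.id[]", category_id),    # e.g. "IAB19-6", assumed to be Cell phones
        ("published_at.start", start),
        ("published_at.end", end),
        ("period", period),                  # assumed syntax for one-day buckets
    ]
    return TIME_SERIES_URL + "?" + urlencode(params)

url = volume_query_url("IAB19-6", "2017-03-01T00:00:00Z", "2017-04-01T00:00:00Z")
```

Swapping in a different category identifier would give the equivalent query for the Boxing and Stocks sections.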

Read more: We looked at Samsung’s recent exploding battery crisis to highlight how news content can be analyzed to track the voice of the customer in relation to crisis prevention and damage limitation.

Most mentioned topics

From the 7,000+ articles we sourced from the Cell phones category in March we looked at the most mentioned topics:

Try it yourself – here’s the query we used for most mentioned topics

Most shared on social media

What were the most shared stories on social media? We analyzed share counts from Facebook, LinkedIn and Reddit to see what type of content is performing best on each channel.

Facebook

  1. Man dies charging iPhone while in the bath (BBC. 26,072 shares)
  2. US bans electronic devices on flights from eight Muslim countries (The Independent. 25,886 shares)

LinkedIn

  1. Samsung tries to reclaim its reputation with the Galaxy S8 (Washington Post. 890 shares)
  2. It’s Possible to Hack a Phone With Sound Waves, Researchers Show (NY Times. 814 shares)

Reddit

  1. Samsung confirms the Note 7 is coming back as a refurbished device (The Verge. 7,193 votes)
  2. The Galaxy S8 will be Samsung’s biggest test ever (The Verge. 4,981 votes)

Try it yourself – here’s the query we used for social shares

2. Boxing

We sourced a total of 9,000+ articles categorized under Boxing and found that what goes on outside the ring can garner just as much media interest as what happens in it, if not more.

Volume of stories published: Boxing

Top stories

The three stories that contributed to the biggest spikes in news publication volumes:

  1. Heavyweight bout between David Haye and Tony Bellew.
  2. Floyd Mayweather urges the UFC to allow him and Conor McGregor to fight.
  3. Middleweight bout between Gennady Golovkin and Daniel Jacobs.

The two biggest fights in world boxing during the month of March are clearly represented by publication spikes in the chart above, particularly the heavyweight clash between Haye and Bellew. However, as we mentioned, it’s not all about what happens in the ring.

The second largest spike we see above was the result of Floyd Mayweather, who hasn’t fought since September 2015, pleading with the UFC to allow a ‘superfight’ with Conor McGregor to go ahead. Neither Mayweather nor McGregor has competed recently, and neither has any future fights scheduled, yet they still find themselves as the two most discussed individuals in this category. The bubble chart below showing the most mentioned topics from the boxing category further highlights this.

Most mentioned topics

Most shared on social media

Facebook

  1. Floyd Mayweather ‘officially out of retirement for Conor McGregor’ fight (FOX Sports. 56,951 shares)
  2. Bad refs, greedy NBF officials frustrating boxers – Apochi (Punchng. 42,367 shares)

LinkedIn

  1. David Haye has Achilles surgery after Tony Bellew defeat (BBC. 234 shares)
  2. David Haye rules out retirement as he targets Tony Bellew rematch (BBC. 130 shares)

Reddit

  1. Teenage kickboxer dies after Leeds title fight (BBC. 1,502 shares)
  2. Muhammad Ali family vows to fight Trump’s ‘Muslim ban’ after airport detention (The Independent. 1,147 shares)

3. Stocks

The graph below shows publication volumes in the Stocks subcategory throughout the month of March 2017. In total we collected just over 30,000 articles.

Volume of stories published: Stocks

Top stories

The three stories that contributed to the biggest spikes in news publication volumes:

  1. Retailer Target sees stock drop by 13.5% after consumers boycott the company over its pro-transgender stance.
  2. The US Federal Reserve increases interest rates, adding further pressure to the housing market.
  3. Oil drops below US$53 as report shows rising US crude stockpiles.

Most mentioned locations

Rather than focusing solely on extracted topics for this category, we thought it would be interesting to separate mentions of both locations and organizations. The chart above shows the most mentioned locations from all 30,000 articles published under the Stocks subcategory in March:

Most mentioned organizations

The chart above shows the top mentioned organizations including well known banks, investment firms and sources. It is interesting to see the likes of Facebook, Twitter and Snapchat in the mix also.

In March we saw Barclays declare Facebook as “the stock to own for the golden age of mobile”, referring to the upcoming 3-5 year period. Earlier in the month, Snapchat closed their first day of public trading up 44% at $24.48 a share.

Most shared on social media

Facebook

  1. Trump’s Approval Rating Hits New Record Low (Slate. 39,582 shares)
  2. Target Retailer Hits $15 Billion Loss Since Pro-Transgender Announcement (Breitbart. 30,107 shares)

LinkedIn

  1. How on earth did India come up with these GDP numbers? (QZ. 2,579 shares)
  2. Home Prices in 20 U.S. Cities Rise at Fastest Pace Since 2014 (Bloomberg. 1,601 shares)

Reddit

  1. Bernie Sanders and Planned Parenthood are the most popular things in America, Fox News finds (The Week. 28,075 votes)
  2. GameStop Is Going to Close at Least 150 Stores (Fortune. 4,982 votes)

Conclusion

We hope that this post has given you an idea of the kind of in-depth and precise analyses that our News API users are performing to source and analyze specific news content that is of interest to them.

Ready to try the News API for yourself? Simply click the image below to sign up for a 14-day free trial.





News API - Sign up



