
Build an Automated Sentiment Analysis Tool for Twitter with one Python Script

In a previous blog, we showed you how easy it is to set up a simple social listening tool to monitor chatter on Twitter. With a single Python script, you can gather recent Tweets about a topic that interests you, analyze their sentiment, and produce a visualization of the results.

But as we also pointed out in that blog, Twitter users post 350,000 new Tweets every minute, giving us an incredibly live dataset that will contain new information every time we query it. In fact, Twitter is so responsive to emerging trends that the US Geological Survey uses it to detect earthquakes, because people Tweet about an earthquake faster than the USGS's own pipelines of geological and seismic data can register it.

So if we can rely on Twitter users to Tweet about earthquakes during the actual earthquakes, we can absolutely rely on them to Tweet their opinions about subjects important to you, in real time. And while a single sentiment analysis gives you a snapshot of what people are saying at one moment in time, it's even more useful to analyze sentiment on a regular basis, so you can understand how public opinion, or better still your customers' opinions, can change over time.

Why is having an automated sentiment analysis workflow useful?

There are two main reasons.

  • It will allow you to keep abreast of any trends in consumer sentiment shown towards you, your products, or your competitors.
  • Over time, this workflow will build up an extremely valuable longitudinal dataset that you can compare with sales trends, website traffic, or any of your KPIs (the short sketch after this list shows one way to explore it).
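
For instance, once the CSV file has accumulated a few days of results, a few lines of Python are enough to chart how sentiment moves over time. The snippet below is a minimal sketch rather than part of the main script: it assumes you have pandas installed and reuses the default CSV file name from the script in Step 2.

import pandas as pd
import matplotlib.pyplot as plt

# load the CSV file that the daily script appends to
df = pd.read_csv('Sentiment_Analysis_of_Tweets_About_Your_query.csv', parse_dates=['Time'])

# count the positive, negative, and neutral Tweets gathered each day
daily = df.groupby([df['Time'].dt.date, 'Sentiment']).size().unstack(fill_value=0)

# plot one line per sentiment class over time
daily.plot()
plt.title('Daily Tweet sentiment over time')
plt.show()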

In this blog, we’re going to show you how to turn the original script we shared into a fully automated tool that will run your analysis at a given time every day, or however frequently you schedule it to run. It will gather Tweets, analyze their sentiment, and, if you want it to, produce a visualization so you can read the data quickly. It will also gradually build up a record of the Tweets and their sentiment by appending each day’s results to the same CSV file.

 

The 4 Steps for setting up your Twitter Sentiment Analysis Tool (in 20 mins)

This blog is split up into 4 steps, which all together should take about 20 minutes to complete:

  1. Get your credentials from Twitter and AYLIEN – both of these are free. (At 10 minutes, this is the most time-consuming part.)
  2. Set up a folder and copy the Python script into it (2 mins)
  3. Run the Python script for your first sentiment analysis (2 mins)
  4. Schedule the script to run every day – we’ve included a detailed guide for both Windows and Mac below. (5 mins)

 

Step 1: Getting your credentials

If you completed the last blog, you can skip this part, but if you didn’t, follow these three steps:

  1. Make sure you have the tweepy, matplotlib, and aylienapiclient libraries installed, all of which you can install with pip.
  2. Get API keys for Twitter:
  • Getting the API keys from Twitter Developer (which you can do here) is the most time-consuming part of this process, but this video can help you if you get lost.
  • What it costs & what you get: the free Twitter plan lets you download 100 Tweets per search, and you can search Tweets from the previous seven days. If you want to go beyond either of these limits, you’ll need to pay for the Enterprise plan ($$).
  3. Get API keys for AYLIEN:
  • To do the sentiment analysis, you’ll need to sign up for our Text API’s free plan and grab your API keys, which you can do here.
  • What it costs & what you get: the free Text API plan lets you analyze 30,000 pieces of text per month (1,000 per day). If you want to make more than 1,000 calls per day, our Micro plan lets you analyze 80,000 pieces of text ($49/month).
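
If you’d like to make sure both sets of keys work before the first scheduled run, a quick optional test like the one below will do it. This is a minimal sketch that uses the same tweepy and AYLIEN clients as the script in Step 2 – the placeholder strings are where your own keys go.

import tweepy
from aylienapiclient import textapi

# authenticate against Twitter with your four keys
auth = tweepy.OAuthHandler("Your consumer key here", "your secret consumer key here")
auth.set_access_token("your access token here", "your secret access token here")
api = tweepy.API(auth)

# this call raises an error if your Twitter keys are wrong
print(api.verify_credentials().screen_name)

# set up the AYLIEN Text API client and make a single test call
client = textapi.Client("your app id here", "your app key here")
print(client.Sentiment({'text': 'I love a quick sanity check.'})['polarity'])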

 

Step 2: Set up a folder and copy the Python script into it

Setting up a folder for this project will make everything a lot tidier and easier in the long run. So create a new one, and copy the Python script below into it. After you run this script for the first time, it will create a CSV in the folder, which is where it will store the Tweets and their sentiment every time it runs.

Here is the Python script:


import os
import sys
import csv
import tweepy
import matplotlib.pyplot as plt

from collections import Counter
from aylienapiclient import textapi

# Python 2/3 compatibility: Python 2 needs raw_input, and Python 3 needs
# CSV files opened with newline=''
open_kwargs = {}

if sys.version_info[0] < 3:
    input = raw_input
else:
    open_kwargs = {'newline': ''}



# Twitter credentials
consumer_key = "Your consumer key here"
consumer_secret = "your secret consumer key here"
access_token = "your access token here"
access_token_secret = "your secret access token here"

# AYLIEN credentials
application_id = "your app id here"
application_key = "your app key here"

# set up an instance of Tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

# set up an instance of the AYLIEN Text API
client = textapi.Client(application_id, application_key)

# if a CSV file from a previous run exists, read the ID of the last Tweet
# it stored, so this run only gathers Tweets posted since then
max_id = 0

file_name = 'Sentiment_Analysis_of_Tweets_About_Your_query.csv'

if os.path.exists(file_name):
    with open(file_name, 'r') as f:
        for row in csv.DictReader(f):
            max_id = row['Tweet_ID']
else:
    with open(file_name, 'w', **open_kwargs) as f:
        csv.writer(f).writerow([
            "Tweet_ID",
            "Time",
            "Tweet",
            "Sentiment"])

# search for recent Tweets matching your query, skipping Tweets we have
# already stored in the CSV (that's what since_id does)
results = api.search(
    lang="en",
    q="Your_query -rt",
    result_type="recent",
    count=10,
    since_id=max_id
)

results = sorted(results, key=lambda x: x.id)

print("--- Gathered Tweets \n")

# open a csv file to store the Tweets and their sentiment
with open(file_name, 'a', **open_kwargs) as csvfile:
    csv_writer = csv.DictWriter(
        f=csvfile,
        fieldnames=[
                    "Tweet_ID",
                    "Time",
                    "Tweet",
                    "Sentiment"]
    )

    print("--- Opened a CSV file to store the results of your sentiment analysis... \n")

    # tidy up the Tweets and send each to the AYLIEN Text API
    for c, result in enumerate(results, start=1):
        tweet = result.text
        # strip non-ASCII characters, then decode back to a string for the API call
        tidy_tweet = tweet.strip().encode('ascii', 'ignore').decode('ascii')
        tweet_time = result.created_at
        tweet_id = result.id

        if not tweet:
            print('Empty Tweet')
            continue

        response = client.Sentiment({'text': tidy_tweet})
        csv_writer.writerow({
            "Tweet_ID": tweet_id,
            "Time": tweet_time,
            "Tweet": response['text'],
            "Sentiment": response['polarity'],
        })

        print("Analyzed Tweet {}".format(c))

# count the data in the Sentiment column of the CSV file
with open(file_name, 'r') as data:
    counter = Counter()
    for row in csv.DictReader(data):
        counter[row['Sentiment']] += 1

    positive = counter['positive']
    negative = counter['negative']
    neutral = counter['neutral']

# declare the variables for the pie chart, using the Counter variables for "sizes"
colors = ['green', 'red', 'grey']
sizes = [positive, negative, neutral]
labels = 'Positive', 'Negative', 'Neutral'

# use matplotlib to plot the chart
plt.pie(
    x=sizes,
    shadow=True,
    colors=colors,
    labels=labels,
    startangle=90
)

plt.title("Sentiment of {} Tweets about Your Subject".format(sum(counter.values())))
plt.show()

If you want any part of that script explained, our previous blog breaks it up into pieces and explains each one.

Step 3: Run the Python script for your first sentiment analysis

Before you run the script above, you’ll need to make two simple changes. First, enter your access keys from Twitter and AYLIEN. Second, don’t forget to enter what it is you want to analyze! You’ll need to do this in two places: first, change the name of the CSV file that the script is going to create (currently it’s file_name = ‘Sentiment_Analysis_of_Tweets_About_Your_query.csv’); second, in the api.search call, replace the text “Your_query” with whatever your query is.
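
For example, if you wanted to track Tweets about coffee (a stand-in query, purely for illustration), those two places would look like this:

file_name = 'Sentiment_Analysis_of_Tweets_About_Coffee.csv'

and, inside the api.search call:

q="coffee -rt",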

Also, if you don’t feel that you need a daily visualization and you’re more interested in building up a record of sentiment over time, just delete everything below the comment “# count the data in the Sentiment column of the CSV file”. The script will still carry out the sentiment analysis every day and add the results to the CSV file, but it won’t show a visualization.

After you’ve run this script, your folder should contain the Python script and the CSV file.

Step 4: Schedule the Python script to run every day

So now that you’ve got your Python script saved in a folder along with a CSV file containing the results of your first sentiment analysis, we’re ready for the final step – getting the script to run on a schedule that suits you.

Depending on whether you use Windows or Mac, there are different steps to take here, but you don’t need to install anything either way. Mac users will use Cron, whereas Windows users can use Task Scheduler.

Windows:

  1. Open Task Scheduler (search for it in the Start menu)
  2. Click into Task Scheduler Library on the left of your screen
  3. Open the ‘Create Task’ box on the right of your screen
  4. Give your task a name
  5. Add a new Trigger: select ‘daily’ and enter the time of day you want your computer to run the script. Remember to select a time of day that your computer is likely to be running.
  6. Open up the Actions tab and click New
  7. In the box that opens up, make sure “Start a program” is selected
  8. In the “Program/Script” box, enter the path to your python.exe file and make sure this is enclosed in quotation marks (so something like “C:\Program Files (x86)\Python36-32\python.exe”)
  9. In the “Add arguments” box, enter the path to the dailySentiment.py file, including the file itself (so something like C:\Users\Yourname\desktop\folder\dailySentiment.py). No quotation marks are needed here.
  10. In the “Start in” box, enter the path to the folder containing your script and CSV file (something like C:\Users\Yourname\Desktop\your folder name). Again, no quotation marks are needed.
  11. You’re done!

Mac:

  1. Open up a Terminal
  2. Type “crontab -e” to create a new Cron job.
  3. Scheduling the Cron job takes one line of code, which is split into three parts.
  4. First, type the time at which you want your script to run, in the Cron format: “minute hour day-of-month month weekday”, all as integers (or asterisks), separated by single spaces. For example, if you want your script to gather Tweets every day at 9AM, this first part of the line will read “0 9 * * *” – minute zero of hour nine, every day of every month.
  5. Second, leave a space after this first part and type the location of your Python executable file (you can find it by running “which python” in your Terminal). This part will usually read something like “/System/Library/Frameworks/Python.framework/Python”.
  6. Finally, leave another space and enter the path to the Python script in your folder. For example, if you saved the script to a folder on your desktop, the path will be something like “/Users/Your name/Desktop/folder name/dailySentiment.py”.
  7. The full line in your Terminal will look something like “0 9 * * * /System/Library/Frameworks/Python.framework/Python /Users/Your name/Desktop/folder name/dailySentiment.py”.
  8. Now hit Escape, then type “:wq”, and hit Enter.
  9. To double check that your Cron job is scheduled, type “crontab -l” and you should see your job listed.
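
One extra tip in case the job doesn’t seem to fire: Cron runs silently, so it helps to redirect the script’s output to a log file you can check later. You can do this by adding something like >> /Users/Your name/Desktop/folder name/cron.log 2>&1 to the end of the Cron line (the log file name is just an example).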

If you run into trouble, get in touch!

With those four steps, your automated workflow should be up and running, but depending on how your system is set up, you could run into an error along the way. If you do, don’t hesitate to get in touch by sending us an email, leaving a comment, or chatting with us on our site.

Happy analyzing!

Author

Will Gannon

Marketing @ AYLIEN. A Classics graduate from UCD, Will handles Inbound Marketing here at AYLIEN. Before joining us, Will completed a Master’s in Digital Humanities at Trinity College, where he used NLP methods to index where Latin terms appear in English literature.