Blog post updated to add the following:
-
- Updated Python “TCS Members Details” code to get additional information from the List members’ profiles.
- New Python code “TCS Members Tweets” to download all List members’ Tweets.
————————————————–
The Canadian Trade Commissioner Service maintains a Twitter List named CDN TCS FDI Officers that has a bunch of Canadian Trade Commissioners as members. Investment Canada also has a similar list.
I was interested to learn how many people were on these lists, how long they had been on the lists, what they were Tweeting about. To undertake this analysis, I used the Twitter API and Python Tweepy to retrieve data about list members including:
-
- screen_name
- name
- followers_count
- friends_count
- statuses_count (# of tweets)
- favourites_count
- created_at (Twitter account create date)
- account_age_days (calculated as # days from July 9, 2016)
- listed_count
- verified
- profile_image_url
- profile_sidebar_fill_color
- profile_text_color
- profile_image_url_https
- profile_use_background_image
- default_profile_image
- profile_sidebar_border_color
- profile_background_color
- profile_link_color
You can get find complete definitions about these fields over at Twitter.
The data was output as a simple csv file that I used to create a Tableau Public dashboard which is embedded below. The csv file is attached here.
The Tableau dashboard is interactive and can be sorted by any of the columns by using the sort icon which appears when you hover over the top of column as illustrated in screenshot below.
It would be interesting to try to determine if this Twitter activity has measurable impacts on FDI and Canadian Trade. For example perhaps foreign investment finds it way to Canada after reading Tweet by one of our Trade Commissioners.
This would require that TCS maintains a CRM (client relationship manager) system and process that records lead sources.
There is some disparity between use of Twitter by the CDN TCS FDI Officers list members as shown by Tweets/Day which total Tweets divided by # days since the account was created. If there is a measurable lift in lead generation by Twitter use then this would be actionable metric.
For the technically minded the Python code is shown below. Note that you need an API account to use with this code.
There is another file tweet_auth import not shown here that contains Twitter OAuth credentials that looks like the following. You just have to replace the ‘xxx…’ with your credentials.
Here is the “TCS Member Details” code:
#twitter api oauth credentials - replace with yours consumer_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' consumer_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' access_token = 'xxxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' import sys, os import csv from datetime import datetime, date import tweepy from dateutil import tz ## get twitter auth key file sys.path.insert(0, 'path/to/folder/with/your/twitter_creds/') from ppcc_ca_app_key import keys ## this is consumer key and secret from the ppcc-ca app consumer_key = keys['consumer_key'] consumer_secret = keys['consumer_secret'] ## don't need to access token bc not tweeting on this timeline, just reading #access_token = keys['access_token'] #access_token_secret = keys['access_token_secret'] ## get twitter auth #auth = tweepy.OAuthHandler(consumer_key, consumer_secret) auth = tweepy.AppAuthHandler(consumer_key, consumer_secret) #auth.set_access_token(access_token, access_token_secret) a pi = tweepy.API(auth) today = datetime.now().date() tcs_list_members = tweepy.Cursor(api.list_members, 'invest_canada', 'cdn-tcs-fdi-officers') member_details_csv = csv.writer(open('tcs_member_details.csv', 'wb')) member_details_csv.writerow([ 'screen_name', 'name', 'followers_count', 'friends_count', 'statuses_count', 'favourites_count', 'created_at', 'account_age_days', 'time_zone', 'listed_count', 'profile_image_url', 'profile_sidebar_fill_color', 'profile_text_color', 'profile_image_url_https', 'profile_use_background_image', 'default_profile_image', 'verified', 'profile_sidebar_border_color', 'profile_background_color', 'profile_link_color']) members = [] member_tweets = [] for member in tcs_list_members.items(): member_details_csv.writerow([ member.screen_name.encode('utf-8', 'ignore'), member.name.encode('utf-8', 'ignore'), member.followers_count, member.friends_count, member.statuses_count, member.favourites_count, member.created_at, (today-member.created_at.date()).days, member.time_zone, member.listed_count, member.profile_image_url, member.profile_sidebar_fill_color, member.profile_text_color, member.profile_image_url_https, member.profile_use_background_image, member.default_profile_image, member.verified, member.profile_sidebar_border_color, member.profile_background_color, member.profile_link_color ])
It would be interesting to see what Tweet topics, other Twitter user mentions, links to webpages, etc. So the next step is to loop through each of the list member’s api.user_timeline to retrieve their Tweet content and do some analysis on them. For now here is the code and some analysis and visualization in Tableau later.
Here is the “TCS Members’ Tweets” Python code:
import sys, os import csv from datetime import datetime, date import tweepy from dateutil import tz ## get twitter auth key file sys.path.insert(0, '/path/to/your/folder/with/twitter_creds/') from ppcc_ca_app_key import keys ## this is consumer key and secret from the ppcc-ca app consumer_key = keys['consumer_key'] consumer_secret = keys['consumer_secret'] ## don't need to access token bc not tweeting on this timeline, just reading #access_token = keys['access_token'] #access_token_secret = keys['access_token_secret'] ## get twitter auth #auth = tweepy.OAuthHandler(consumer_key, consumer_secret) auth = tweepy.AppAuthHandler(consumer_key, consumer_secret) #auth.set_access_token(access_token, access_token_secret) #api = tweepy.API(auth) api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True) today = datetime.now().date() def get_list_member_tweets(): ## get invest_canada and cdn-tcs-fdi-officers list members tcs_list_members = tweepy.Cursor(api.list_members, 'invest_canada', 'cdn-tcs-fdi-officers') ## open csv file member_tweets_csv = csv.writer(open('tcs_member_tweets.csv', 'wb')) ## write header row column names member_tweets_csv.writerow([ 'status_id', 'date_time_UTC', 'screen_name', 'hashtags', 'user_mentions', 'tweet_text' ]) ## loop through list members and get their tweets for member in tcs_list_members.items(): ## get list member tweets member_tweets = get_member_tweets(member.screen_name) for status in member_tweets: ## check tweets for hashtags if status.entities['hashtags']: hastags=[] for hashtag in status.entities['hashtags']: hastags.append(hashtag['text'].encode('utf8','ignore')) ## check tweets for user_mentions if status.entities['user_mentions']: user_mentions=[] for user_mention in status.entities['user_mentions']: user_mentions.append(user_mention['screen_name'].encode('utf8','ignore')) ## write to csv file member_tweets_csv.writerow([ status.id, str(status.created_at.replace(tzinfo=tz.gettz('UTC')).astimezone(tz.gettz('America/Los_Angeles')).replace(tzinfo=None)), member.screen_name, hastags, user_mentions, status.text.replace('\n',' ').replace('\r',' ').encode('utf8','ignore') ]) def get_member_tweets(screen_name): ## get list member's tweets alltweets = [] ## can only get max 200 tweets at a time new_tweets = api.user_timeline(screen_name = screen_name, count=200) alltweets.extend(new_tweets) ## get oldest tweet already retrieved oldest = alltweets[-1].id - 1 ## iteratively get remaining tweets while len(new_tweets) > 0: new_tweets = api.user_timeline(screen_name = screen_name, count=200, max_id=oldest) alltweets.extend(new_tweets) oldest = alltweets[-1].id - 1 ## print out member and # tweets retrieved to show progress print screen_name + " %s tweets downloaded" % (len(alltweets)) ## return all tweets return alltweets if __name__ == '__main__': get_list_member_tweets()
Full code on Github:
https://github.com/sitrucp/tcs_fdi_list_twitter_data