Canadian TCS FDI Officers Twitter list member analysis

Blog post updated to add the following:

    • Updated Python “TCS Members Details” code to get additional information from the List members’ profiles.
    • New Python code “TCS Members Tweets” to download all List members’ Tweets.

————————————————–

The Canadian Trade Commissioner Service maintains a Twitter List named CDN TCS FDI Officers that has a bunch of Canadian Trade Commissioners as members. Investment Canada also has a similar list.

I was interested to learn how many people were on these lists, how long they had been on the lists, what they were Tweeting about.  To undertake this analysis, I used the Twitter API and Python Tweepy to retrieve data about list members including:

    • screen_name
    • name
    • followers_count
    • friends_count
    • statuses_count (# of tweets)
    • favourites_count
    • created_at (Twitter account create date)
    • account_age_days (calculated as # days from July 9, 2016)
    • listed_count
    • verified
    • profile_image_url
    • profile_sidebar_fill_color
    • profile_text_color
    • profile_image_url_https
    • profile_use_background_image
    • default_profile_image
    • profile_sidebar_border_color
    • profile_background_color
    • profile_link_color

You can get find complete definitions about these fields over at Twitter.

The data was output as a simple csv file that I used to create a Tableau Public dashboard which is embedded below. The csv file is attached here.

The Tableau dashboard is interactive and can be sorted by any of the columns by using the sort icon which appears when you hover over the top of column as illustrated in screenshot below.

tcs_click_to_sort

 

 

 

 

It would be interesting to try to determine if this Twitter activity has measurable impacts on FDI and Canadian Trade.  For example perhaps foreign investment finds it way to Canada after reading Tweet by one of our Trade Commissioners.

This would require that TCS maintains a CRM (client relationship manager) system and process that records lead sources.

There is some disparity between use of Twitter by the CDN TCS FDI Officers list members as shown by Tweets/Day which total Tweets divided by # days since the account was created. If there is a measurable lift in lead generation by Twitter use then this would be actionable metric.

For the technically minded the Python code is shown below. Note that you need an API account to use with this code.

There is another file tweet_auth import not shown here that contains Twitter OAuth credentials that looks like the following. You just have to replace the ‘xxx…’ with your credentials.

Here is the “TCS Member Details” code:

 

 #twitter api oauth credentials - replace with yours
consumer_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
consumer_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token = 'xxxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
access_token_secret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx'
  import sys, os
    import csv
    from datetime import datetime, date
    import tweepy
    from dateutil import tz

    ## get twitter auth key file
    sys.path.insert(0, 'path/to/folder/with/your/twitter_creds/')
    from ppcc_ca_app_key import keys

    ## this is consumer key and secret from the ppcc-ca app
    consumer_key = keys['consumer_key']
    consumer_secret = keys['consumer_secret']

    ## don't need to access token bc not tweeting on this timeline, just reading #access_token = keys['access_token']
    #access_token_secret = keys['access_token_secret']

    ## get twitter auth
     #auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth = tweepy.AppAuthHandler(consumer_key, consumer_secret)
    #auth.set_access_token(access_token, access_token_secret)
    a
    pi = tweepy.API(auth)

    today = datetime.now().date()
            
    tcs_list_members = tweepy.Cursor(api.list_members, 'invest_canada', 'cdn-tcs-fdi-officers')

    member_details_csv = csv.writer(open('tcs_member_details.csv', 'wb'))
    member_details_csv.writerow([
        'screen_name',
        'name',
        'followers_count',
        'friends_count',
        'statuses_count',
        'favourites_count',
        'created_at',
        'account_age_days',
        'time_zone',
        'listed_count',
        'profile_image_url',
        'profile_sidebar_fill_color',
        'profile_text_color',
        'profile_image_url_https',
        'profile_use_background_image',
        'default_profile_image',
        'verified',
        'profile_sidebar_border_color',
        'profile_background_color',
        'profile_link_color'])

    members = []
    member_tweets = []
        
    for member in tcs_list_members.items():
        
        member_details_csv.writerow([
            member.screen_name.encode('utf-8', 'ignore'),
            member.name.encode('utf-8', 'ignore'),
            member.followers_count,
            member.friends_count,
            member.statuses_count,
            member.favourites_count,
            member.created_at,
            (today-member.created_at.date()).days,
            member.time_zone,
            member.listed_count,
            member.profile_image_url,
            member.profile_sidebar_fill_color,
            member.profile_text_color,
            member.profile_image_url_https,
            member.profile_use_background_image,
            member.default_profile_image,
            member.verified,
            member.profile_sidebar_border_color,
            member.profile_background_color,
            member.profile_link_color
            ])
            

It would be interesting to see what Tweet topics, other Twitter user mentions, links to webpages, etc. So the next step is to loop through each of the list member’s api.user_timeline to retrieve their Tweet content and do some analysis on them.  For now here is the code and some analysis and visualization in Tableau later.

Here is the “TCS Members’ Tweets” Python code:

    import sys, os
    import csv
    from datetime import datetime, date
    import tweepy
    from dateutil import tz

    ## get twitter auth key file
    sys.path.insert(0, '/path/to/your/folder/with/twitter_creds/')
    from ppcc_ca_app_key import keys

    ## this is consumer key and secret from the ppcc-ca app
    consumer_key = keys['consumer_key']
    consumer_secret = keys['consumer_secret']

    ## don't need to access token bc not tweeting on this timeline, just reading
    #access_token = keys['access_token']
    #access_token_secret = keys['access_token_secret']

    ## get twitter auth
    #auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth = tweepy.AppAuthHandler(consumer_key, consumer_secret)
    #auth.set_access_token(access_token, access_token_secret)
    #api = tweepy.API(auth)

    api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

    today = datetime.now().date()

    def get_list_member_tweets():
        ## get invest_canada and cdn-tcs-fdi-officers list members
        tcs_list_members = tweepy.Cursor(api.list_members, 'invest_canada', 'cdn-tcs-fdi-officers')
      
        ## open csv file 
        member_tweets_csv = csv.writer(open('tcs_member_tweets.csv', 'wb'))
        
        ## write header row column names
        member_tweets_csv.writerow([
            'status_id',
            'date_time_UTC',
            'screen_name',
            'hashtags',
            'user_mentions',
            'tweet_text'
            ])
        
        ## loop through list members and get their tweets
        for member in tcs_list_members.items():
            ## get list member tweets
            member_tweets = get_member_tweets(member.screen_name)
            
            for status in member_tweets:
                
                ## check tweets for hashtags
                if status.entities['hashtags']:
                        hastags=[]
                        for hashtag in status.entities['hashtags']:
                            hastags.append(hashtag['text'].encode('utf8','ignore'))
                
                ## check tweets for user_mentions
                if status.entities['user_mentions']:
                        user_mentions=[]
                        for user_mention in status.entities['user_mentions']:
                            user_mentions.append(user_mention['screen_name'].encode('utf8','ignore'))
                
                ## write to csv file      
                member_tweets_csv.writerow([
                    status.id,
                    str(status.created_at.replace(tzinfo=tz.gettz('UTC')).astimezone(tz.gettz('America/Los_Angeles')).replace(tzinfo=None)),
                    member.screen_name,
                    hastags,
                    user_mentions,
                    status.text.replace('\n',' ').replace('\r',' ').encode('utf8','ignore')
                    ])

    def get_member_tweets(screen_name):
        ## get list member's tweets
        
        alltweets = []

        ## can only get max 200 tweets at a time
        new_tweets = api.user_timeline(screen_name = screen_name, count=200)
        alltweets.extend(new_tweets)

        ## get oldest tweet already retrieved
        oldest = alltweets[-1].id - 1

        ## iteratively get remaining tweets
        while len(new_tweets) > 0:
            new_tweets = api.user_timeline(screen_name = screen_name, count=200, max_id=oldest)
            alltweets.extend(new_tweets)
            oldest = alltweets[-1].id - 1
        
        ## print out member and # tweets retrieved to show progress
        print screen_name + " %s tweets downloaded" % (len(alltweets))
        
        ## return all tweets
        return alltweets

    if __name__ == '__main__':
        get_list_member_tweets()

Full code on Github:

https://github.com/sitrucp/tcs_fdi_list_twitter_data

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.