Using Excel Power Query to analyse data from Our World In Data

Our World In Data (OWID) has been doing a hero’s job of collating the world’s covid vaccine distribution and administration data. They generate additional metrics and allow site visitors to analyse them in awesome interactive visualizations.

However, I wanted to do some analysis that went beyond what was possible on the site. No problem, because the OWID team also provides the data as csv files on a Github page.

These csv files can be easily downloaded from Github and you are then able to use any data transformation or visualization tool to do any analysis you need.

However if you need to to do frequent analysis because the vaccine data is updated often twice per day you will want to be able to dynamically refresh the data without having to download updated csv files from Github.

There are many tools and methods that can meet the requirement for  automated dynamic data refresh, but in this blog post I want to quickly highlight how Excel users can use the Excel Power Query Get Data->From Other Sources->From Web feature to link to the csv files in Github. After setting up the link, it only takes a simple click to refresh to get latest csv data from OWID’s Github repository.

First you need to get the url to the csv file on Github.

The easiest way to get correct url is to open the csv file in Github in Raw format by clicking on the file name and then clicking on the Raw button on top right of page. This will open the file in native “raw” format. Then you can simply copy the url from browser location bar which is what you want.

However, you can also manually create a url to any Github repository file as follows below.

Let’s use this url to the vaccinations.csv file in OWID’s Github repository below.

https://raw.githubusercontent.com/owid/COVID-19-data/master/public/data/vaccinations/vaccinations.csv

The url can be separated into the following parts:

    • Base url: https://raw.githubusercontent.com
    • Repository owner: owid
    • Repository name: COVID-19-data
    • Repository branch: master
    • Repository base element: public
    • Repository folders: data/vaccinations
    • Repository file name: vaccinations.csv

You should be able to manually create your own url using these components for any file in any Github repository.

Once you have your url, you can test if it is correct by simply copy pasting it into a browser location bar and verifying the url is good and retrieves the data you are looking for.

Once you have verified the url is correct, simply go to your Excel file and use the Excel Power Query Get Data feature and select From Other Sources and then From Web options and paste your url into the dialogue space and continue to load the file.

Once the query has been saved and data is loaded you can simply click and refresh it anytime to  retrieve the most recent csv file from the OWID Github repository!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.