Use Excel Power Query to get data from Our World In Data

Our World In Data (OWID) has been doing a hero’s job of collating the world’s covid vaccine distribution and administration data. They generate additional metrics and allow site visitors to analyse them in awesome interactive visualizations.

However, I wanted to do some analysis that went beyond what was possible on the site. No problem, because the OWID team also provides the data as csv files in their Github repository.

These csv files are easily manually downloaded from Github to be used. But as the OWID vaccine data is updated regularly, often twice per day, you will probably want to automate the retrieval of updated csv files from Github.

There are many tools and methods to automate this but in this blog post I want to quickly highlight how Excel users can use Excel’s Power Query Get Data->From Other Sources->From Web feature to link to the csv files in Github. (Read my other blog post to learn more about Power Query.)

Once implemented you only need to click refresh to get latest csv data from OWID’s Github repository.

First, you need to get a url for the csv file’s in the OWID Github repository.

The easiest way to get correct url is to open the csv file in Github in Raw format by clicking on the file name and then clicking on the Raw button on top right corner of page. This will open the csv file in native “raw” format. Then you can simply copy the url from browser location bar.

You can also manually create a url to any Github repository file as follows below. Let’s use the vaccinations.csv file in OWID’s Github repository as an example.

https://raw.githubusercontent.com/owid/COVID-19-data/master/public/data/vaccinations/vaccinations.csv

The vaccinations.csv Github repository file url has the following parts:

    • Base url: https://raw.githubusercontent.com
    • Repository owner: owid
    • Repository name: COVID-19-data
    • Repository branch: master
    • Repository base element: public
    • Repository folders: data/vaccinations
    • Repository file name: vaccinations.csv

Based on this example you should be able to manually create a url for any file in any Github repository.

Once you have your url, you can test if it is correct by simply copy pasting it into a browser location bar and verifying the url is good, and retrieves the data you are looking for.

After verifying the url is correct, go to your Excel file, find the Excel Power Query Get Data feature and select From Other SourcesFrom Web which will show input field for a url. Paste your url into the input field and simply click Load, which will load the csv file data into an Excel worksheet.

Now instead of manually downloading data from Github, you can simply click refresh and automatically retrieve the most recent csv file from the OWID Github repository.

Here is full M-code for a Query connected to the OWID Github “vaccinations.csv”. You can copy the code below and paste it into a blank Query Advanced Editor and it will return OWID data to your Excel file.

 let
Source = Csv.Document(Web.Contents("https://raw.githubusercontent.com/owid/COVID-19-data/master/public/data/vaccinations/vaccinations.csv"),[Delimiter=",", Columns=12, Encoding=65001, QuoteStyle=QuoteStyle.None]),
#"Promoted Headers" = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"location", type text}, {"iso_code", type text}, {"date", type date}, {"total_vaccinations", Int64.Type}, {"people_vaccinated", Int64.Type}, {"people_fully_vaccinated", Int64.Type}, {"daily_vaccinations_raw", Int64.Type}, {"daily_vaccinations", Int64.Type}, {"total_vaccinations_per_hundred", type number}, {"people_vaccinated_per_hundred", type number}, {"people_fully_vaccinated_per_hundred", type number}, {"daily_vaccinations_per_million", Int64.Type}})
in
#"Changed Type"

 

Leave a Comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.