Our World In Data (OWID) has been doing a hero’s job of collating the world’s COVID-19 vaccine distribution and administration data. They generate additional metrics and let site visitors analyse them in awesome interactive visualizations.
However, I wanted to do some analysis that went beyond what was possible on the site. No problem, because the OWID team also provides the data as CSV files in their GitHub repository.
You can easily download these CSV files from GitHub by hand. But because the OWID vaccine data is updated regularly, often twice per day, you will probably want to automate retrieving the updated files.
There are many tools and methods to automate this, but in this blog post I want to quickly highlight how Excel users can use Excel’s Power Query Get Data->From Other Sources->From Web feature to link to the CSV files in GitHub. (Read my other blog post to learn more about Power Query.)
Once this is set up, you only need to click Refresh to get the latest CSV data from OWID’s GitHub repository.
First, you need to get a URL for the CSV file in the OWID GitHub repository.
The easiest way to get the correct URL is to open the CSV file in GitHub in raw format: click on the file name, then click the Raw button in the top right corner of the page. This opens the CSV file in its native “raw” format. Then you can simply copy the URL from the browser’s location bar.
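If you already have the address of the file’s normal GitHub page, the raw URL can also be derived from it. The following is only a minimal Python sketch, assuming the page URL has the usual github.com/owner/repository/blob/branch/path shape; the function name github_blob_to_raw is made up for this illustration, and branch names that contain slashes are not treated specially.

```python
from urllib.parse import urlsplit


def github_blob_to_raw(page_url):
    """Turn a GitHub file page URL into the matching raw URL.

    Hypothetical helper for illustration; assumes the path looks like
    /<owner>/<repo>/blob/<branch>/<path to file>.
    """
    parts = urlsplit(page_url)
    if parts.netloc != "github.com":
        raise ValueError("expected a github.com URL")
    segments = parts.path.strip("/").split("/")
    if len(segments) < 5 or segments[2] != "blob":
        raise ValueError("expected a GitHub file page URL containing /blob/")
    owner, repo = segments[0], segments[1]
    # Everything after "blob": the branch followed by the file path.
    rest = "/".join(segments[3:])
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{rest}"


page = ("https://github.com/owid/COVID-19-data/blob/master/"
        "public/data/vaccinations/vaccinations.csv")
print(github_blob_to_raw(page))
```

Clicking the Raw button remains the safest route, because the browser then shows you exactly what will be downloaded.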
You can also build the URL to any GitHub repository file by hand. Let’s use the vaccinations.csv file in OWID’s GitHub repository as an example.
The URL of the vaccinations.csv file has the following parts:
- Base URL: https://raw.githubusercontent.com
- Repository owner: owid
- Repository name: COVID-19-data
- Repository branch: master
- Repository base element: public
- Repository folders: data/vaccinations
- Repository file name: vaccinations.csv
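Joined in that order with forward slashes, these parts give the full URL:
https://raw.githubusercontent.com/owid/COVID-19-data/master/public/data/vaccinations/vaccinations.csv
Based on this example, you should be able to manually create a URL for any file in any public GitHub repository.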
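The same assembly can be sketched in a few lines of Python. This is just an illustration of how the parts listed above fit together; the function name build_raw_url and its parameter names are my own and not part of any GitHub tool.

```python
def build_raw_url(owner, repo, branch, *path_parts,
                  base="https://raw.githubusercontent.com"):
    """Join the URL parts with single forward slashes.

    Hypothetical helper: the parameter names mirror the list of parts above.
    """
    pieces = [base, owner, repo, branch, *path_parts]
    # Strip stray slashes so the parts never produce "//" in the path.
    return "/".join(piece.strip("/") for piece in pieces)


url = build_raw_url(
    "owid",               # repository owner
    "COVID-19-data",      # repository name
    "master",             # repository branch
    "public",             # repository base element
    "data/vaccinations",  # repository folders
    "vaccinations.csv",   # repository file name
)
print(url)
```

To point at a different file, change only the parts that differ, for example the folders and the file name.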
Once you have your URL, you can test it by pasting it into a browser’s location bar and checking that it loads the data you are looking for.
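If you prefer to check the URL from a script rather than a browser, something like the following Python sketch would do. It uses only the standard library; the function name preview_csv is hypothetical, and the sketch assumes the file is UTF-8 encoded and small enough to download in one go.

```python
import csv
import io
from urllib.request import urlopen


def preview_csv(url, rows=3):
    """Download a CSV and return its header row plus the first few data rows.

    Hypothetical helper; assumes UTF-8 text and reads the whole file into memory.
    """
    with urlopen(url, timeout=30) as response:
        text = response.read().decode("utf-8")
    reader = csv.reader(io.StringIO(text))
    # zip stops after rows + 1 lines: one header row and `rows` data rows.
    return [row for _, row in zip(range(rows + 1), reader)]


# Example call (needs internet access):
# for row in preview_csv("https://raw.githubusercontent.com/owid/COVID-19-data/"
#                        "master/public/data/vaccinations/vaccinations.csv"):
#     print(row)
```

If the URL is wrong, urlopen raises an error instead of returning rows, which is the scripted equivalent of the browser showing an error page.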
After verifying that the URL is correct, go to your Excel file, find the Power Query Get Data feature on the Data tab and select From Other Sources->From Web, which shows an input field for a URL. Paste your URL into the input field, click OK, and then click Load in the preview window to load the CSV file data into an Excel worksheet.
Now, instead of manually downloading data from GitHub, you can simply click Refresh and automatically retrieve the most recent CSV file from the OWID GitHub repository.