COVID-19 Data Analysis and Visualization Summary

This is a list of the Canadian COVID-19 related data analyses and visualizations that I created during the 2020/21 pandemic.

Canada COVID-19 Case Map – COVID-19 new and total cases and mortalities by Canadian provincial health regions.
https://sitrucp.github.io/canada_covid_health_regions

Canada COVID-19 Vaccination Schedule – Canada’s current and historical vaccine doses administered and doses distributed. Also includes two distribution forecasts: 1) based on Canadian government vaccine planning and 2) based on Sep 30 2021 goal to vaccinate all 16+ Canadians. Uses COVID-19 Canada Open Data Working Group data.
https://sitrucp.github.io/covid_canada_vaccinations

Canada COVID-19 Vaccination vs World – Canada’s current and historical ranking vs all countries in the Our World in Data coronavirus data by total doses, daily doses and total people vaccinated.
https://sitrucp.github.io/covid_global_vaccinations

Global COVID-19 Vaccination Ranking – Ranking of all countries in the Our World in Data coronavirus data by daily vaccine dose administration. Includes small visualization of all time, population, vaccines used and trend. Can sort by these measures.
https://sitrucp.github.io/covid_world_vaccinations

COVID-19 New Case Rate by Canadian Health Regions Animation – SVG animation of new cases visualized as daily rate for each Canadian provincial health region. Like a horse race, faster moving dot means higher daily rate.
https://sitrucp.github.io/covid_rate_canada

COVID-19 New Case Rate by Country Animation – SVG animation of new cases visualized as daily rate for each country in the Our World in Data dataset. Like a horse race, faster moving dot means higher daily rate.
https://sitrucp.github.io/covid_rate_world

COVID-19 New Case Rate by US State Animation – SVG animation of new cases visualized as daily rate for each US state. Like a horse race, faster moving dot means higher daily rate.
https://sitrucp.github.io/covid_rate_us

Apple Mobility Trends Reports – Canadian Regions Data – Apple cell phone mobility tracking data used to create heat map visualizations of activity over time.
https://sitrucp.github.io/covid_canada_mobility_apple

WHO Draft landscape of COVID-19 Candidate Vaccines – AWS Textract used to extract tabular data from WHO pdf file. Python and Javascript code then used to create webpages from extracted data.
https://sitrucp.github.io/covid_who_vaccine_landscape

Montreal Confirmed COVID-19 Cases By City Neighbourhoods – Code and process used to scrape Quebec Health Montreal website to get COVID-19 case data for Montreal city boroughs.
https://github.com/sitrucp/covid_montreal_scrape_data

Use Excel Power Query to get data from Our World In Data – How to use Excel’s Power Query to get Our World in Data Github csv files automatically and update with simple refresh.
https://009co.com/?p=1491

 

D3.js SVG animation – COVID-19 rate “race” visualization

This visualization shows COVID-19 new cases as a “race” of dots moving from left to right.

The dot’s “speed” or how long it takes to move from left to right is based on the number of cases per day.

If a country has one case per day, it will take an entire day for the dot to move from left to right. Some countries have many thousands of new cases daily, and their dots move from left to right in minutes or seconds.
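The relationship between daily cases and dot speed can be sketched as below. This is a minimal illustration, not the code from the repository: the helper name cycleDurationMs is hypothetical, and returning null for locations with no cases is an assumption about how such locations might be skipped.

```javascript
// Milliseconds in one day: a location with 1 case/day takes this long to cross.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper: convert an average daily case count into a
// transition duration in milliseconds. Returns null when there are no
// cases, so the caller can skip the animation for that location.
function cycleDurationMs(casesPerDay) {
    if (!(casesPerDay > 0)) {
        return null;
    }
    return MS_PER_DAY / casesPerDay;
}

console.log(cycleDurationMs(1));     // 86400000 (one full day)
console.log(cycleDurationMs(1440));  // 60000 (one minute)
```

The result is the kind of value that would be passed to the D3 transition's duration() call.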

There are three visualizations, one for each of the following geographical regions. Click “viz” to view the visualization and “github code” to view the code for the visualization:

The screenshot below shows the countries of the world. Countries that have not had any new cases over the past 7 days are shown in gray. Those that have had new cases over the past 7 days are shown as a white circle (no change from the previous 7 days), red (increase from the previous 7 days) or green (decrease from the previous 7 days).
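The color rules above could be expressed roughly as follows. This is a sketch only: the function name statusColor is hypothetical, and comparing 7-day totals (rather than, say, 7-day means) is an assumption.

```javascript
// Hypothetical helper: pick a circle color by comparing new cases in the
// most recent 7 days with new cases in the 7 days before that.
function statusColor(last7Total, prev7Total) {
    if (last7Total === 0) {
        return "gray";   // no new cases in the past 7 days
    }
    if (last7Total > prev7Total) {
        return "red";    // increase from the previous 7 days
    }
    if (last7Total < prev7Total) {
        return "green";  // decrease from the previous 7 days
    }
    return "white";      // no change from the previous 7 days
}
```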

The visualization is sorted by country by default, but the sort order can be changed to average new cases. In addition, you can toggle between showing new cases as an actual count or as new cases per million population.
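A minimal sketch of the two toggles follows. The field names name, avgCases and population, and both function names, are assumptions made for illustration and may not match the repository code.

```javascript
// Convert a raw count to a per-million-population rate.
function casesPerMillion(cases, population) {
    return (cases * 1e6) / population;
}

// Return a sorted copy of the location list.
// sortBy "avgCases": highest average new cases first.
// Anything else: alphabetical by location name (the default).
function sortLocations(locations, sortBy) {
    const sorted = locations.slice();
    if (sortBy === "avgCases") {
        sorted.sort((a, b) => b.avgCases - a.avgCases);
    } else {
        sorted.sort((a, b) => a.name.localeCompare(b.name));
    }
    return sorted;
}
```

Sorting a copy leaves the original array untouched, so toggling back to the default order does not require reloading the data.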

The visualization uses D3.js to create an SVG canvas for each location, containing the location name text and counts, the circle shape, and the transitions. D3.js is also used to retrieve the csv file and process the data, including filtering to the most recent 7 days and grouping by location to get the mean case counts.
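The filter and group-by step can be sketched in plain JavaScript as below. The row shape (location, date as an ISO "YYYY-MM-DD" string, numeric new_cases) and the function name are assumptions, and the sketch uses the latest date in the whole file as the end of the 7-day window, which may differ from how the actual code handles locations that report late.

```javascript
// rows: [{ location: "Canada", date: "2021-01-08", new_cases: 10 }, ...]
// Returns a Map of location -> mean new cases over the most recent 7 days.
function meanNewCasesLast7Days(rows) {
    if (rows.length === 0) {
        return new Map();
    }
    // ISO date strings sort correctly as plain strings.
    const latest = rows.reduce(
        (max, r) => (r.date > max ? r.date : max),
        rows[0].date
    );
    // First day of the 7-day window ending on the latest date.
    const cutoff = new Date(latest + "T00:00:00Z");
    cutoff.setUTCDate(cutoff.getUTCDate() - 6);
    const cutoffStr = cutoff.toISOString().slice(0, 10);

    // Accumulate totals and row counts per location.
    const sums = new Map();
    for (const r of rows) {
        if (r.date < cutoffStr) {
            continue;
        }
        const s = sums.get(r.location) || { total: 0, n: 0 };
        s.total += r.new_cases;
        s.n += 1;
        sums.set(r.location, s);
    }

    const means = new Map();
    for (const [location, s] of sums) {
        means.set(location, s.total / s.n);
    }
    return means;
}
```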

The most important aspect for this visualization was how to use D3.js to animate the movement of the white circle across the canvas, and how to repeat the movement in an ‘endless’ loop.

The code block below highlights the use of a function that calls D3.js .on("end", repeat); to loop through the repeat function ‘endlessly’, so that the shape is moved across the canvas, then back to its original position, to move across the canvas again and again. See the bl.ocks.org ‘Looping a transition in v5’ example.

The duration() value is the proxy for rate in this visualization and is calculated in another function separately for each location SVG. I also added a counter that would increment an SVG text value to show each loop’s count on canvas.

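// repeat the transition in an endless loop
function repeat() {
    svgShape
        .attr("cx", 150)            // start position
        .transition()
        .duration(cycleDuration)    // rate proxy, calculated per location
        .ease(d3.easeLinear)
        .attr("cx", 600)            // move across the canvas
        .transition()
        .duration(1)
        .attr("cx", 150)            // jump back to the start position
        .on("end", repeat);         // run again once the reset finishes

    // show the loop count next to the metric
    svgTextMetric
        .text(counter + ' / ' + metric);
    counter++;
}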

This visualization was inspired by Jan Willem Tulp’s COVID-19 spreading rates and Dr James O’Donoghue’s relative rotation periods of planets, and uses the same data as Tulp’s spreading rates.

Heat maps of Canadian activity changes due to COVID-19 using Google Community Mobility Reports

During the 2020 COVID-19 pandemic in Canada I wanted to get better understanding of the geographical distribution of COVID-19 related activity changes across Canada.

Google has helpfully provided freely available global “Community Mobility Reporting” which shows the change in Google location history compared to a baseline, by country and by country sub-region. It reports changes in activity by location category: Workplace, Retail & Recreation, Transit Stations, Grocery & Pharmacy, Parks, and Residential. For Canada it is available by province. As of April 19, the data contained daily values from Feb 15 to Apr 11.

The Community Mobility Reporting data is available as a single csv file for all countries at the Google Community Mobility Report site. In addition, Google provides a feature to filter for a specific country or country sub-region (e.g. state or province) and download the result in PDF format.

As the COVID-19 lockdowns occurred across Canada you would expect that people were less likely to be in public spaces and more likely to be at home. The Community Mobility Reporting location history allows us to get some insight into whether or not this happened, and if it did, to what degree and how this changed over time.

I used the Community Mobility Report data to create a D3.js heat map visualization which is described in more detail below and in this Github repository.

I also created an Excel version of this heat map visualization using Pivot Table & Chart plus conditional formatting. This Excel file, described in more detail below, is available in the Github repository.

More detail and screenshots of the visualizations are provided below:

Heatmaps
Heatmaps are grids where columns represent dates and rows represent provinces/territories. Each heatmap is a grid representing a single mobility report category. The grid cell colors represent the percent change from baseline, which can be positive or negative. As lockdowns occurred, activity at public locations decreased relative to baseline. Inversely, residential activity increased relative to baseline as people sheltered in place at their homes.
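Mapping a cell's percent change to a color can be sketched as below. This is an illustration only: the function name, the red/blue palette and the clamp at plus or minus 100 percent are all assumptions, and the actual heatmaps may use a different color scale.

```javascript
// Hypothetical diverging color scale for a heatmap cell.
// 0 (baseline) is white, decreases shade toward red,
// increases shade toward blue. Values are clamped to +/- maxAbs.
function cellColor(pctChange, maxAbs = 100) {
    const clamped = Math.max(-maxAbs, Math.min(maxAbs, pctChange));
    const t = Math.abs(clamped) / maxAbs;      // 0 = baseline, 1 = extreme
    const fade = Math.round(255 * (1 - t));    // lighter when closer to baseline
    return clamped < 0
        ? `rgb(255, ${fade}, ${fade})`         // decrease: white -> red
        : `rgb(${fade}, ${fade}, 255)`;        // increase: white -> blue
}
```

A diverging scale centered on zero suits this data because the sign of the change (below or above baseline) matters as much as its size.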

1) Heatmap created using Excel / Power Query: For this heatmap visualization the global csv data was transformed using Excel Power Query. The Excel file has two Pivot Table and Chart combos. The Excel files and Power Query M Code are available in the Github repository.

2) Heatmap created using D3.js: For this heatmap visualization the global csv data was transformed using Excel Power Query. The heatmap visualization was created using slightly modified code from ONSvisual.

Bar charts
These were created using Excel / Power Query to visualize percent change by Province/Territory and location category. These allow comparison between provinces by date and category. This Excel / Power Query file can be used for analytical purposes to slice and dice global data by date, country, sub region 1 & 2 and category. Excel files are available in the Github repository.

Choropleth map of Canada COVID-19 cases by health region using Leaflet and D3.js

During the early days of the 2020 COVID-19 pandemic in Canada, I wanted to get better understanding of the geographical distribution of COVID-19 cases across Canada.

At the time, government and news agencies were only mapping case counts by province. However, Canadian provinces are so big compared to their population centers that a province-level map doesn’t accurately reflect the actual geographic distribution of cases. It would be better to use the provincial “health regions”, which correspond much better to population centers.

So I set about to create for myself a choropleth map visualization by health regions.

View the finished choropleth map at the following link. The source data is updated daily each evening.
https://sitrucp.github.io/canada_covid_health_regions/index.html

I used Leaflet.js open-source JavaScript mapping library to create the interactive choropleth map, D3.js to retrieve and transform the csv format data, and Javascript to retrieve the JSON geographic boundary files and also to manipulate and present the data with HTML and CSS.
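A choropleth needs a rule that turns each region's case count into a fill color. A minimal sketch of such a rule follows. The break values, the colors and the name getColor are assumptions for illustration, not the values used in the actual map.

```javascript
// Hypothetical color breaks, ordered from highest to lowest threshold.
const COLOR_BREAKS = [
    { min: 1000, color: "#800026" },
    { min: 500,  color: "#e31a1c" },
    { min: 100,  color: "#fd8d3c" },
    { min: 1,    color: "#fed976" }
];

// Return the fill color for a region's case count.
// Regions with no cases fall through to white.
function getColor(cases) {
    const match = COLOR_BREAKS.find(b => cases >= b.min);
    return match ? match.color : "#ffffff";
}
```

In Leaflet, a function like this would typically be called from the style option passed to L.geoJSON, which is evaluated once per boundary feature.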

The COVID-19 case count data are obtained in csv file format from the “COVID-19 Canada Open Data Working Group”, who are an amazing group of volunteers. They have been tirelessly collating data from the various provincial and territory government agencies daily since early March. This group saves the collated and cleaned data as csv files in a Github repository https://github.com/ishaberry/Covid19Canada.

The health region geographical boundary descriptions are from Statistics Canada’s Statscan ArcGIS Health region boundary Canada dataset. These had very detailed boundaries so I simplified them using QGIS which also dramatically reduced the dataset size.

However, there were some data issues that needed to be addressed first. The Statscan health region shape file boundary names differ from those used by the provincial and territory government agencies reporting the data.

The Statscan data seems to use full-form “official” health region names, while the provincial and territory government agencies use common, more familiar, shorthand names. Also, some names appear to have changed since they were recorded in the Statscan data.

Provinces may also add or remove health regions from time to time due to administrative changes or population changes etc. So either set may have health regions that the other doesn’t have.

From a data governance perspective, in a perfect world, everyone uses a single set of health region boundary names. COVID-19 reporting has made a lot of people aware of this issue which is a silver lining in the COVID-19 dark cloud!

Addressing these name differences was actually quite simple, requiring creation of a lookup table with two columns, one for each dataset, to match the names in the boundary data files to the names in the counts data file. The lookup table can then be used dynamically when getting data each time the map is refreshed. This is described in more detail in Github repository README linked below.
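The lookup approach can be sketched as below. The region names, field names and function names here are made up for illustration. The real lookup table and its column names are described in the repository README.

```javascript
// Hypothetical lookup rows: one column per dataset.
const lookupRows = [
    { boundaryName: "Example Regional Health Authority", caseDataName: "Example" },
    { boundaryName: "Sample Health Services Zone",       caseDataName: "Sample Zone" }
];

// Build a Map from boundary-file name to case-data name.
function buildNameLookup(rows) {
    return new Map(rows.map(r => [r.boundaryName, r.caseDataName]));
}

// Find the case count for a boundary feature.
// Returns null when the boundary name has no match in the lookup table,
// and 0 when the region is matched but has no reported cases.
function casesForBoundary(boundaryName, lookup, countsByCaseName) {
    const caseName = lookup.get(boundaryName);
    if (caseName === undefined) {
        return null;
    }
    return countsByCaseName.has(caseName) ? countsByCaseName.get(caseName) : 0;
}

const lookup = buildNameLookup(lookupRows);
const counts = new Map([["Example", 42]]);
console.log(casesForBoundary("Example Regional Health Authority", lookup, counts)); // 42
```

Returning null for unmatched names keeps them distinguishable from regions with zero cases, which helps spot health regions that were added or renamed in one dataset but not the other.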

Code for this project is maintained in Github:  github.com/sitrucp/canada_covid_health_regions.

I also created a separate choropleth map for Montreal, where I was living at the time, which was Canada’s COVID-19 “hotspot” with about 25-30% of Canada’s total COVID-19 cases. However, the Montreal data source has since been discontinued so the map is archived now.

View archived Montreal map here:
https://sitrucp.github.io/canada_covid_health_regions/montreal/index.html

AWS S3 csv file as D3 report data source

This is an example of how to read a csv file retrieved from an AWS S3 bucket as a data source for a D3 javascript visualization.

The D3 visualization would be an HTML document hosted on a web server. 

You will use the AWS SDK to get the csv file from the S3 bucket and so you need to have an AWS S3 bucket key and secret but I won’t cover that in this post.

The key point of this post is to highlight that the data returned by the bucket.getObject function is read into D3 using d3.csv.parse(data.Body.toString());

Another note is that d3.csv.parse is for D3 version 3. Newer versions (version 4 and later) use d3.csvParse.
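Because the parser name depends on the D3 version loaded on the page, a small helper can pick whichever one is present. This is a sketch under that assumption. The helper name getCsvParser is hypothetical and is not part of D3.

```javascript
// Hypothetical helper: return a CSV-text parser from the supplied d3 object,
// whichever naming convention that D3 version uses.
function getCsvParser(d3lib) {
    if (typeof d3lib.csvParse === "function") {
        // D3 version 4 and later
        return function (text) { return d3lib.csvParse(text); };
    }
    if (d3lib.csv && typeof d3lib.csv.parse === "function") {
        // D3 version 3
        return function (text) { return d3lib.csv.parse(text); };
    }
    throw new Error("No CSV parser found on the supplied d3 object");
}
```

On a page with D3 loaded, usage would look like var parseCsv = getCsvParser(d3); followed by parseCsv(data.Body.toString()).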

Once implemented, whenever the webpage is refreshed it retrieves latest csv file from the S3 bucket and the D3 visualization is updated.

<script src="https://sdk.amazonaws.com/js/aws-sdk-2.6.3.min.js"></script>

<script type="text/javascript">

// aws key and secret (note these should be retrieved from server not put as plain text into html code)
AWS.config.accessKeyId = 'xxxxxxxxxxxxxxxxxxxxxxx';
AWS.config.secretAccessKey = 'xxxxxxxxxxxxxxxxxxxxxxx';
AWS.config.region = 'us-east-1';

// create the S3 service object
var bucket = new AWS.S3();

// use AWS SDK getObject to retrieve csv file
bucket.getObject({
    Bucket: 'my-S3-bucket',
    Key: 'myfile.csv'
},

// callback function that receives the retrieved data
function awsDataFile(error, data) {
    if (error) {
        return console.log(error);
    }

    // this is where the magic happens: d3.csv.parse reads data.Body.toString()
    var myCSVdata = d3.csv.parse(data.Body.toString());

    // now loop through the data and convert the fields needed for the visualization to numbers
    var counter = 0;
    myCSVdata.forEach(function(d) {
        d.field1 = +d.field1;
        d.field2 = +d.field2;
        counter++; // count of rows processed
    });

    // now you can create the rest of the D3 visualization here
    // for example like this example https://gist.github.com/d3noob/4414436

    // my D3 visualization code here

// this closes bucket.getObject
});

</script>