Articles on the CBC (Canadian Broadcasting Corporation) news website often have a comments section. It would be interesting to see how comments and replies interact, which users comment the most, and which words and phrases are used most frequently.
See the results: https://sitrucp.github.io/cbc_comments/image_grid.html
Comments for a specific CBC opinion article are analysed in detail below.
See a previous post which details how to obtain comments from CBC news and opinion articles. Code for this project can be found in this GitHub repository.
The opinion article was titled “On COVID restrictions, our governments keep firing up the gaslights and shifting the goalposts”. This article garnered 7,800 comments by 1,226 unique users. The comment and user counts include both posts and replies. The comments were posted over a two-day period beginning Dec 03, 2021 4:00 AM ET, after which the comments were locked.
The line chart below shows that 50% (about 615) of the 1,226 users made 90% of the comments. Only 9% of the users (about 105) made 50% of the comments!
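To illustrate how a concentration figure like this can be worked out, here is a minimal sketch. It assumes the comments are available as a plain list of author names, one entry per comment; the function name and the toy data are made up for illustration and are not taken from the project code.

```python
from collections import Counter


def users_needed_for_percent(comment_authors, percent):
    """Return the smallest number of top commenters whose comments
    together make up at least `percent` (0-100) of all comments."""
    # Comment counts per user, largest first.
    counts = sorted(Counter(comment_authors).values(), reverse=True)
    total = sum(counts)
    running = 0
    for n_users, count in enumerate(counts, start=1):
        running += count
        # Integer comparison avoids floating point rounding issues.
        if running * 100 >= percent * total:
            return n_users
    return len(counts)


# Toy data (hypothetical): 10 comments by 4 users.
authors = ["ann"] * 5 + ["bob"] * 3 + ["cy"] + ["dee"]
print(users_needed_for_percent(authors, 50))
print(users_needed_for_percent(authors, 90))
```

Dividing the returned user count by the number of unique users gives the share of users behind that percentage of comments.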
The “word cloud” chart below shows the names of the top 200 users by comment and reply count. The name size corresponds to user comment and reply counts.
Of the 7,800 comments, 1,744 (22%) were “top-level” comments, i.e. they were not directly replying to another comment. The remaining 6,056 (78%) were replies to another comment. This indicates a lot of interaction between commenters.
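The split between top-level comments and replies can be sketched as follows. This assumes each comment is a dict with a "parent_id" key that is None for top-level comments; that field name is a guess for illustration and may not match the actual CBC comment data.

```python
def split_top_level(comments):
    """Split comments into top-level posts and replies.

    Assumes a hypothetical "parent_id" key that is None when the
    comment is not a reply to another comment.
    """
    top_level = [c for c in comments if c["parent_id"] is None]
    replies = [c for c in comments if c["parent_id"] is not None]
    return top_level, replies


def percent(part, total):
    """Whole-number percentage of `part` in `total`."""
    return round(100 * part / total)


# Toy data (hypothetical): one top-level comment and two replies.
sample = [
    {"id": 1, "user": "ann", "parent_id": None},
    {"id": 2, "user": "bob", "parent_id": 1},
    {"id": 3, "user": "ann", "parent_id": 2},
]
top_level, replies = split_top_level(sample)
print(len(top_level), len(replies))

# The article's own counts: 1,744 top-level and 6,056 replies out of 7,800.
print(percent(1744, 7800), percent(6056, 7800))
```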
The series of “network” charts below provides some insight into the interactions between users through their comments and replies.
The network charts were created using the Python NetworkX module. The code used to create the NetworkX charts is in another post.
The red circles (“nodes”) are users. The circle size corresponds to the user's comment count. The lines (“edges”) connecting the red circles represent interactions between users as replies to comments. The line arrows indicate who was replying to whom.
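As a rough sketch of how such a graph might be assembled (not the code from the other post), the example below counts replies between pairs of users and loads them into a NetworkX directed graph. The "id", "user" and "parent_id" keys and both function names are assumptions made for illustration; nx.DiGraph, add_node and add_edge are standard NetworkX calls.

```python
from collections import Counter


def reply_edges(comments):
    """Count replies per (replier, replied-to user) pair."""
    author_of = {c["id"]: c["user"] for c in comments}
    edges = Counter()
    for c in comments:
        parent = c["parent_id"]
        # Skip top-level comments and replies whose parent is missing.
        if parent is not None and parent in author_of:
            edges[(c["user"], author_of[parent])] += 1
    return edges


def build_graph(comments):
    """Build a directed graph: nodes are users, edges point from the
    replier to the user being replied to."""
    import networkx as nx  # third-party: pip install networkx

    graph = nx.DiGraph()
    for user, n in Counter(c["user"] for c in comments).items():
        graph.add_node(user, comment_count=n)  # can drive the circle size
    for (replier, target), weight in reply_edges(comments).items():
        graph.add_edge(replier, target, weight=weight)
    return graph


# Toy data (hypothetical field names and users).
sample = [
    {"id": 1, "user": "ann", "parent_id": None},
    {"id": 2, "user": "bob", "parent_id": 1},
    {"id": 3, "user": "ann", "parent_id": 2},
    {"id": 4, "user": "bob", "parent_id": 1},
]
edges = reply_edges(sample)
print(edges)
```

Calling build_graph(sample) returns a graph that can be passed to NetworkX's layout and drawing functions, with the comment_count node attribute used to scale the node sizes.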
The first chart is a whole view of the 1,140 users who had at least one reply to their comment. It has 1,140 nodes and 6,000 edges, so it makes for a very dense visualization and a large image. Click on the image to open it in your browser, where you can zoom in and download it if you want.
A closer look below shows more detail. The center of the chart has the users with the greatest number of comments and replies. The outer edges show users with fewer comments and replies.
Another closer look shows even more detail of the sparse, low comment and reply count users on the edges of the chart.
This final “word cloud” visualization shows the top 200 words in all of the comments.
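However the chart itself was produced, the word counts behind a word cloud can be sketched with the standard library alone. The stop-word list here is a small made-up sample, and the tokenising rule (lowercase letters and apostrophes) is an assumption, not necessarily what was used for the chart above.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not the real one).
STOP_WORDS = {"the", "a", "an", "and", "are", "to", "of", "is", "in", "it", "that"}


def top_words(texts, n=200, stop_words=STOP_WORDS):
    """Return the n most frequent words across all comment texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        # Drop stop words and single-character tokens.
        counts.update(w for w in words if w not in stop_words and len(w) > 1)
    return counts.most_common(n)


# Toy data (hypothetical comment texts).
sample_texts = [
    "The rules keep changing",
    "Rules are rules, and the goalposts keep moving",
]
print(top_words(sample_texts, n=2))
```

The resulting (word, count) pairs are the kind of input a word cloud tool needs to size each word.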