TendiesTown.com – WallStreetBets gain and loss analysis

For a 16 month period, from February 2020 to June 2021, TendiesTown.com used the Reddit API in an automated process to retrieve r/wallstreetbets posts tagged with “Gain” or “Loss” flair. The subReddit r/wallstreetbets is a stock and option trading discussion board that gained popularity during the 2020/2021 “meme stock” media frenzy.

The retrieved posts were automatically categorized by gain or loss, post date, Reddit username, trade verification image url, and post url.

These posts were then manually processed to identify the gain or loss amount and reported stock ticker(s).

As of June 9, 2021, there were 4,582 trades with 2,585 gains and 1,997 losses recorded. Gain amounts totalled USD $389 million and losses USD $117 million.

While 6,844 posts were retrieved from the subreddit, 2,262 (33%) of these were rejected because they were either duplicate posts, not actual gain or loss trades (per r/wallstreetbets flair guidelines) or it was not possible to identify the amount.

The trades and more detailed analysis are available on the tendiestown.com website. As of June 9, 2021, the manual processing is no longer being done, so no new data will appear on the site.

“Gain” and “Loss” flair are described in more detail in the community flair guidelines but essentially “Gain” = trade made money and “Loss”= trade lost money.

Table 1: Summary statistics

Gains Losses
Count 2,585 1,997
Sum $389,329,571 $117,347,985
Avg $150,611 $58,762
Median $24,138 $12,120
Min $138 $100
Max $8,000,000 $14,776,725

The 4,582 trades were made by 3,903 unique Reddit users. 492 of these Reddit users have more than one trade as shown in table 2 below.

Table 2: Trade counts by unique Reddit users

# trades count of Reddit users
1 3,411
2 378
3 73
4 22
5 11
6 5
7 1
8

2

 

Charts 1 & 2: Daily and cumulative counts

Bar charts daily count of gain and loss

Bar charts cumulative count of gain and loss

Charts 3 & 4: Daily and cumulative amounts

Bar charts daily amount of gain and loss

Bar charts cumulative amount of gain and loss

Notable observations

Early 2021 increase in trades

The count and amount of trades rose significantly in the early months of 2021.  This can be explained by the surge of new users to the r/wallstreetbets subreddit due to huge increase in popular media reporting surrounding the GME, AMC meme stock craze.

Gains greater than losses

Gains consistently lead losses over the 16 month period.  Rather than simply concluding that r/wallstreetbets traders gain more than they lose, it suspected that the variance can be explained due to fact that it is easier to tell the world that you won, and harder to say you have lost.

Django form geocode submitted address to get lat, lon and postal code

One of my Django applications includes a form where user can enter and submit a property address.

django form

The user submitting the form might not know the postal code so I left it optional. However the postal code is a key piece of information for this particular application so I wanted to ensure that I was getting it.

I also wanted to geocode the address immediately to get the address latitude and longitude so it could be shown to user on a Leaflet.js map.

There are lots of free geocoding services for low intensity usage but I ended up using Google Geocoding which is free under certain usage level. You just need to create a Geocoding API project and use the credentials to set up the geocoding.

To interact with the geocoding API I tried Python gecoding modules geopy and geocoder but in the end just used Python Requests module instead as it was less complicated.

When the user clicked the Submit button, behind the scenes, Requests submitted the address to Google’s Geocoding API, gets the JSON response containing the latitude, longitude and postal code which are then written to the application database.

I will update the code in future to check if the user’s postal code is correct and replace it if it is incorrect. Will wait to see how the postal code accuracy looks. Making geocoding API requests too often could bump me over the free usage limit.

The Django View that contains the code described above is shown below.

def property_add(request):
   
    property_list = Property.objects.filter(user_id=request.user.id).order_by('created')
    
    if request.method == 'POST':
        form = PropertyForm(request.POST)
        if form.is_valid():
            new_property = form.save(commit=False)
            address = "%s, %s, %s, %s" % (new_property.address1, new_property.city, new_property.state, new_property.postcode)
            google_geocode_key = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxx'
            url = 'https://maps.googleapis.com/maps/api/geocode/json?address=' + "'" + address + "'" + '&key=' + google_geocode_key
            
            try:
                response = requests.get(url)
                geoArray = response.json()
                new_property.lat = geoArray['results'][0]['geometry']['location']['lat']
                new_property.lon = geoArray['results'][0]['geometry']['location']['lng']
                new_postcode = geoArray['results'][0]['address_components'][7]['long_name']
                new_fsa = geoArray['results'][0]['address_components'][7]['short_name'][:3]
            except:
                new_property.lat = None
                new_property.lon = None
                new_postcode = None
                new_fsa = None
           
            if new_property.postcode:
                new_property.fsa = new_property.postcode[:3]
            else:
                new_property.postcode = new_postcode
                new_property.fsa = new_fsa
           
            new_property.user_id = request.user.id
            new_property = form.save()
            return HttpResponseRedirect(reverse(property, args=(new_property.pk,)))
    else:
        form = PropertyForm()

    context_dict = {
        'form': form, 
        'property_list': property_list,
    }
        
    return render(
        request,
        'property_form.html',
        context_dict,
        context_instance = RequestContext(
            request,
            {
                'title':'Add Property',
             }
            )
    )    

 

Leaflet.js choropleth map color by count using geoJSON datasource

I have a Django web application that needed an interactive map with shapes corresponding to Canadian postal code FSA areas that were different colors based on how many properties were in each FSA. It ended up looking something like the screenshot below.

map1

This exercise turned out to be relatively easy using the awesome open-source Javascript map library Leaflet.js.

I used this Leaflet.js tutorial as the foundation for my map.

One of the biggest challenges was finding a suitable data source for the FSAs. Chad Skelton (now former) data journalist at the Vancouver Sun wrote a helpful blog post about his experience getting a suitable FSA data source. I ended up using his BC FSA data source for my map.

Statistics Canada hosts a Canada Post FSA boundary files for all of Canada. As Chad Skelton notes these have boundaries that extend out into the ocean among other challenges.

Here is a summary of the steps that I followed to get my choropleth map:

1. Find and download FSA boundary file. See above.

2. Convert FSA boundary file to geoJSON from SHP file using qGIS.

3. Create Django queryset to create data source for counts of properties by FSA to be added to the Leaflet map layer.

4. Create Leaflet.js map in HTML page basically the HTML DIV that holds the map and separate Javascript script that loads Leaflet.js, the FSA geoJSON boundary data and processes it to create the desired map.

Find and download FSA boundary file.

See above.

Convert FSA boundary file to geoJSON from SHP file using qGIS.

Go to http://www.qgis.org/en/site/ and download qGIS. Its free and open source.

Use qGIS to convert the data file from Canada Post or other source to geoJSON format. Lots of blog posts and documentation about how to use qGIS for this just a Google search away.

My geoJSON data source looked like this:

var bcData = {
    "type": "FeatureCollection",
    "crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:EPSG::4269" } },
    "features": [
    { "type": "Feature", "properties": { "CFSAUID": "V0A", "PRUID": "59", "PRNAME": "British Columbia \/ Colombie-Britannique" }, "geometry": { "type": "MultiPolygon", "coordinates": [ [ [ [ -115.49499542, 50.780018587000029 ], [ -115.50032807, 50.77718343600003 ], [ -115.49722732099997, 50.772528975000057 ], [ -115.49321284, 50.770504059000075 ], [ -115.49393662599999, 50.768143038000062 ], [ -115.50289288699997, 50.762270941000054 ], [ -115.50846411599997, 50.754243300000041 ], [ -115.5104796, 50.753297703000044 ], [ -115.51397592099994, 50.748953800000038 ], [ -115.51861431199995, 50.745737989000077 ], [ -115.52586378899997, 50.743771099000071 ], [ -115.53026371899995, 50.74397910700003 ], [ -115.53451319199996,

 

Create Django queryset to create data source for counts of properties by FSA to be added to the Leaflet map layer.

I used a SQL query in the Django View to get count of properties by FSA.

This dataset looks like this in the template. These results have only one FSA, if it had more it would have more FSA / count pairs.

Below is code for  the Django view query to create the fsa_array FSA / counts data source.

    

cursor = connection.cursor()
    cursor.execute(
    "select fsa, count(*) \
    from properties \
    group by fsa \
    order by fsa;")
    fsas_cursor = list(cursor.fetchall())

    fsas_array = [(x[0].encode('utf8'), int(x[1])) for x in fsas_cursor]

My Javascript largely retains the Leaflet tutorial code with some modifications:

1. How the legend colors and intervals are assigned is changed but otherwise legend functions the same.

2. Significantly changed how the color for each FSA is assigned. The tutorial had the color in its geoJSON file so only had to reference it directly. My colors were coming from the View so I had to change code to include new function to match FSA’s in both my Django view data and the geoJSON FSA boundary file and return the appropriate color based on the Django View data set count.

var fsa_array = [["V3J", 19]];

var map = L.map('map',{scrollWheelZoom:false}).setView([ active_city_center_lat, active_city_center_lon], active_city_zoom);

map.once('focus', function() { map.scrollWheelZoom.enable(); });

var fsa_array = fsas_array_safe;

L.tileLayer('https://api.tiles.mapbox.com/v4/{id}/{z}/{x}/{y}.png?access_token=pk.eyJ1IjoibWFwYm94IiwiYSI6ImNpandmbXliNDBjZWd2M2x6bDk3c2ZtOTkifQ._QA7i5Mpkd_m30IGElHziw', {
    maxZoom: 18,
    attribution: 'Map data ©' + OpenStreetMap CC-BY-SA' + 'Imagery ©' + 'Mapbox'
    id: 'mapbox.light'
}).addTo(map);

// control that shows state info on hover
var info = L.control();

info.onAdd = function (map) {
    this._div = L.DomUtil.create('div', 'info');
    this.update();
    return this._div;
};

info.update = function (props) {
    this._div.innerHTML = (props ?
        '' + props.CFSAUID + ' ' + getFSACount(props.CFSAUID) + ' lonely homes' 
        : 'Hover over each postal area to see lonely home counts to date.');
};

info.addTo(map);

// get color 
function getColor(n) {
    return n > 30 ? '#b10026'
           : n > 25 ? '#e31a1c' 
           : n > 25 ? '#fc4e2a' 
           : n > 20 ? '#fd8d3c'
           : n > 15  ? '#feb24c'
           : n > 10  ? '#fed976'
           : n > 5  ? '#ffeda0'
           : n > 0  ? '#ffffcc'
           : '#ffffff';
}     

function getFSACount(CFSAUID) {
    var fsaCount;
    for (var i = 0; i < fsa_array.length; i++) {
        if (fsa_array[i][0] === CFSAUID) {
            fsaCount = ' has ' + fsa_array[i][1];
            break;
        }
    }
    if (fsaCount == null) {
         fsaCount = ' has no '; 
    }
    return fsaCount;
}

function getFSAColor(CFSAUID) {
    var color;
    for (var i = 0; i < fsa_array.length; i++) {
    if (fsa_array[i][0] === CFSAUID) {
        color = getColor(fsa_array[i][1]);
        //console.log(fsa_array[i][1] + '-' + color)
        break;
        }
    }
    return color;
}
    
function style(feature) {
    return {
        weight: 1,
        opacity: 1,
        color: 'white',
        dashArray: '3',
        fillOpacity: 0.7,
        fillColor: getFSAColor(feature.properties.CFSAUID)
    };
}

function highlightFeature(e) {
    var layer = e.target;
    layer.setStyle({
        weight: 2,
        color: '#333',
        dashArray: '',
        fillOpacity: 0.7
    });

    if (!L.Browser.ie && !L.Browser.opera) {
        layer.bringToFront();
    }

    info.update(layer.feature.properties);
}

var geojson;

function resetHighlight(e) {
    geojson.resetStyle(e.target);
    info.update();
}

function zoomToFeature(e) {
    map.fitBounds(e.target.getBounds());
}

function onEachFeature(feature, layer) {
    layer.on({
        mouseover: highlightFeature,
        mouseout: resetHighlight,
        click: zoomToFeature
    });
}

geojson = L.geoJson(bcData, {
    style: style,
    onEachFeature: onEachFeature
}).addTo(map);

var legend = L.control({position: 'bottomright'});

legend.onAdd = function (map) {

    var div = L.DomUtil.create('div', 'info legend'),
        grades = [0, 1, 5, 10, 15, 20, 25, 30],
        labels = [],
        from, to;

    for (var i = 0; i < grades.length; i++) {
        from = grades[i];
        if (i === 0) {
            var_from_to = grades[i];
            var_color = getColor(from);
        } else {
            var_from_to =  from + (grades[i + 1] ? '–' + grades[i + 1] : '+') ;
            var_color = getColor(from + 1);
        }
        
        labels.push(
            ' ' +
             var_from_to);
    }

    div.innerHTML = labels.join('');
    return div;
};

legend.addTo(map);

That is pretty much all there is to creating very nice looking interactive free open-source choropleth maps for your Django website application!

Django recreate database table

Django’s makemigrations and migrate commands are very useful to update existing database tables to reflect model changes.

However if you have made many existing table column name changes, migrate will ask you a series of ‘y/N’ questions about which column names are changed. This can be tedious to cycle through especially if there are many changes.

Depending on the relationships your table has, it may be easier and quicker to:

  • Create a backup of the table by copying and renaming table or exporting table data to csv
  • Drop the table
  • Recreate table from scratch
  • Reload data into the new updated table

The question is how to recreate the table?

After you drop the table you can remove your table model from the models.py field and then run makemigrations and then run migrate —fake which is the special trick to get past migrate wanting your table to exist before it can delete it.

Then after you run migrate –fake, you can put your update model for your table back into your models.py and then makemigrations and migrate and you will get your new updated table recreated in database.

Then you can recover your data from the backup with SQL INSERT or by using database data import feature.