The Geography of the Super Bowl

Charting the Location of Twitter Users during the 2017 Super Bowl

I started thinking about this project while I was building the Twitter Stream project. Twitter has an API that lets you filter and stream tweets by hashtag, specific user, or geographic area. I collected every available tweet mentioning either team for the duration of the 2017 Super Bowl, about 2,000,000 tweets in total. The analysis was done in Python. This page serves up a static .csv file with total count and proportion data.
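To give a feel for the kind of summary file involved, here is a minimal sketch of how per-state counts and proportions could be tallied and written as CSV. It is not the code behind this page: the function names, the column names, the toy input, and the reading of "proportion" as each state's share of all counted tweets are assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def tally_states(states):
    """Count tweets per state, skipping tweets with no resolved state."""
    return Counter(s for s in states if s)


def write_summary(counts, out):
    """Write one row per state: state, count, share of all counted tweets."""
    total = sum(counts.values())
    writer = csv.writer(out)
    writer.writerow(["state", "count", "proportion"])
    if total == 0:
        return  # nothing to summarize, header only
    for state, n in sorted(counts.items()):
        writer.writerow([state, n, round(n / total, 4)])


# Toy input (invented): one state code per tweet, None where unknown.
sample = ["GA", "MA", "GA", None, "TX", "GA", "MA"]
buffer = io.StringIO()
write_summary(tally_states(sample), buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file handle instead would produce the static file.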

The first chart shows the total number of tweets. It looks pretty much as you would expect: dominated by the largest states and the two states principally involved in the game. Twitter rate-limits streaming through the API. Usually it limits streaming to 1/10 of the total stream, and restricts it further during periods of high traffic like the Super Bowl, so the counts shown here are far lower than the true tweet totals.


total number of tweets

The first chart suffers from the geographic profile effect: it is largely a map of where people live. The second chart, which I think is more interesting, shows tweets per person.
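The per-person figure is just each state's tweet count divided by its population. A minimal sketch of that normalization follows, assuming counts and populations are available as plain dictionaries. The function name and the numbers are invented for illustration and are not the data behind the charts.

```python
def tweets_per_person(tweet_counts, populations):
    """Divide each state's tweet count by its population.

    States with a missing or zero population are skipped rather than guessed.
    """
    rates = {}
    for state, n in tweet_counts.items():
        pop = populations.get(state)
        if pop:
            rates[state] = n / pop
    return rates


# Invented toy numbers, chosen only to show the effect of normalizing.
toy_counts = {"Big State": 30_000, "Small State": 5_000}
toy_populations = {"Big State": 30_000_000, "Small State": 1_000_000}

rates = tweets_per_person(toy_counts, toy_populations)
# Rank states from most to fewest tweets per person.
ranked = sorted(rates, key=rates.get, reverse=True)
print(ranked, rates)
```

In this toy case the big state has six times as many tweets but the small state has five times the per-person rate, which is why the two charts can rank states so differently.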


tweets per person

This chart shows much more clearly the parties involved in the Super Bowl (in case you've forgotten, the game was between the New England Patriots and the Atlanta Falcons). Lots to comment on here. One thing is apparent in both charts: Texans love football. It's interesting that Nevada tweets a lot about the Super Bowl. I imagine this is due to legalized sports gambling. In states without an NFL team, people apparently do not watch the Super Bowl. I'm shocked.

South Carolinians tweeted the most frequently (0.0039 tweets/person), at nearly twice the rate of the second-highest state, Georgia. South Carolinians must be huge Falcons fans or, at the very least, relished their neighbor's misery. Massachusettans(?) tweeted at a rate of 0.0015 tweets/person. I also appreciate that Alaska and Hawaii, despite the geographical distance, seemingly talk about the Super Bowl quite a lot. Keep in mind that because of API rate limiting, this is not the true rate of tweeting and should only be thought of as a vague 'frequency metric'.