Makeover Monday Social Data and Visualisation Challenge - 02/07/2018

Rats!
They fought the dogs and killed the cats,
And bit the babies in the cradles,
And ate the cheeses out of the vats,
And licked the soup from the cooks’ own ladles,
Split open the kegs of salted sprats,
Made nests inside men’s Sunday hats,
And even spoiled the women’s chats,
By drowning their speaking
With shrieking and squeaking
In fifty different sharps and flats.
— Robert Browning (1842), The Pied Piper of Hamelin

Some time ago, NYC Open Data released a dataset of 311 calls reporting rat sightings between 2010 and 2017. This week's Makeover Monday challenge is to create a new visualisation that best shows what this data has to say.

I came across Makeover Monday only relatively recently, but I already see it as a great exercise: I am set a challenge, complete it in a limited space of time (1 hour), get to grips with an unfamiliar platform (Tableau) and think about how to improve an existing visualisation as a critical friend.

For the uninitiated, Makeover Monday is a social data project. As you might be able to guess, each Monday a new visualisation challenge is released: people take an existing visualisation and its accompanying dataset, unleash their creativity and knowledge to create something new, and share their work on social media and via Tableau Public. At this point in my Tableau journey, I'm probably just about able to participate, and I'm hoping that future efforts show a clear progression in both technical ability and how visually pleasing my visualisations are!

The subject matter of this week's challenge is admittedly slightly ghoulish but worth examining in detail. As ever with anything like this, I couldn't help but read up a little on the subject beforehand. This map by NBC New York depicts the stomach-churning concept of 'rat reservoirs', spurred by the death of a man from leptospirosis. Rat reservoirs are rodent hot zones and can sometimes be correlated with higher levels of rat sightings in a particular area. Hence, the NYC Department of Health and Mental Hygiene targets such areas for extra exterminator duties. I have to say that, while typing this, I have a strong urge to stand on my chair and scream for a constable. The serious point is that rats are a severe threat to human health, which merits the level of data collection that is undertaken. I have to take my hat off to a public authority for releasing a dataset like this. Some national and regional governments would balk at doing so, for fear that it would be perceived as evidence of vermin rather than of the steps a government is taking to address a human health problem.

I've included a link to my visualisation on Tableau Public below.  

The original visualisation that spurred this challenge plotted sightings by day, with a trend line showing that sightings had increased over time. In my own visualisation I have adapted this to track both the aggregate number of sightings per month and a cumulative sum over the course of each year, in order to compare performance year on year. Sure enough, the number of rat sightings is increasing; yet it is difficult to establish from the dataset whether New Yorkers are getting better at rat detection, the number of rats is increasing, or both.
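For anyone who would rather see the monthly count and the yearly running total spelled out than click through Tableau, here is a minimal Python sketch of the same idea. I built mine in Tableau, so this is not the method I used; the function names and the sample dates are invented purely for illustration and are not taken from the NYC dataset.

```python
from collections import Counter
from datetime import date
from itertools import accumulate


def monthly_counts(sighting_dates):
    """Count sightings per (year, month) pair."""
    return Counter((d.year, d.month) for d in sighting_dates)


def running_totals_by_year(counts):
    """Return {year: [running total for Jan..Dec]}, restarting every January."""
    years = sorted({year for year, _ in counts})
    return {
        year: list(
            accumulate(counts.get((year, month), 0) for month in range(1, 13))
        )
        for year in years
    }


# Made-up dates for illustration only; not real sightings.
sample = [
    date(2016, 1, 5), date(2016, 1, 20), date(2016, 3, 2),
    date(2017, 1, 9), date(2017, 2, 14), date(2017, 2, 15), date(2017, 3, 1),
]

counts = monthly_counts(sample)
totals = running_totals_by_year(counts)

for year, series in totals.items():
    # Print the running total for the first three months of each year.
    print(year, series[:3])
```

Because the running total restarts each January, the lines for different years can be laid over one another on a single January to December axis, which is what makes the year on year comparison readable.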

As someone who works in public policy in the real estate sector, I'm naturally curious whether certain building types see more rat sightings than others. 3+ bed apartments come out top. It turns out that the author of the original visualisation had drawn a similar graph, as I found out afterwards when I located their website. My visualisation, however, plots the count on a logarithmic scale on the x axis, which I think gives a clearer understanding of the scale of sightings outside the five largest categories of buildings and avoids superimposing numbers on the bar chart. That said, I think this is a design and aesthetics choice.
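To show why a logarithmic axis helps the smaller categories, here is a rough Python sketch that draws the same counts as text bars on a linear and on a log10 scale. The category names and counts are hypothetical and are not figures from the dataset; the helper `bar_lengths` is my own invention for this example.

```python
import math

# Hypothetical counts per building type, for illustration only.
counts = {"Type A": 40000, "Type B": 9000, "Type C": 600, "Type D": 45}


def bar_lengths(values, width=40, log=False):
    """Scale each value to a bar length in characters.

    With log=True the lengths are proportional to log10(value), so every
    value must be greater than zero.
    """
    scale = math.log10 if log else (lambda v: v)
    top = max(scale(v) for v in values.values())
    return {name: round(width * scale(v) / top) for name, v in values.items()}


linear = bar_lengths(counts)
logged = bar_lengths(counts, log=True)

for name in counts:
    print(f"{name:7} linear {'#' * linear[name]:40} log {'#' * logged[name]}")
```

On the linear scale the smallest category rounds down to no bar at all, while on the log scale it stays visible. The trade-off is that bar lengths no longer compare in direct proportion, which is part of why I see this as a design choice rather than a clear improvement.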

Finally, I created a map and tried to plot sightings by neighbourhood, a lower level of aggregation than borough. This required a little feature engineering, but ultimately I faced a choice between meeting the time limit I had given myself and delivering a fully accurate feature; when time ran out, a small number of null values remained.

Conclusions

I gave myself an hour to complete the visualisation, so not everything is as I would have wished. Yet the great thing is that, in learning by doing, I uncover a whole list of things to review and look into. Among these is how to use hexbins to create geographic heatmaps, which would have been a nice addition to the map for determining where sightings are densest geographically.
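As a note to my future self on hexbins, the underlying idea can be sketched in a few lines of Python: assign each point to the hexagon that contains it, then count points per hexagon. This is a generic hexagonal binning routine (pointy-top hexagons with cube-coordinate rounding), not what Tableau does internally, and it treats coordinates as flat x/y values. Real longitude and latitude would need projecting first, since a degree of longitude is shorter than a degree of latitude in New York. The points below are made up.

```python
import math
from collections import Counter


def hex_cell(x, y, size):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y).

    `size` is the distance from a hexagon's centre to one of its corners.
    """
    # Fractional axial coordinates of the point.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size

    # Convert to cube coordinates, which always sum to zero, and round each.
    cx, cz = q, r
    cy = -cx - cz
    rx, ry, rz = round(cx), round(cy), round(cz)

    # Rounding can break the sum-to-zero rule, so recompute the coordinate
    # that moved furthest from its fractional value.
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy <= dz:
        rz = -rx - ry
    # Otherwise ry had the largest error; it is not returned, so nothing to fix.

    return rx, rz


# Made-up planar points for illustration only.
points = [(0.0, 0.0), (0.1, 0.1), (1.7, 0.1), (1.75, -0.05)]

density = Counter(hex_cell(x, y, size=1.0) for x, y in points)
print(density)
```

Colouring each hexagon by its count is what gives the heatmap effect, and a smaller `size` gives finer cells at the cost of noisier counts.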

More Resources

In addition to Makeover Monday, there is also the Monthly Storytelling with Data Challenge.

Makeover Monday: Percentage of Patients Spending 4 Hours of Less in A&E by Provider

Why I'm Studying Data Science