Here’s Waldo: Computing the optimal search strategy for finding Waldo


As I found myself unexpectedly snowed in this weekend, I decided to take on a weekend project for fun. While searching for something to catch my fancy, I ran across an old Slate article claiming that they found a foolproof

Posted in analysis, data visualization

Python usage survey 2014


Remember that Python usage survey that went around the interwebs late last year? Well, the results are finally out and I’ve visualized them below for your perusal. This survey has been running for two years now (2013-2014), so where we

Posted in data visualization, python

The Shrinking Battleground: Every 4 years, fewer states determine the outcome of the Presidential election


Every 4 years, Americans are tasked to elect the leader of one of the largest democratic nations in the world. Eager to have their voices heard, U.S. citizens from every state stand in line to cast their ballot for their

Posted in data visualization

A data-driven guide to creating successful reddit posts, redux


A couple of years ago, I wrote an article using a massive data set of reddit posts to tackle one of the more popular questions about reddit: How do I get a highly-upvoted post on reddit? In light of the recent findings

Posted in analysis, data visualization, reddit

Over half of all reddit posts go completely ignored


A couple years ago, Eric Gilbert published a research article showing that more than half (~52%) of all popular links submitted to /r/pics go completely ignored the first time they’re posted. I found this phenomenon to be strange because those

Posted in data visualization, reddit

The biggest box office booms and busts since 1982


If you read my last post about the correlation between a film’s budget and its performance at the box office, you were possibly intrigued by my mentions of the biggest box office successes and failures. I decided to focus on

Posted in data visualization

Does a bigger film production budget result in more ticket sales?


If you take a stroll down a list of the most expensive films of all time, you’ll notice that most of the films are from the past 15 years. Every year, more and more money is being poured into producing

Posted in data visualization

Top 10 grossing film studios in the U.S. (1982-2014)


Around this time last year, I kicked off a series of movie analysis posts to wrap up the year. In keeping with tradition, I figured I’d do the same this year. This year, I’ll be looking at box office sales

Posted in data visualization

Top 25 most gender-neutral names in the U.S.


As a long-time fan of Saturday Night Live, I have fond memories of the Pat sketch where Pat’s friends were always trying to figure out his/her gender through a series of hilarious indirect tests. Despite their every effort — from

Posted in data visualization

What caused the upsurge of unique American baby names in the 1970s?


Last week, I was exploring the ever-popular U.S. baby names data set and noticed a peculiar trend: The number of unique baby names has continued to rise dramatically for the last ~130 years — with the exception of the past

Posted in analysis, data visualization

About this blog

The data visualizations on this blog are the result of my “data tinkering” hobby, where I tackle a new data analysis problem every week. If I find something interesting, I report my findings here to share with the world.

If you would like to use one of my graphs on your website or in a publication, please email me.

