Spurious Extrapolations: Novel and unique research abstracts

Last Christmas, BMJ published a funny article exploring the mentions of positive and negative words in research abstracts over the past 40 years. I’ve recreated their research for two of the phrases below — “novel” and “unique.” Your eyes aren’t

Posted in analysis, data visualization

Spurious Extrapolations: What if U.S. college tuition costs keep rising?

For this post, I’m going to test run a new post series called Spurious Extrapolations, where I extrapolate time series far beyond reason and envision what would happen if the trend continued. Let me know what you think of the

Posted in analysis, data visualization

The correct way to use pie charts

Pie charts are the most widely berated chart type in data visualization. Many articles have been written over the years describing why pie charts are bad and why we should no longer use them. Even key members of the data visualization

Posted in data visualization, tutorial

Why posts get removed from /r/DataIsBeautiful

I’ve been a moderator of /r/DataIsBeautiful — one of the largest online communities dedicated to data analysis and visualization — for the past 2 1/2 years. During that time, I’ve reviewed thousands of data visualizations created by amateurs and professionals

Posted in data visualization, reddit

What data visualization tools do /r/DataIsBeautiful OC creators use?

One of the most common questions that newcomers to data [science/visualization/analysis] ask is: “What tools should I use to create data visualizations?” While I always recommend learning design principles before tools, I thought I’d take a stab at answering that

Posted in data visualization, reddit

Major League Baseball home run leaders, 1871-2016

Earlier this week, a Reddit user shared a fascinating animated data visualization showing the MLB home run leaders from the past 145 years. I found this visualization especially interesting because it was one of the few examples where I’ve seen

Posted in data visualization, python, tutorial

Revisiting the vaccine visualizations

Last year, the vaccination debate was all the rage again. “Pro-vaxxers” were loudly proclaiming that everyone should get vaccinated and discussing the science behind it, and “anti-vaxxers” were casting their doubts and still refusing to get vaccinated for personal reasons.

Posted in data visualization, python, tutorial

Analyzing MMA: The Ultimate Fighting Championship

For the past 7 years, I’ve been a fan of MMA, and especially the larger Ultimate Fighting Championship events that take place around the world. For the uninitiated, MMA fights pit two professional fighters against each other who often have

Posted in data visualization

Introducing TPOT, the Data Science Assistant

An example machine learning pipeline

Some of you might have been wondering what the heck I’ve been up to for the past few months. I haven’t been posting much on my blog lately, and I haven’t been working on important problems like solving Where’s Waldo?

Posted in machine learning, python, research

Visualizing Indego bike share usage patterns in Philadelphia (Part 2)

A couple months ago, I made an initial foray into understanding the usage patterns of Indego, Philadelphia’s new bike share system. This month, I thought it’d be a fun exercise to revisit that data set to see if I could

Posted in data visualization

About this blog

This blog is my labor of love, and I've spent hundreds of hours working on the projects that you'll read about here. Generally, I write about data visualization and machine learning, and sometimes explore out-of-the-box projects at the intersection of the two. I hope you enjoy my projects as much as I have.

If you would like to use one of my graphs on your website or in a publication, please feel free to do so with appropriate attribution, but I would appreciate it if you email me first to let me know.
