Author: Randy Olson
Dr. Randy Olson is a Senior Data Scientist at the University of Pennsylvania, where he develops state-of-the-art machine learning algorithms with a focus on biomedical applications.

TPOT: A Python tool for automating data science

Machine learning is often touted as: “A field of study that gives computers the ability to learn without being explicitly programmed.” Despite this common claim, anyone who has worked in the field knows that designing effective machine learning systems is

Why did so many Japanese families avoid having children in 1966?

Last week, I was presenting at a conference and discussing the merits of animated visualizations vs. small multiples. On one of my slides, I presented the following chart that shows the total fertility rate (i.e., the average number of children

Spurious Extrapolations: Novel and unique research abstracts

Last Christmas, BMJ published a funny article exploring the mentions of positive and negative words in research abstracts over the past 40 years. I’ve recreated their research for two of the phrases below — “novel” and “unique.” Your eyes aren’t

Spurious Extrapolations: What if U.S. college tuition costs keep rising?

For this post, I’m going to test run a new post series called Spurious Extrapolations, where I extrapolate time series far beyond reason and envision what would happen if the trend continued. Let me know what you think of the

The correct way to use pie charts

Pie charts are the most widely berated chart in data visualization. Many articles have been written over the years describing why pie charts are bad, and why we should no longer use them. Even key members of the data visualization

Why posts get removed from /r/DataIsBeautiful

I’ve been a moderator of /r/DataIsBeautiful — one of the largest online communities dedicated to data analysis and visualization — for the past 2 1/2 years. During that time, I’ve reviewed thousands of data visualizations created by amateurs and professionals

What data visualization tools do /r/DataIsBeautiful OC creators use?

One of the most common questions that newcomers to data [science/visualization/analysis] ask is: “What tools should I use to create data visualizations?” While I always recommend learning design principles before tools, I thought I’d take a stab at answering that

Major League Baseball home run leaders, 1871-2016

Earlier this week, a Reddit user shared a fascinating animated data visualization showing the MLB home run leaders from the past 100+ years. I found this visualization especially interesting because it was one of the few examples where I’ve seen

Revisiting the vaccine visualizations

Last year, the vaccination debate was all the rage again. “Pro-vaxxers” were loudly proclaiming that everyone should get vaccinated and discussing the science behind it, and “anti-vaxxers” were casting their doubts and still refusing to get vaccinated for personal reasons.

Analyzing MMA: The Ultimate Fighting Championship

For the past 7 years, I’ve been a fan of MMA, and especially the larger Ultimate Fighting Championship events that take place around the world. For the uninitiated, MMA fights pit two professional fighters against each other who often have

