Introducing TPOT, the Data Science Assistant

An example machine learning pipeline

Some of you might have been wondering what the heck I’ve been up to for the past few months. I haven’t been posting much on my blog lately, and I haven’t been working on important problems like solving Where’s Waldo?

Posted in machine learning, python, research

Visualizing Indego bike share usage patterns in Philadelphia (Part 2)

A couple months ago, I made an initial foray into understanding the usage patterns of Indego, Philadelphia’s new bike share system. This month, I thought it’d be a fun exercise to revisit that data set to see if I could

Posted in data visualization

Small multiples vs. animated GIFs for showing changes in fertility rates over time

A couple weeks ago, Stephen Holzman shared an animated GIF on /r/DataIsBeautiful that caught my eye. The GIF showed the evolution of fertility rates of the U.S. and Japan between 1947 and 2010, which starts right in the middle of

Posted in data visualization

U.S. college majors: Median yearly earnings vs. gender ratio

Last year, I looked at the gender ratios across college majors and discovered an interesting-yet-spurious correlation: College majors with higher male:female ratios (i.e., with more men than women) tend to have students with higher estimated IQs. After much debate, the

Posted in analysis, data visualization

Analyzing the health of Philadelphia’s bike share system

Last month, I wrote about my initial attempts to model and predict the usage patterns of Indego, Philadelphia’s new bike share system. To recap: If you’ve ever used a bike share before, you know that one of the biggest fears

Posted in analysis, data visualization

The New York Times weather chart redux

One of my favorite pastimes is recreating and updating old New York Times graphics. It’s great practice decomposing graphs into reproducible elements, and I always learn a ton about good graphic design in the process. If you’re still learning data

Posted in data visualization, python

Use the Baby Name Explorer to find out when your name was popular

Around the same time I was working on the Name Age Calculator, I developed a simple tool to visualize trends in American baby names. Ever inventive, I named this web app the U.S. Baby Name Explorer. The idea behind the

Posted in data visualization

Can the Name Age Calculator guess how old you are?

Can you guess someone’s age when all you know is their first name? That was the crazy idea behind one of FiveThirtyEight’s articles last year, and their surprising answer is, “Yes.” The idea behind guessing someone’s age based on their

Posted in analysis, data visualization

Visualizing Indego bike share usage patterns in Philadelphia

One of the many things that I love about my new home town of Philadelphia is that the government openly shares curated data sets covering most of the governmental functions. Since I recently joined Philadelphia’s Indego bike share program, I

Posted in analysis, data visualization

Rethinking the population pyramid

If you’ve ever browsed the U.S. Census population statistics pages, you’ve no doubt come across the famous population pyramid that they so frequently use to display the distribution of the U.S. population by age and gender. I was reading up

Posted in data visualization

About this blog

The data visualizations on this blog are the result of my "data tinkering" hobby, where I tackle a new data analysis problem every weekend. If I find something interesting, I report my findings here to share with the world.

If you would like to use one of my graphs on your website or in a publication, please feel free to do so with attribution, but I would appreciate it if you email me first to let me know.
