Blog Archives

Republican-leaning states tend to have more traffic deaths

Back in 2014, the U.S. Department of Transportation released a report on the (normalized) number of traffic deaths in each U.S. state. As I looked through the list, I noticed an odd correlation between the political leanings of a state

Posted in data visualization

Spurious Extrapolations: What if U.S. college tuition costs keep rising?

For this post, I’m going to test run a new post series called Spurious Extrapolations, where I extrapolate time series far beyond reason and envision what would happen if the trend continued. Let me know what you think of the

Posted in analysis, data visualization

A short guide to using statistics in Evolutionary Computation

Plot showing average fitness with a 95% confidence interval

A couple of weeks ago, I attended the Genetic and Evolutionary Computation Conference (GECCO) for the first time. While I was perusing the workshops and tutorials available in the first couple of days of the conference, I noticed something peculiar: There

Posted in research, statistics, tutorial

Filling in Python’s gaps in statistics packages with Rmagic

Have you ever found yourself searching for a statistics package in Python, but it just isn’t available? This is the biggest reason I’ve heard when my colleagues say they’re unwilling to make the switch from R to Python for statistical

Posted in ipython, productivity, python, statistics, tutorial

Statistical analysis made easy in Python with SciPy and pandas DataFrames

I finally got around to finishing up this tutorial on how to use pandas DataFrames and SciPy together to handle any and all of your statistical needs in Python. This is basically an amalgamation of my two previous blog posts

Posted in ipython, productivity, python, statistics, tutorial

Using pandas DataFrames to process data from multiple replicate runs in Python

Per a recommendation in my previous blog post, I decided to follow up and write a short how-to on using pandas to process data from multiple replicate runs in Python. If you do research like mine, you’ll often

Posted in python, statistics, tutorial

A short demo on how to use IPython Notebook as a research notebook

As promised, here’s the IPython Notebook tutorial I mentioned in my introduction to IPython Notebook. Downloading and installing IPython Notebook You can download IPython Notebook with the majority of the other packages you’ll need in the Anaconda Python distribution. From

Posted in ipython, productivity, statistics, tutorial

About this blog

This blog is my labor of love, and I've spent hundreds of hours working on the projects that you'll read about here. Generally, I write about data visualization and machine learning, and sometimes explore out-of-the-box projects at the intersection of the two. I hope you enjoy my projects as much as I have.

If you would like to use one of my graphs on your website or in a publication, please feel free to do so with appropriate attribution, but I would appreciate it if you email me first to let me know.
