Fun with the Python Reddit API Wrapper and word clouds

I got bored today and threw together some Python code to scrape word frequencies from Reddit and make word clouds. Everyone on Reddit seemed to love the word clouds, so I put the scripts up on GitHub so everyone could start making their own. All that’s really left to do is connect these scripts to a word cloud generating library so we don’t even have to copy and paste text into Wordle any more. If you’re up to the task, please email me and fork away on GitHub.
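
The word-counting step can be sketched in a few lines of Python. This is only a minimal illustration, not the scripts on GitHub: it assumes the comment text has already been fetched (for example with the Python Reddit API Wrapper), and the function name `word_frequencies`, the tiny stop-word list, and the sample comments are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "and", "for", "that", "this", "with", "are", "was", "but", "not"}


def word_frequencies(texts, stop_words=STOP_WORDS, min_length=3):
    """Count how often each word appears across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", text.lower())
        # Skip very short words and anything in the stop-word list.
        counts.update(
            w for w in words if len(w) >= min_length and w not in stop_words
        )
    return counts


if __name__ == "__main__":
    # Placeholder comments standing in for text scraped from a subreddit.
    sample_comments = [
        "Evolution is a theory",
        "The theory of evolution",
    ]
    # Print tab-separated word/count pairs, most frequent first.
    for word, count in word_frequencies(sample_comments).most_common(10):
        print(f"{word}\t{count}")
```

The resulting word/count pairs are the raw material for a word cloud, whether they get pasted into a tool by hand or passed to a word cloud library.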

Making word clouds for subreddits is a surprisingly effective way to get the gist of what a subreddit is really talking about. Take /r/evolution, for example. They’re serious business about evolution.


Others were more amusing. /r/trees, for example, seems to be preoccupied with cursing about things.


/r/aww, on the other hand, can be concisely described by “upvote cats, fuck humans.”


Even the /r/space nerds seem to get riled up when discussing NASA, terraforming, and meteorites.


Come join in on the fun and make some word clouds for your favorite subreddit.

Dr. Randy Olson is a Senior Data Scientist at the University of Pennsylvania, where he develops state-of-the-art machine learning algorithms with a focus on biomedical applications.

Posted in analysis, python, reddit



