Fun with the Python Reddit API Wrapper and word clouds

I got bored today and threw together some Python code to scrape word frequencies from Reddit and make word clouds. Everyone on Reddit seemed to love the results, so I put the scripts up on GitHub so anyone can start making their own word clouds. All that’s really left to do is connect these scripts to a word cloud generating library so we don’t even have to copy & paste text into Wordle any more. If you’re up to the task, please email me and fork away on GitHub.
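For a rough idea of what the word-counting step involves, here is a minimal sketch. It is not the code from the GitHub repository: the function name `word_frequencies`, the tiny `STOP_WORDS` set, and the `min_length` cutoff are all assumptions made for illustration, and the sample strings stand in for text that would really come from Reddit.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real run would want a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "or", "of", "to", "in",
              "is", "it", "for", "on", "that", "this"}


def word_frequencies(texts, stop_words=STOP_WORDS, min_length=3):
    """Count how often each word appears across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Lowercase, then pull out runs of letters (straight apostrophes
        # are allowed inside a word, so "dog's" stays in one piece).
        words = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
        counts.update(
            w for w in words
            if len(w) >= min_length and w not in stop_words
        )
    return counts


if __name__ == "__main__":
    # Stand-in strings; in practice these would be titles or comments
    # fetched from a subreddit.
    sample = [
        "Evolution is a theory and a fact",
        "Natural selection drives evolution",
    ]
    for word, count in word_frequencies(sample).most_common(5):
        print(f"{word}: {count}")
```

The resulting counts could then be pasted into a word cloud tool or handed to a word cloud library, which is the missing piece mentioned above.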

Making word clouds for subreddits is a surprisingly effective way to get the gist of what a subreddit is really talking about. Take /r/evolution, for example. They’re serious business about evolution.


Others were more amusing. /r/trees, for example, seems to be preoccupied with cursing about things.


/r/aww, meanwhile, can be concisely described by “upvote cats, fuck humans.”


Even the /r/space nerds seem to get riled up when discussing NASA, terraforming, and meteorites.


Come join in on the fun and make some word clouds for your favorite subreddit:

Dr. Randy Olson is a postdoctoral researcher at the University of Pennsylvania. As a member of Prof. Jason H. Moore's research lab, he studies biologically-inspired AI and its applications to biomedical problems.

Posted in analysis, python, reddit



