About this Dataroom
As a longtime redditor, I've often wondered about the formula for front page success. Nothing I've ever submitted has even come close!
I decided to write a crawler that would use the Reddit API to gather information about the text of the front page stories, and store it in a local database for processing. After a couple of weeks the data started to level out.
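For readers curious about what such a crawler involves, here is a minimal sketch. It is not the attached crawler. It assumes the public JSON listing at `https://www.reddit.com/.json` and a SQLite file as the local database, and the function names, table layout, and User-Agent string are all made up for illustration.

```python
import json
import sqlite3
import urllib.request

# Assumption: the public JSON listing for the front page.
# The attached crawler may use a different endpoint or an authenticated client.
FRONT_PAGE_URL = "https://www.reddit.com/.json"


def fetch_front_page(url=FRONT_PAGE_URL):
    """Download the front page listing and return it as a dict."""
    # Reddit asks API clients to send a descriptive User-Agent (this one is hypothetical).
    request = urllib.request.Request(url, headers={"User-Agent": "title-stats-sketch/0.1"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def extract_stories(listing):
    """Pull (id, title) pairs out of a listing payload."""
    return [(child["data"]["id"], child["data"]["title"])
            for child in listing["data"]["children"]]


def store_stories(conn, stories):
    """Insert stories, skipping ids already saved by an earlier crawl."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS stories (id TEXT PRIMARY KEY, title TEXT NOT NULL)"
    )
    conn.executemany("INSERT OR IGNORE INTO stories (id, title) VALUES (?, ?)", stories)
    conn.commit()


def crawl_once(db_path="frontpage.db"):
    """One crawl pass: fetch, parse, store. Run it on a schedule to build up data."""
    conn = sqlite3.connect(db_path)
    try:
        store_stories(conn, extract_stories(fetch_front_page()))
    finally:
        conn.close()
```

Calling `crawl_once()` every so often (from cron, for example) would accumulate titles over time. Because the story id is the primary key, a story that stays on the front page across several passes is only stored once.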
This dataset contains a bunch of fun stats I extracted from the titles, including:
- number of characters
- number of words
- readability indices (for example, grade-school reading level!)
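To give a rough idea of how stats like these can be computed, here is a small sketch. Which readability indices the dataset actually uses is not stated here, so the Automated Readability Index (ARI) below is an assumption picked because it needs only character, word, and sentence counts. The function name `title_stats` is hypothetical.

```python
import re


def title_stats(title):
    """Return character count, word count, and an ARI grade estimate for one title."""
    # Words are whitespace-separated tokens, the simplest common definition.
    words = title.split()
    # ARI counts letters and digits only, not spaces or punctuation.
    letters = sum(ch.isalnum() for ch in title)
    # Most titles are a single sentence; split on terminators and never go below 1.
    pieces = [s for s in re.split(r"[.!?]+", title) if s.strip()]
    sentences = max(1, len(pieces))

    if words:
        # Automated Readability Index: approximates a US school grade level.
        ari = 4.71 * letters / len(words) + 0.5 * len(words) / sentences - 21.43
    else:
        ari = None  # An empty title has no meaningful score.

    return {"characters": len(title), "words": len(words), "ari": ari}
```

Titles are very short, so any readability score computed on a single title is noisy. Averages over many titles are probably more meaningful than individual values.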
Click on Data Preview to see the results!
I've also included the source code to the crawler. I was surprised how easy it was to create. Check the attachments tab.
I tried submitting it to Reddit, but nobody even voted it up or down. Sometimes I wonder if my account is flagged as spam and anything I submit just goes invisible.