Querying Reddit with Python
I’ve long been a fan of reddit, a social news site where users can submit news and comment and vote on other users' submissions. Reddit provides a form of content filtering through subreddits, which are specialized by topic, e.g. the Python programming language.
I thought it would be fun to figure out how to get the most recent items for a particular subreddit, and the items listed after a given item in that subreddit. Both turned out to be really simple using Python's standard urllib2 and json modules to query reddit and process the JSON-formatted response.
"""Return list of items from a sub-reddit of reddit.com.""" from urllib2 import urlopen, HTTPError from json import JSONDecoder def getitems( subreddit, previd=''): """Return list of items from a subreddit.""" url = 'http://www.reddit.com/r/%s.json' % subreddit # Get items after item with 'id' of previd. if previd != '': url = '%s?after=t3_%s' % (url, previd) try: json = urlopen( url ).read() data = JSONDecoder().decode( json ) items = [ x['data'] for x in data['data']['children'] ] except HTTPError as ERROR: print '\tHTTP ERROR: Code %s for %s.' % (ERROR.code, url) items =  return items if __name__ == "__main__": print 'Recent items for Python.' ITEMS = getitems( 'python' ) for ITEM in ITEMS: print '\t%s - %s' % (ITEM['title'], ITEM['url']) print 'Previous items for Python.' OLDITEMS = getitems( 'python', ITEMS[-1]['id'] ) for ITEM in OLDITEMS: print '\t%s - %s' % (ITEM['title'], ITEM['url'])
In my next post I’ll detail what I used this script for.