30 Nov 2011, 1:00am
Programming:
by

2 comments

Querying Reddit with Python

I’ve long been a fan of reddit: which is a social news site where users can submit news, they can also comment and vote on submissions of other users.  Reddit provides a form of content filtration though subreddits which are specialized by topic e.g. the Python programming language.

I thought it would be fun to figure out how to get the most recent items for a particular subreddit and the previous items for an item in a subreddit. Both these things turned out to be really simple using existing Python packages to query reddit and process the JSON formatted response.

"""Return list of items from a sub-reddit of reddit.com."""

from urllib2 import urlopen, HTTPError
from json import JSONDecoder

def getitems( subreddit, previd=''):
    """Return list of items from a subreddit."""
    url = 'http://www.reddit.com/r/%s.json' % subreddit
    # Get items after item with 'id' of previd.
    if previd != '':
        url = '%s?after=t3_%s' % (url, previd)
    try:
        json = urlopen( url ).read()
        data = JSONDecoder().decode( json )
        items = [ x['data'] for x in data['data']['children'] ]
    except HTTPError as ERROR:
        print '\tHTTP ERROR: Code %s for %s.' % (ERROR.code, url)
        items = []
    return items

if __name__ == "__main__":

    print 'Recent items for Python.'
    ITEMS = getitems( 'python' )
    for ITEM in ITEMS:
        print '\t%s - %s' % (ITEM['title'], ITEM['url'])

    print 'Previous items for Python.'
    OLDITEMS = getitems( 'python', ITEMS[-1]['id'] )
    for ITEM in OLDITEMS:
        print '\t%s - %s' % (ITEM['title'], ITEM['url'])

In my next post I’ll detail what I used this script for..

You can also try the [reddit_api](https://github.com/mellort/reddit_api).

Thanks for the link, I figured there was already a package for this. It was entertaining to write the tiny piece I needed myself as a learning exercise.

 

Leave a Reply

Your email address will not be published. Required fields are marked *

*

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>