Introducing Shoot Stockholm!

I’d like to introduce my latest personal project, Shoot Stockholm. This is a photo blog of my life here in Stockholm, Sweden, and is a photography project I’ve wanted to do for a while now.

The idea for this site was inspired by my friends David duChemin and Dave Delnea’s mantra of ‘create things & share it’. Dave Powell’s photo blog has also been an inspiration to share my photographs more publicly.

Now that I’m settled in Stockholm, I’m hoping to start posting to Endlessly Curious again, and I’ll be posting my photographs to Shoot Stockholm on a regular basis.

Five iPhone apps to consider

The five iPhone applications I use every day:
  1. Byline
    An excellent Google Reader client, perfect for reading the news during the commute to work.
  2. Instacast
    For sync-free podcast listening it’s hard to beat Instacast. I wish I’d found it sooner!
  3. Camera+
    Apple should just buy this app to replace the built-in camera application; it’s that good.
  4. Tweetbot
    So much better than the official Twitter client it isn’t funny!
  5. Analytics Pro
    The best Google Analytics client I’ve found so far, very handy for checking up on your sites on the go!
17 Feb 2012, 8:00am
Miscellaneous:

Goodbye Vancouver. Hej Stockholm!

At the start of February I moved from Vancouver, Canada to Stockholm, Sweden!

Last year was a traumatic year for me, with the death of my beloved wife Barbara from brain cancer in July. As five of the six years we were married were spent in Vancouver, there are many happy memories there, but there are unpleasant memories too.

I felt it was time for a new adventure, and the chance to work with the awesome folk at DICE on the Frostbite engine was an opportunity I felt I’d regret not taking. Plus I’d always wanted to learn another language, as something about thinking in multiple languages has always intrigued me.

Leaving behind the friends I’ve made and the colleagues I’ve worked with during the five years I spent in Vancouver has been hard. Yet exploring Stockholm and meeting new people has been interesting so far!

Top posts of 2011

The top ten posts for 2011 according to Google Analytics were:

  1. Installing Python, MatPlotLib & iPython on Snow Leopard.
  2. Finding duplicate files using Python.
  3. Getting started with Python.
  4. Praise for Python.
  5. Basic Graphing with MatPlotLib.
  6. Graphing real data with MatPlotLib.
  7. Extracting image EXIF data with Python.
  8. Python 2.7.1 Goodness.
  9. Running WordPress on Mac OS X with XAMPP.
  10. John Cleese on creativity.

Eight of the top ten are Python related; the next ten, which round out the top twenty, are more diverse:

  1. Querying Reddit with Python.
  2. Barbara.
  3. Processing Perforce command output with Python.
  4. Downloading wallpaper images from Reddit using Python.
  5. Why scrum fails.
  6. Hacking work manifesto.
  7. The ascendancy of JSON.
  8. Using Perforce counters to control syncing.
  9. Why work doesn’t happen at work.
  10. Small steps to big goals.

On a personal note, I hope 2012 will bring more posts and less personal tragedy.

2 Jan 2012, 1:00am
Programming:

5 comments

Organising photographs with Python

Previously I posted about extracting EXIF information from images using the Python Imaging Library (PIL). The reason I was investigating how to do this was that I wanted to programmatically reorganise my personal photograph collection from its current ad-hoc mess into something more structured.

My goal was to use Python to extract the EXIF information from each image file and use the creation time of each image as the key to organise each image into the directory structure Year/Month/Day. If an image file is missing EXIF data, the file’s modification time can be used instead via the -filetime option.
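The date-to-directory step described above could be sketched roughly as follows. This is a minimal illustration written for Python 3, not the script itself: `dated_directory` is a hypothetical helper, and it assumes the EXIF tags have already been read into a dictionary whose timestamps use the `YYYY:MM:DD HH:MM:SS` form.

```python
from datetime import datetime
from os.path import getmtime, join


def dated_directory(dest_root, exif, file_path=None):
    """Return dest_root/YYYY/MM/DD for an image, or None if no date is known.

    `exif` is assumed to be a dict of already-extracted EXIF tags.
    `file_path` is optional: when given, the file's modification time is
    used as a fallback if the EXIF data carries no timestamp.
    """
    # DateTimeOriginal takes precedence over DateTime.
    stamp = exif.get('DateTimeOriginal') or exif.get('DateTime')
    if stamp:
        taken = datetime.strptime(stamp, '%Y:%m:%d %H:%M:%S')
    elif file_path is not None:
        taken = datetime.fromtimestamp(getmtime(file_path))
    else:
        return None
    return join(dest_root, taken.strftime('%Y'), taken.strftime('%m'),
                taken.strftime('%d'))
```

Calling `dated_directory('Organised', {'DateTimeOriginal': '2011:12:31 01:00:00'})` yields the path `Organised/2011/12/31` on a POSIX system. Returning `None` when no date is available lets the caller decide whether to skip the file.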

An example of running this script to reorganise the photos folder and leave the original files in place would be:

python PhotoShuffle.py -copy /Daniel/Pictures /Daniel/OrganisedPictures

You can also find the latest version on GitHub at github.com/dpbrown/PhotoShuffle. The following is the current script:

"""Scans a folder and builds a date sorted tree based on image creation time."""

if __name__ == '__main__':
    from os import makedirs, listdir, rmdir
    from os.path import join as joinpath, exists, getmtime
    from datetime import datetime
    from shutil import move, copy2 as copy
    from ExifScan import scan_exif_data
    from argparse import ArgumentParser

    PARSER = ArgumentParser(description='Builds a date sorted tree of images.')
    PARSER.add_argument( 'orig', metavar='O', help='Source root directory.')
    PARSER.add_argument( 'dest', metavar='D',
                         help='Destination root directory' )
    PARSER.add_argument( '-filetime', action='store_true',
                         help='Use file time if missing EXIF' )
    PARSER.add_argument( '-copy', action='store_true',
                         help='Copy files instead of moving.' )
    ARGS = PARSER.parse_args()

    print 'Gathering & processing EXIF data.'

    # Get creation time from EXIF data.
    DATA = scan_exif_data( ARGS.orig )

    # Process EXIF data.
    for r in DATA:
        info = r['exif']
        # Precedence is DateTimeOriginal > DateTime.
        if 'DateTimeOriginal' in info:
            r['ftime'] = info['DateTimeOriginal']
        elif 'DateTime' in info:
            r['ftime'] = info['DateTime']
        if 'ftime' in r:
            r['ftime'] = datetime.strptime(r['ftime'],'%Y:%m:%d %H:%M:%S')
        elif ARGS.filetime:
            # No EXIF timestamp: fall back to the file's modification time.
            mtime = getmtime( joinpath( r['path'], r['name'] + r['ext'] ))
            r['ftime'] = datetime.fromtimestamp( mtime )

    # Remove any files without datetime info.
    DATA = [ f for f in DATA if 'ftime' in f.keys() ]

    # Generate new path YYYY/MM/DD/ using EXIF date.
    for r in DATA:
        r['newpath'] = joinpath( ARGS.dest, r['ftime'].strftime('%Y/%m/%d') )

    # Generate filenames per directory: 1 to n (zero padded) followed by DDMMMYYYY.
    print 'Generating filenames.'
    for newdir in set( [ i['newpath'] for i in DATA ] ):
        files = [ r for r in DATA if r['newpath'] == newdir ]
        pad = len( str( len(files) ) )
        usednames = []
        for i in range( len(files) ):
            datestr = files[i]['ftime'].strftime('%d%b%Y')
            newname = '%0*d_%s' % (pad, i+1, datestr)
            j = i+1
            # if filename exists keep looking until it doesn't. Ugly!
            while ( exists( joinpath( newdir, newname + files[i]['ext'] ) ) or
                newname in usednames ):
                j += 1
                jpad = max( pad, len( str( j ) ) )
                newname = '%0*d_%s' % (jpad, j, datestr)
            usednames.append( newname )
            files[i]['newname'] = newname

    # Copy or move the files to their new locations, creating directories as required.
    print 'Copying files.'
    for r in DATA:
        origfile = joinpath( r['path'], r['name'] + r['ext'] )
        newfile = joinpath( r['newpath'], r['newname'] + r['ext'] )
        if not exists( r['newpath'] ):
            makedirs( r['newpath'] )
        if not exists( newfile ):
            if ARGS.copy:
                print 'Copying '+ origfile +' to '+ newfile
                copy( origfile, newfile )
            else:
                print 'Moving '+ origfile +' to '+ newfile
                move( origfile, newfile )
        else:
            print newfile +' already exists!'

    # Only a move can leave empty source directories behind, so clean up then.
    if not ARGS.copy:
        print 'Removing empty directories'
        DIRS = set( [ d['path'] for d in DATA ] )
        for d in DIRS:
            # if the directory is empty then delete it.
            if len( listdir( d ) ) == 0:
                print 'Deleting dir ' + d
                rmdir( d )
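
The naming scheme used in the script (a zero-padded counter followed by the date) could be isolated along these lines. This is only a sketch for Python 3 under stated assumptions: `numbered_names` is a hypothetical helper, and it takes a collection of names already in use instead of checking the disk as the script does.

```python
from datetime import datetime


def numbered_names(dates, taken=()):
    """Return one unique 'NN_DDMonYYYY' name per date, in order.

    `dates` is a list of datetime objects for files sharing one directory.
    `taken` holds names already in use (a stand-in for the on-disk check).
    """
    used = set(taken)
    pad = len(str(len(dates)))  # digits needed for the largest counter
    names = []
    for index, when in enumerate(dates, start=1):
        datestr = when.strftime('%d%b%Y')
        counter = index
        name = '%0*d_%s' % (pad, counter, datestr)
        # Keep counting upwards until the name is free.
        while name in used:
            counter += 1
            name = '%0*d_%s' % (max(pad, len(str(counter))), counter, datestr)
        used.add(name)
        names.append(name)
    return names
```

Keeping the used names in a set makes the collision check cheap, and passing them in as an argument keeps the function free of file system access, which makes it easier to test.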

UPDATE: I tend to run my duplicate file script over image collections before I organise them, to remove any duplicates. You can find that script on GitHub at github.com/dpbrown/Duplicate-Files.

31 Dec 2011, 1:00am
Programming:

1 comment

Downloading Wallpaper Images from Reddit with Python

In my previous post I demonstrated how to query Reddit using Python and JSON. My goal was a script to download the latest and greatest wallpapers from image sub-reddits like wallpaper, to keep my desktop wallpaper fresh and interesting. The main function of the script is to download any JPEG image that is listed in the specified sub-reddit to a folder.

A lot of the script turned out to be managing URLs, handling exceptions and checking image types so that links to the most commonly encountered image repository, imgur, worked. I opted to use the reddit hash id of each post as the filename for the downloaded JPEG, as this seems to be a unique value, which means there are no collisions and it’s easy to programmatically check whether that item’s image has already been downloaded. Using a hash value instead of the item’s text title doesn’t make for the most memorable filenames, though.
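
The URL handling and id-based naming described above might be sketched like this. It is a simplified illustration for Python 3, not the script’s actual code: `normalise_imgur_url` and `target_filename` are hypothetical names, and the rules simply mirror the behaviour described in this post (request the .jpg form of imgur links, name files after the post id).

```python
from os.path import join


def normalise_imgur_url(url):
    """Return a .jpg link for an imgur URL; leave other URLs untouched."""
    if 'imgur.com' not in url:
        return url
    if url.endswith('.png'):
        # Mirror the script: ask for the .jpg variant of a .png link.
        return url[:-len('.png')] + '.jpg'
    if not url.endswith(('.jpg', '.jpeg')):
        # Links without a JPEG extension get one appended.
        return url + '.jpg'
    return url


def target_filename(directory, post_id):
    """Name the download after the post id so repeat runs can detect it."""
    return join(directory, '%s.jpg' % post_id)
```

Because the filename depends only on the post id, checking whether an image was already fetched reduces to a single `os.path.exists` call on the result of `target_filename`.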

The single most frustrating thing I encountered when writing this script is that I have yet to discover a programmatic way to work out the URL for an image on Flickr given a Flickr page URL. This is a real shame, as Flickr is a really popular image hosting site with a lot of great images.

An example of running the script to download images with a score of at least 50 from the wallpaper sub-reddit into a folder called wallpaper would be as follows:

python redditdownload.py wallpaper wallpaper -s 50

And to run the same query but only get any new images you don’t already have, run the following:

python redditdownload.py wallpaper wallpaper -s 50 -update
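
The effect of the score threshold and the update flag can be illustrated without any network access. The following is a sketch for Python 3 under assumptions: `select_items` is a hypothetical helper, each post is assumed to be a dictionary with `id` and `score` keys, and `already_have` stands in for the check against files on disk.

```python
def select_items(items, min_score=0, already_have=(), update=False):
    """Return the ids worth downloading from a list of post dictionaries.

    Each item is assumed to be a dict with 'id' and 'score' keys.
    `already_have` holds the ids downloaded on earlier runs.
    """
    have = set(already_have)
    wanted = []
    for item in items:
        if item['score'] < min_score:
            continue  # below the score threshold
        if item['id'] in have:
            if update:
                break  # update mode stops at the first known item
            continue
        wanted.append(item['id'])
    return wanted
```

Stopping at the first known item relies on the listing being ordered newest first, so everything after that point is assumed to have been handled by an earlier run.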

You can find the source code for this post (and the previous) on GitHub at github.com/dpbrown/RedditImageGrab and the current source for the script is as follows:

"""Download images from a reddit.com subreddit."""

from urllib2 import urlopen, HTTPError, URLError
from httplib import InvalidURL
from argparse import ArgumentParser
from os.path import exists as pathexists, join as pathjoin
from os import mkdir
from reddit import getitems

if __name__ == "__main__":
    PARSER = ArgumentParser( description='Downloads files with specified extension from the specified subreddit.')
    PARSER.add_argument( 'reddit', metavar='r', help='Subreddit name.')
    PARSER.add_argument( 'dir', metavar='d', help='Dir to put downloaded files in.')
    PARSER.add_argument( '-last', metavar='l', default='', required=False, help='ID of the last downloaded file.')
    PARSER.add_argument( '-score', metavar='s', default=0, type=int, required=False, help='Minimum score of images to download.')
    PARSER.add_argument( '-num', metavar='n', default=0, type=int, required=False, help='Number of images to process.')
    PARSER.add_argument( '-update', default=False, action='store_true', required=False, help='Run until you encounter a file already downloaded.')
    ARGS = PARSER.parse_args()

    print 'Downloading images from "%s" subreddit' % (ARGS.reddit)

    ITEMS = getitems( ARGS.reddit, ARGS.last )
    # Counters: N processed, D downloaded, E already existing, S skipped, F failed.
    N = D = E = S = F = 0
    FINISHED = False

    # Create the specified directory if it doesn't already exist.
    if not pathexists( ARGS.dir ):
        mkdir( ARGS.dir )

    while len(ITEMS) > 0 and not FINISHED:
        LAST = ''
        for ITEM in ITEMS:
            if ITEM['score'] < ARGS.score:
                print '\tSCORE: %s has score of %s which is lower than required score of %s.' % (ITEM['id'],ITEM['score'],ARGS.score)
                S += 1
            else:
                FILENAME = pathjoin( ARGS.dir, '%s.jpg' % (ITEM['id'] ) )
                # Don't download files multiple times!
                if not pathexists( FILENAME ):
                    try:
                        if 'imgur.com' in ITEM['url']:
                            # Change .png to .jpg for imgur urls.
                            if ITEM['url'].endswith('.png'):
                                ITEM['url'] = ITEM['url'].replace('.png','.jpg')
                            # Add .jpg to imgur urls that have no JPEG extension.
                            elif '.jpg' not in ITEM['url'] and '.jpeg' not in ITEM['url']:
                                ITEM['url'] = '%s.jpg' % ITEM['url']

                        RESPONSE = urlopen( ITEM['url'] )
                        INFO = RESPONSE.info()

                        # Work out file type either from the response or the url.
                        if 'content-type' in INFO.keys():
                            FILETYPE = INFO['content-type']
                        elif ITEM['url'].endswith( 'jpg' ):
                            FILETYPE = 'image/jpeg'
                        elif ITEM['url'].endswith( 'jpeg' ):
                            FILETYPE = 'image/jpeg'
                        else:
                            FILETYPE = 'unknown'

                        # Only try to download jpeg images.
                        if FILETYPE == 'image/jpeg':
                            FILEDATA = RESPONSE.read()
                            FILE = open( FILENAME, 'wb')
                            FILE.write(FILEDATA)
                            FILE.close()
                            print '\tDownloaded %s to %s.' % (ITEM['url'],FILENAME)
                            D += 1
                        else:
                            print '\tWRONG FILE TYPE: %s has type: %s!' % (ITEM['url'],FILETYPE)
                            S += 1
                    except HTTPError as ERROR:
                        print '\tHTTP ERROR: Code %s for %s.' % (ERROR.code,ITEM['url'])
                        F += 1
                    except URLError as ERROR:
                        print '\tURL ERROR: %s!' % ITEM['url']
                        F += 1
                    except InvalidURL as ERROR:
                        print '\tInvalid URL: %s!' % ITEM['url']
                        F += 1
                else:
                    print '\tALREADY EXISTS: %s for %s already exists.' % (FILENAME,ITEM['url'])
                    E += 1
                    if ARGS.update:
                        print '\tUpdate complete, exiting.'
                        FINISHED = True
                        break
            LAST = ITEM['id']
            N += 1
            if ARGS.num > 0 and N >= ARGS.num:
                print '\t%d images attempted , exiting.' % N
                FINISHED = True
                break
        ITEMS = getitems( ARGS.reddit, LAST )

    print 'Downloaded %d of %d (Skipped %d, Exists %d, Failed %d)' % (D, N, S, E, F)
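
The file type check in the script could be pulled out into a small function like the one below. This is a sketch for Python 3 with a hypothetical helper name, `is_jpeg`. Unlike the script, which compares the whole header value, it ignores any parameters that follow the media type in the Content-Type header.

```python
def is_jpeg(content_type, url):
    """Decide whether a response looks like a JPEG image.

    `content_type` is the Content-Type header value, or None if absent.
    `url` is only consulted when the header is missing.
    """
    if content_type:
        # Drop parameters such as '; charset=...' after the media type.
        media_type = content_type.split(';')[0].strip().lower()
        return media_type == 'image/jpeg'
    return url.lower().endswith(('.jpg', '.jpeg'))
```

Preferring the header over the URL means a link that ends in .jpg but actually returns an HTML page is rejected instead of being saved as a broken image.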