Graphing real data with MatPlotLib
In a previous post I covered the basics of graphing in Python with the MatPlotLib module. In this post I am going to demonstrate how to use MatPlotLib with some real-world data retrieved from a web service and then processed into a format usable by MatPlotLib.
The example script performs the following steps:
- Takes a specified stock’s ticker symbol and column to plot over time (from Open, High, Low, Close, Volume, Adj Close) as input.
- Fetches the corresponding stock data from Yahoo! Finance and saves it into a CSV file using the urllib module.
- Processes the data in the CSV file into a suitable format for matplotlib using the csv, datetime and matplotlib.dates modules.
- Plots a graph of the data over time using MatPlotLib and saves a copy as a PNG image.
Note: To keep the example concise, I am not performing any error handling.
"""Fetch specified stock data from Yahoo! and graph it with MatPlotLib."""
from csv import DictReader
from datetime import datetime
from urllib.request import urlretrieve  # Python 3 location of urlretrieve

from matplotlib import pyplot
from matplotlib.dates import date2num


def fetchstockdata( stockticker, filename ):
    """Fetch specified stock data and store it in named file."""
    url = 'http://ichart.finance.yahoo.com/table.csv?s=%s' % stockticker
    urlretrieve( url, filename )


def importstockdata( filename ):
    """Import CSV data into dict of lists, converting dates into timestamps."""
    results = {}
    # The csv module expects text-mode files opened with newline=''.
    with open( filename, newline='' ) as csvfile:
        for row in DictReader( csvfile ):
            for col in row.keys():
                if col == 'Date':
                    coldata = date2num( datetime.strptime( row[col], '%Y-%m-%d' ) )
                else:
                    # Convert numeric columns so they plot as numbers, not labels.
                    coldata = float( row[col] )
                results.setdefault( col, [] ).append( coldata )
    return results


def plotstockdata( stockdata, stockticker, dates, col ):
    """Use MatPlotLib to graph specified stock data."""
    pyplot.plot_date( stockdata[dates], stockdata[col], '-', xdate=True )
    pyplot.title( '%s - %s / %s' % (stockticker, col, dates) )
    pyplot.xlabel( dates )
    pyplot.ylabel( col )
    pyplot.savefig( '%s.png' % stockticker )
    pyplot.show()


if __name__ == '__main__':
    from sys import argv
    # Use first command-line argument as ticker and second as column.
    TICKER = argv[1].upper()
    COL = argv[2]
    # Grab the stock data from Yahoo!
    FILENAME = '%s.csv' % TICKER
    fetchstockdata( TICKER, FILENAME )
    # Import the data.
    DATA = importstockdata( FILENAME )
    # Plot the graph with Date as X-Axis and user-selected column as Y-Axis.
    plotstockdata( DATA, TICKER, 'Date', COL )
Running this script from the command line with “python StockChart.py goog 'Adj Close'” will produce a chart like the following.

This is a good example of why I like Python’s batteries-included philosophy so much: I spend more of my time writing interesting bits of code because the utility functionality I need has already been implemented or is only an easy_install away.
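The import step can be tried without a network connection. The following is a minimal sketch rather than part of the original script: it assumes Python 3, the helper name columns_from_csv and the sample rows are invented for illustration, and dates are parsed to datetime objects with the date2num conversion left out so that the snippet depends only on the standard library.

```python
"""Sketch: turn CSV text into a dict of column lists (sample data is invented)."""
from csv import DictReader
from datetime import datetime
from io import StringIO

# Made-up rows using the column layout described in the post.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume,Adj Close
2011-04-29,10.0,11.0,9.5,10.5,1000,10.5
2011-04-28,9.0,10.5,8.5,10.0,1500,10.0
"""


def columns_from_csv(lines):
    """Build {column name: [values]} from an iterable of CSV lines."""
    results = {}
    for row in DictReader(lines):
        for col, value in row.items():
            if col == 'Date':
                coldata = datetime.strptime(value, '%Y-%m-%d')
            else:
                coldata = float(value)
            results.setdefault(col, []).append(coldata)
    return results


COLUMNS = columns_from_csv(StringIO(SAMPLE_CSV))
print(sorted(COLUMNS))
print(COLUMNS['Adj Close'])
```

Each key of the resulting dict is a column name and each value is a list with one entry per row, which is the shape the plotting function in the script indexes by column name.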
Small steps to big goals
Scott McIntyre recently wrote a post for Zen Habits on ‘The Small-Scale Approach to Achieving Great Things’. I found his post particularly interesting, as I’ve observed a similar pattern over the years.
For me, the stumbling blocks to success were that I allowed myself to be overcome by the size of the task and how long it would take to get there. What really helped was to break down the overall journey into smaller steps and to stop looking as far into the future.
Successfully completing a big goal requires breaking it down into small, actionable steps. This doesn’t need to be an intricately detailed master plan; knowing as little as the next actionable step is sufficient to make progress. Without knowing at least the next actionable step, it is very easy to become discouraged by the scale of the goal.
It is also easy to underestimate how much the positive feeling of ‘getting stuff done’ when completing these small steps can encourage you to continue pursuing your goal to completion. Without the feeling of making progress towards a goal it is very easy to become discouraged and give up.
Basic graphing with MatPlotLib
One of the Python modules that has most interested me recently is MatPlotLib, a sophisticated graphing module that can be used to create journal-grade graphs of almost anything. The official gallery for MatPlotLib is worth checking out to get an idea of the sheer range of graph types it can be used to create.
It is simple enough to get started using MatPlotLib. For example, creating a line graph of x*x and saving it as a PNG file requires only the following:
"""Simple demonstration of MatPlotLib plotting."""
from matplotlib import pyplot

X = range(0,100)
Y = [ i*i for i in X ]
pyplot.plot( X, Y, '-' )
pyplot.title( 'Plotting x*x' )
pyplot.xlabel( 'X Axis' )
pyplot.ylabel( 'Y Axis' )
pyplot.savefig( 'Simple.png' )
pyplot.show()
The above script will produce the following graph:

To plot data over a time period, the simplest solution is to convert date/time values to timestamps using MatPlotLib’s date2num function and then plot them using the plot_date function, as follows:
"""Simple demonstration of MatPlotLib Date plotting."""
from matplotlib import pyplot
from matplotlib.dates import date2num
from datetime import datetime, timedelta

# Generate a series of timestamps from today to today + 100 years.
X = [date2num(datetime.today()+timedelta(days=365*x)) for x in range(0,100)]
Y = [i*i for i in range(0,100)]
pyplot.plot_date( X, Y, '-', xdate=True )
pyplot.title( 'Plotting x*x' )
pyplot.xlabel( 'X Axis' )
pyplot.ylabel( 'Y Axis' )
pyplot.savefig( 'SimpleDates.png' )
pyplot.show()
Which will generate a chart like the following:

As you can see it is fairly simple to graph data using MatPlotLib. This makes Python and MatPlotLib a compelling solution for data analysis when combined with the many available modules for dealing with common data storage formats like text (using RegEx), CSV, XML and JSON files and SQL databases.
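As one illustration of pulling plottable columns out of a data store, here is a minimal sketch using the standard library's sqlite3 module. The table name, column names and values are invented for the example, and the plotting call itself is left out so that the snippet runs without MatPlotLib installed.

```python
"""Sketch: read (x, y) pairs from an SQL database into two lists for plotting."""
import sqlite3

# In-memory database with a hypothetical 'samples' table.
CONN = sqlite3.connect(':memory:')
CONN.execute('CREATE TABLE samples (x INTEGER, y INTEGER)')
CONN.executemany('INSERT INTO samples VALUES (?, ?)',
                 [(i, i * i) for i in range(5)])

ROWS = CONN.execute('SELECT x, y FROM samples ORDER BY x').fetchall()
CONN.close()

# zip(*ROWS) transposes the list of row tuples into one tuple per column.
X, Y = (list(col) for col in zip(*ROWS))
print(X)  # [0, 1, 2, 3, 4]
print(Y)  # [0, 1, 4, 9, 16]
```

The two lists can then be handed to pyplot.plot( X, Y, '-' ) in the same way as in the first example.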
Why work doesn’t happen at work
This talk is by Jason Fried, co-founder of 37 Signals, speaking at TED MidWest 2010 about the hurdles of getting work done in the average office. Jason makes some good points about how working in offices can be counterproductive due to interruptions and unnecessary meetings.
I’ve noticed a trend over the years that the most productive individuals in the office tend to either arrive before everyone else or stay late after everyone else leaves to get quiet, interruption-free time to get stuff done. I’ve also yet to meet a programmer who was more productive in a high-interruption environment.
I know of at least one development team that has banned meetings on certain days of the week for their engineers, and before I started working at home I used to block off time in my calendar to prevent my days being fragmented by meetings.
Working at home I find I have the opposite problem and I need to set alarms to remind me to take breaks…
Praise for Python
- Simplified memory management
I am so much more productive when I am not having to worry about pointer-related errors (e.g. pointer math) or sweat the subtleties of memory management (e.g. memory alignment) while writing code.
- Less structural syntax
After using Python for a while I really appreciate its use of indentation to give a program structure, as it makes Python source code much more concise than C/C++.
- No compiling or linking
It is so much easier to stay in the flow when you’re not waiting 5-30 minutes for compilation and linking. I’ve recently taken to running PyLint when I miss the feedback from a compiler/linker on my program structure and to learn the coding style outlined in the Python Style Guide.
- Selective imports
Having worked on large-scale C/C++ projects for most of my career, I really appreciate the ability to import only what I want from modules and the option to also rename (or alias) what I’ve imported.
- Batteries-included philosophy
The sheer scope of the library of modules included in Python means I can spend more time writing the interesting parts of my programs, as most of the time the utility functionality I need is just an import away.
- Package management
The Python Package Index (PyPI) and the setuptools module make installing most Python modules as simple as ‘easy_install <module_name>’.
- Duck typing
Python’s use of duck typing emphasizes interfaces over types, which makes it so much easier to supply my own classes to standard library functions, as I only have to implement as much of the interface as is required.
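The selective import and duck typing points can be illustrated together. This is a minimal sketch, assuming Python 3: the alias CsvRows and the class FakeStockFeed are hypothetical names made up for the example, and the prices are invented.

```python
"""Sketch: an aliased selective import plus duck typing with the csv module."""
from csv import DictReader as CsvRows  # import a single name and alias it


class FakeStockFeed(object):
    """Hypothetical class that is not a file but yields CSV lines when iterated."""

    def __init__(self, prices):
        self.prices = prices

    def __iter__(self):
        yield 'Date,Close'
        for day, price in enumerate(self.prices, start=1):
            yield '2011-01-%02d,%s' % (day, price)


# DictReader only needs an object that yields lines when iterated,
# so the custom class works without subclassing any file type.
ROWS = list(CsvRows(FakeStockFeed([1.5, 2.5])))
print(ROWS[0]['Date'], ROWS[1]['Close'])
```

Only the iteration part of the file interface is implemented here, which is all the reader asks for; nothing else a real file object offers had to be written.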
Embracing chaos
Last week there was a very public service outage in the Amazon Web Services (AWS) cloud infrastructure that took down a large number of web sites and services. This post at the High Scalability blog is a very comprehensive list of articles and posts covering the AWS outage from all possible angles.
So many great articles have been written on the Amazon Outage. Some aim at being helpful, some chastise developers for being so stupid, some chastise Amazon for being so incompetent, some talk about the pain they and their companies have experienced, and some even predict the downfall of the cloud. Still others say we have seen a sea change in future of the cloud, a prediction that’s hard to disagree with, though the shape of the change remains…cloudy.
Particularly interesting is Netflix’s Chaos Monkey (see point 3) and how it has been credited with helping Netflix stay online during the outage.
One of the first systems our engineers built in AWS is called the Chaos Monkey. The Chaos Monkey’s job is to randomly kill instances and services within our architecture. If we aren’t constantly testing our ability to succeed despite failure, then it isn’t likely to work when it matters most – in the event of an unexpected outage.
Deliberately destabilising a live system to improve its robustness is counterintuitive, but it seems to have paid off for Netflix.
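The idea of killing instances at random and then checking that the service still responds can be shown with a toy model. This is only a sketch of the general technique and not Netflix's implementation; the Service class, its methods and the instance count are all invented for illustration.

```python
"""Toy sketch of chaos testing: kill a random instance, then check the service."""
import random


class Service(object):
    """Hypothetical service backed by several interchangeable instances."""

    def __init__(self, instance_count):
        self.alive = set(range(instance_count))

    def kill_random_instance(self, rng):
        """Remove one randomly chosen instance, as a chaos test would."""
        victim = rng.choice(sorted(self.alive))
        self.alive.discard(victim)
        return victim

    def handle_request(self):
        """Succeed as long as at least one instance is still running."""
        return len(self.alive) > 0


RNG = random.Random(42)  # seeded so that runs are repeatable
SERVICE = Service(instance_count=3)
KILLED = SERVICE.kill_random_instance(RNG)
print('killed instance %d, still serving: %s'
      % (KILLED, SERVICE.handle_request()))
```

With redundant instances the request check keeps passing after a kill; a service built on a single instance would fail the same check immediately, which is the kind of weakness this style of testing is meant to expose early.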








