RE: Teaching students memory management

Jani Hartikainen has written an excellent post in reply to my earlier post about teaching software engineering students memory management, and his post is well worth a read.  I started off writing a comment on his post as a reply but I ended up writing more than I expected as I refined my ideas.

I agree that teaching a higher level language with built-in memory management as a first programming language is the more humane option for first-time students. Learning your first programming language and all the associated concepts is hard enough without the nasty memory-related gotchas of a language like C or C++. That said, the nice thing about starting with something like C or PHP (as Jani suggests) is that teaching object orientation can be avoided initially, and that particular concept does seem to be one that some students struggle with a lot the first time they encounter it.

However, I do think every programmer should at least experience a low level language with manual memory management, as it is fundamental to programming effectively. I believe that even minimal experience with some form of manual memory management would help most programmers write higher performance programs. Perhaps I was being a bit overzealous in recommending that everyone learn a language like C or C++ as a first language, but I still think at least a short course featuring a language with manual memory management would be invaluable to all programmers. The course would not even need to cover object orientation, as you can teach that in higher level languages that are easier to work with than C++: the key point of the course would be to teach memory management and its implications for writing fast, high quality software.

As much as I go on about knowing assembler, I admit it's not something I write very often at all. It is, however, incredibly handy to be able to read and comprehend. Even the most basic understanding of the instructions for loading data to and from memory and registers, branching and basic math operations would allow you to check that the compiler has actually generated the assembly code you expected. This is especially useful for debugging unexpectedly slow code in compiler optimised builds. You also don't need to know about all the fancy vendor specific assembly instructions: as I've mentioned, the basics are usually sufficient to understand roughly what a slow piece of code is doing and then rewrite the higher level source code in a way that prompts the compiler to generate faster assembly code. Actually writing assembly code should always be the last resort, and only done by experts after all other higher level refactorings have been attempted, as higher level optimisation or refactoring work is usually more effective and easier to understand. Also, assembler is not usually easily portable to other hardware platforms, and most programmers find it harder to debug assembler than normal C/C++ source code. Plus, as Jani mentions, it is scary to find a block of inline assembler in the C++ program you are working on, as it is much harder to decipher than regular source code unless it is very well documented.
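As a rough sketch of the kind of check described above, here are two hypothetical C functions (the names are mine, purely for illustration) that compute the same sum. Assuming GCC or Clang, compiling with something like `gcc -O2 -S` writes out the generated assembly so the two inner loops can be compared. In the first version the compiler has to allow for `total` pointing into `values`, so it may store the running sum back to memory on every iteration; in the second the sum is a local that can stay in a register. What you actually see will depend on your compiler and target.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: accumulates through a pointer supplied by the caller.
   Because `total` might alias an element of `values`, the compiler may have
   to write *total back to memory inside the loop. */
void sum_through_pointer(const int *values, size_t count, int *total)
{
    /* Precondition checks; building with -DNDEBUG removes them. */
    assert(values != NULL && total != NULL);

    *total = 0;
    for (size_t i = 0; i < count; ++i) {
        *total += values[i];
    }
}

/* Same result, restructured at the source level: the running sum is a
   local variable, so the compiler is free to keep it in a register. */
int sum_in_local(const int *values, size_t count)
{
    assert(values != NULL);

    int total = 0;
    for (size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

Neither version needs any hand-written assembler: the point is only that reading the output tells you whether the source-level rewrite had the effect you hoped for.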

This ability to check what is going on 'under the hood' of your language is essential for tracking down those hard-to-find bugs and for optimising your application for performance, especially where the compiler has reordered the program flow or generated unexpectedly slow code. This applies to high level languages as well as C/C++: checking that the IL byte code generated by your C# compiler, or the Java byte code generated by your Java compiler, is doing what you expected can be very useful in understanding your program's execution and performance.

Teaching students memory management

Yesterday morning I spent a fascinating hour or so in a meeting listening to a very senior engineer give an in-depth presentation about the performance characteristics of the low level memory systems on a console, and the content of the presentation got me thinking about how software engineering is taught. Specifically, are students being taught enough about low level languages and the hardware, so that they have some experience of what is happening under the hood of these fancy modern memory managed languages like Java, C#, PHP, Python or Ruby? If students are not being taught assembler, or languages where you need to think about managing memory and the underlying hardware such as C or C++, how can they be expected to program effectively in a higher level language? After all, on systems with finite resources like consoles, the memory system is one of the biggest potential bottlenecks on the platform and also one of the most subtle to understand.

It used to be the case that students would first be taught assembler, then a language like C, before moving on to a higher level object orientated language like C++ or Java. This was already starting to change when I went to university: the mainstream computer science students were being taught in Java from the start. The course I was on started in C with some assembler plus hardware theory and then moved on to C++, but this was the exception. I'd also already encountered assembler in high school, which I'd imagine is no longer the case in high school computing classes today. Talking to the students we have had on placement at work recently, it sounds like C# and Java are now the languages that are mainly being taught. This means only the theory of memory management and hardware concepts is being taught, with no practical experience to reinforce that knowledge.

Why is this a big deal? Well, if a programmer does not understand how the system their high level language runs on works, how can they be expected to program in that high level language in a way that takes advantage of the available hardware and system architecture? Or, at a minimum, avoid the bottlenecks inherent in the particular hardware configuration they are working on? For example, how could they appreciate the potential performance implications of accessing a variable length array vs a fixed length array of structures in a key inner loop if they do not understand how these data types have been implemented in the underlying language libraries, or how the underlying hardware will process them?
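To make the array example a little more concrete, here is a minimal sketch in C of the two layouts, assuming a made-up `Particle` structure and a hand-rolled growable array (all of the names here are hypothetical, and this is not how any particular library implements it). The fixed length version stores its elements inside the object itself; the variable length version keeps them on the heap behind a pointer, and has to reallocate and possibly move them as it grows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record type, used only for illustration. */
typedef struct {
    float x, y, z;
} Particle;

#define FIXED_COUNT 1024

/* Fixed length: the storage is part of the object, its size is known
   at compile time and it never moves. */
typedef struct {
    Particle items[FIXED_COUNT];
} FixedParticles;

/* Variable length: the storage lives on the heap behind a pointer and
   may be moved by realloc whenever the array grows. */
typedef struct {
    Particle *items;
    size_t count;
    size_t capacity;
} GrowableParticles;

/* Appends one particle, doubling the capacity when the array is full.
   Returns 1 on success, 0 if the allocation failed. */
int growable_push(GrowableParticles *g, Particle p)
{
    assert(g != NULL);
    if (g->count == g->capacity) {
        size_t new_capacity = g->capacity ? g->capacity * 2 : 16;
        Particle *grown = realloc(g->items, new_capacity * sizeof(Particle));
        if (grown == NULL) {
            return 0; /* the old block is still valid and still owned by g */
        }
        g->items = grown;
        g->capacity = new_capacity;
    }
    g->items[g->count++] = p;
    return 1;
}

/* Releases the heap storage; in C nothing does this for us. */
void growable_free(GrowableParticles *g)
{
    assert(g != NULL);
    free(g->items);
    g->items = NULL;
    g->count = 0;
    g->capacity = 0;
}

/* Inner loop over the fixed array: the trip count is a compile-time constant. */
float sum_x_fixed(const FixedParticles *f)
{
    assert(f != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < FIXED_COUNT; ++i) {
        sum += f->items[i].x;
    }
    return sum;
}

/* Inner loop over the growable array: the data pointer and the count
   are both loaded at run time. */
float sum_x_growable(const GrowableParticles *g)
{
    assert(g != NULL);
    float sum = 0.0f;
    for (size_t i = 0; i < g->count; ++i) {
        sum += g->items[i].x;
    }
    return sum;
}
```

Both loops walk contiguous memory, so the differences are in the extra indirection, the cost of growing, and who is responsible for freeing the storage. Whether any of that matters in a given inner loop depends on the compiler and the hardware, which is exactly why it helps to understand what sits underneath.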

Maybe this has something to do with our high performance requirements, and with our platforms being closed systems with more finite resources than a common PC, but we see this sort of thing a lot, especially with new software engineers straight out of university. A lack of understanding of the performance characteristics of our programming languages and our platforms' hardware leads these new software engineers to write code that is not only less sound than a more senior engineer's (which is to be expected) but also has much poorer performance characteristics.

This is not a problem with our students, however; it is a problem with how they are taught. I think part of the reason is that it is harder to teach someone a language like assembler or C or even C++ than a managed language like Java or C#. Thinking about low level concepts like memory and cache management is not easy, and concepts like pointers, memory alignment or function pointers are not intuitive for many students, so I'd imagine it can be frustrating to teach those classes. And it is easier to teach object orientation in a language like C# or Java than in C++, but does that mean that we should skip straight to the memory managed high level languages?

Without the fundamental knowledge of the basics that experience with lower level languages supplies, it is very hard for a student to later write high performance applications without having to go back and learn the basics in their own time. And I believe writing high performance software is something most software engineers will be required to do, even in this Web 2.0 generation. Anyone who has had to make a website scale to thousands of concurrent users can no doubt attest that performance is not just for those of us who program the more traditional high performance systems…