Red herrings in Uni-Code

Something that bit me recently at work: beware visual red herrings!

I just spent a fair while trying to figure out why the Unicode character Í (0x00CD, upper case I with acute) wasn’t loading and appearing correctly in an application, only to discover two days later, after pulling all the associated systems apart trying to find the bug, that it wasn’t a bug: we’d identified the wrong character! The character we’d actually been getting was ĺ (0x013A, lower case L with acute), which looks very close to the character we thought we were getting, Í. Note that the font used in this blog makes the differences a lot more obvious.

We’d identified the problem character from the source document, which was an Excel spreadsheet, so we had no way (that we knew of) to check the actual Unicode value of the character. This meant the programmer initially investigating the bug identified the character manually, by going through the Windows Character Map application (Start->My Programs->Accessories->System Tools->Character Map, iirc) until he found a character that looked like the one he thought he was seeing in Excel. As 0x00CD is lower than 0x013A, the first character he encountered that looked right was 0x00CD, not the correct 0x013A. So the Unicode value was never verified in the debugger…
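For what it’s worth, checking the real code point of a suspicious character only takes a few lines once you can get the text into a script. The following is a minimal Python sketch, not what we actually ran at the time; the helper name `describe` is made up for illustration, and it assumes the text has already been copied or exported out of the source document with its encoding intact.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character in text.

    Hypothetical helper for illustration only.
    """
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# The two lookalike characters from this story, written as escapes so the
# editor's font cannot mislead us.
for ch, code_point, name in describe("\u00CD\u013A"):
    print(ch, code_point, name)
```

Printing the official Unicode name alongside the code point means you are comparing words such as "CAPITAL LETTER I" and "SMALL LETTER L" rather than two near-identical glyphs.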

The moral of the story? Verify the identity & specifics of the problem in your application before you start pulling stable libraries to pieces looking for a problem that isn’t there. Be especially wary if suddenly just one block of data out of tens of thousands starts exhibiting incorrect behavior, as that suggests an error with that block rather than the library: I’d expect more than one occurrence of a bug in a library that is handling tens of thousands of strings.

Instead, it took me too long to get round to the tedious chore of syncing up to the application code base, building it and deploying it so I could test in the application, and only then did I discover this error. Although it has to be said I was avoiding debugging the application itself, as it takes at least a day of watching text scroll by in syncs and builds before you can start debugging, which is tedious at best.

We have resolved the issue now, but it’s a good example, I think, of a red herring.
