The joys of premature optimisation
Tuesday, August 5th, 2003Common sense and old fashioned know-how have warned us for years about the evils of premature optimisation. The problem is that this rule of thumb is a bit like traffic laws; designed to cater to the lowest common denominator. This law assumes that the average developer is on the wrong side of the moron/competence scale, and so it’s just safer to let this little idiot go about his merry way without obfuscating his code in the name of optimisation.
During the oscache overhaul, Chris Miller and I decided that it’d be rather handy to be able to specify times for cache expiry. Simple stuff, like at every midnight during a weekday. We also had a number of user requests for it, and cron style expressions seemed a natural fit.
I previously had some written some old cron code so I was going to donate that, but I couldn’t find it and Chris decided it might be fun to write it from scratch anyway. We both looked at the Quartz scheduler’s cron code and had that by-now-familiar bleeding eyes syndrome. For fun, we both went ahead and implemented it, and decided that the fastest implementation is what will get checked in (morally, it was a nice fuck you to pair programmers everywhere too). His version beat mine (not by much, thankfully), so his was what we went with. For fun, I decided to compare it to Quartz’s cron parser, just to see how much premature optimisation can help.
I thought it might be an unfair competition, so I imposed an additional handicap; the oscache version would have to calculate a previous fire time as well as parse the expression, whereas the quartz version just had to parse. Lo and behold, the oscache version is at least an order of magnitude faster, and in average to best case expressions, can be up to a hundred times faster.
Looking over the Quartz code makes it very clear why it’s so pitifully slow. For one thing, every expression creates 7 TreeSets, an assorted smattering of StringTokenizers, and a bevy of Calendars/Dates. The code is littered with naive childish ways of doing things, ignoring the most trivial of shortcuts. For example, it uses things like Integer.parseInt(String.valueOf(c)) to convert a char to an int. Perhaps a brief glance at an ASCII table might reveal a rather obvious shortcut, given that a cron number has a small enough range that you don’t need to worry about all the things that parseInt has to.
Surprisingly, the Quartz code is also too verbose, hard to decipher, and astoundingly unclear. For code that is so time critical, it is odd to see such a cavalier attitude towards object creation and type conversions. The oscache version hardly ever uses objects, preferring instead to stick to primitive arrays.
I’m not advocating that you all go out and throw out your Strings, your hosts of Collections, or your cute and cuddly objects. Sometimes though, just sometimes, it pays to know up front what performance characteristics you’re expecting, and how much it matters, compared to all other concerns.