July 19th, 2009
Each week in our alpha post, we briefly list the latest changes to Overgrowth. Some weeks we have a bunch of cool new features to point out, but other weeks it's a bit harder to see what has changed. This is partly because adding new features is often just the icing on the cake. A lot goes on behind the scenes. This includes groundwork, maintenance, optimization, documentation, code cleaning, and debugging. Sometimes, the most important work of our week is lumped at the end of the alpha post in the unassuming catchall, "Bug fixes."
This week I've been working on a number of bugs, mostly involving editor groups. So, I thought I'd walk through my encounter with a particularly sneaky bug, one that has been hiding at the core of the editor since November.
Noticing the bug
The first step to fixing a bug is noticing that something is wrong. This can be harder than it may seem. In fact, a lot of companies have whole divisions of engineers dedicated to testing and validating software. Since November, I had noticed several times that rotation didn't feel quite right. The mouse controls were a bit sticky. I sensed a bug creeping in the crevices, but decided to ignore it as long as it stayed out of my way.
Unfortunately, a new feature, object grouping, came along and pulled this bug right out into the light. With individual objects, a small error in rotation is hard to notice. But, with grouped objects, any slight divergence in orientation breaks the arrangement and really stands out.
Isolating the source
Having noticed that something was wrong, I started looking for the problem in the group code. Half the battle in debugging is figuring out where the bug starts. Often bugs will have numerous and varied effects far downstream from their source. To isolate the source of a bug, I usually start by looking at a particular symptom (e.g. group rotation is broken), and then begin heading upstream.
The group code looked okay to me, so I paddled upstream a bit and examined regular multi-object rotation. Now that I knew what sorts of errors to look for, it was easy to see that the regular multi-object rotation was also misbehaving. And from there, it was a quick step to realize the problem existed for all object rotations, from the ground up.
The better you know your project, the easier it is to diagnose symptoms and jump directly to the source of the bug. Through years of training, medical doctors learn to suspect a certain ailment when they observe certain symptoms. For example, if a patient experiences numbness on one side of the body, confusion, and headaches, the doctor may suspect a stroke. I sometimes find myself learning similar rules for the Phoenix engine's 'biology.' For example, if I observe massive structural trauma to my carefully stacked tower of boxes, I may suspect an error in transforming between vector spaces.
In the present case, the structural trauma was minor (maybe a few dislocated joints and hyper-extended ligaments), but it was enough for me to take a second look at my transformation code. Something was definitely off, but I wasn't quite sure what. It was time to take out the tools.
Deducing the cause
Perhaps the simplest and most reliable debugging tool is the printout statement. With this technique, you pinpoint a variable you are curious about, and then simply print its value onto the screen. Then, you can watch as the value changes in real time. Scanning the printouts, it's easy to catch problematic, outlier values.
For the buggy rotations, I put in a few printouts to report the values of the relevant vectors involved. As I rotated objects about, all the printed values fell in a reasonable range. More thorough analysis was required.
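To see why printouts can come up empty here, consider a minimal Python sketch (not the engine's code; `debug_axis` and both sample vectors are made up for illustration). A normalized axis that points the wrong way still prints components between -1 and 1 and a length of 1, so nothing looks like an outlier:

```python
import math

def debug_axis(label, v):
    """Print a vector and its length; return the printed line (hypothetical helper)."""
    length = math.sqrt(sum(c * c for c in v))
    line = f"{label}: ({v[0]:+.3f}, {v[1]:+.3f}, {v[2]:+.3f}) |v|={length:.3f}"
    print(line)
    return line

# Illustrative values only: the axis we wanted, and a skewed axis
# that is still unit length after normalization.
intended_line = debug_axis("intended", (0.707, 0.707, 0.0))
skewed_line = debug_axis("skewed  ", (0.243, 0.970, 0.0))
```

Both lines report a length of 1.000, which is why a wrong direction is so easy to miss in a scrolling list of numbers.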
Often, the best way to see what is wrong with a vector is to just draw the vector. For example, if two vectors are meant to be perpendicular, it's a lot easier to see that in graphics than it is to recognize in a printed-out list.
For the rotation bug, however, I decided to skip straight to the most intense type of debugging: using an actual debugger. Debuggers are programs that allow you to pause execution of an application and check out how things look under the hood. Once the app is paused, you can move the app forward one line of code at a time. In this fashion, I carefully 'stepped' through each line of my rotation handler.
At this point, I was pretty sure I was at the right section of code, but it was still hard for me to see exactly what was wrong with it. Bugs tend to be like this. Even when they are right in front of you, they can be easy to overlook. So, I tried to figure out what I could possibly be overlooking. This part of debugging is more art than science. For me it requires questioning each of my assumptions and coming at the code from as many new angles as I can think of. Eventually, one angle clicks, and all at once, the cause of the bug hits me.
Having convinced myself that the rotation trigonometry must be correct, that the input parameters were in perfect working order, and that the handler algorithm could not be wrong, I began to double-check my assumptions. In order to rotate an object, I first transform the axis of rotation into the vector space of that object. During this process, the axis gets scaled. Shortly after, I re-normalize the axis (setting its overall length back to 1). Without thinking, I had assumed that since I re-normalized the axis, the prior scaling didn't matter. And so, I became blind to any possible effect the scale might have. Unfortunately, the overall magnitude of the scale was not all that mattered. Non-uniform scaling could squish the axis of rotation, and thereby change its direction in addition to its size. This squished axis led to a jostled rotation, which percolated all the way up into the symptom I had initially noticed: mangled groups.
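The effect is easy to reproduce in isolation. The following is a small Python sketch under simplifying assumptions: the scale is applied per component directly to the axis, and all the names and numbers are invented for the example rather than taken from the engine. It compares a uniform scale with a non-uniform one, re-normalizing after each:

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def scale_vec(v, scale):
    """Apply a per-axis scale to v."""
    return tuple(c * s for c, s in zip(v, scale))

def angle_between_deg(a, b):
    """Angle in degrees between two vectors."""
    a, b = normalize(a), normalize(b)
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(dot))

axis = normalize((1.0, 1.0, 0.0))

# Uniform scale: re-normalizing recovers the original axis.
uniform = normalize(scale_vec(axis, (3.0, 3.0, 3.0)))
# Non-uniform scale: re-normalizing fixes the length, not the direction.
squished = normalize(scale_vec(axis, (1.0, 4.0, 1.0)))

uniform_error = angle_between_deg(axis, uniform)
squished_error = angle_between_deg(axis, squished)
print(f"uniform scale error:     {uniform_error:.2f} degrees")
print(f"non-uniform scale error: {squished_error:.2f} degrees")
```

With these made-up numbers, the uniformly scaled axis comes back essentially unchanged, while the non-uniformly scaled one ends up roughly 31 degrees off, even though both have length 1.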
Fixing the bug
Once you understand exactly what is going wrong, fixing a bug is usually trivial. To fix the rotations, I simply stopped applying scale to axes of rotation. Since this fix was deep in the code, a number of symptoms downstream are now cured! Group rotations are fixed, free (left-click) rotation is smoother and more precise, and a subtle multi-object rotation issue is now also resolved.
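As a rough illustration of that kind of fix, here is a Python sketch, again not the engine's actual code. It assumes an object whose transform is a rotation about the z axis plus a per-axis scale, and the function names are hypothetical. The buggy version undoes both rotation and scale when moving the axis into object space; the fixed version undoes only the rotation:

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def rotate_z(v, degrees):
    """Rotate v about the z axis by the given angle."""
    r = math.radians(degrees)
    c, s = math.cos(r), math.sin(r)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)

def axis_to_object_space_buggy(axis, obj_rotation_deg, obj_scale):
    """Undo rotation AND scale: non-uniform scale skews the axis."""
    local = rotate_z(axis, -obj_rotation_deg)
    local = tuple(c / s for c, s in zip(local, obj_scale))
    return normalize(local)

def axis_to_object_space_fixed(axis, obj_rotation_deg, obj_scale):
    """Undo rotation only; obj_scale is deliberately ignored for axes."""
    return normalize(rotate_z(axis, -obj_rotation_deg))

axis = normalize((1.0, 1.0, 0.0))
scale = (1.0, 4.0, 1.0)  # made-up non-uniform scale
buggy = axis_to_object_space_buggy(axis, 30.0, scale)
fixed = axis_to_object_space_fixed(axis, 30.0, scale)

# Angle between the two results shows how far the buggy axis drifts.
dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(buggy, fixed))))
drift_deg = math.degrees(math.acos(dot))
print(f"axis drift caused by scale: {drift_deg:.2f} degrees")
```

Because the axis only describes a direction, leaving scale out of its transform loses nothing, and every feature built on top of that transform picks up the fix for free.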
In many people's minds (my own included), debugging can be the worst combination of drudgery and frustration. It's certainly not a glamorous activity. But, strangely enough, we are thrilled when Dr. House diagnoses an obscure malady, and we wish it were we, rather than Sherlock Holmes, who solved the case of the Hound of the Baskervilles. Is debugging code really so different from diagnosing illness and solving crimes? We cure our code of its data infections and track down memory thieves and procedural delinquents. So why do we find debugging so much less captivating than medicine or crime fighting?
How do you guys approach debugging? Have you come across any nifty techniques or tools of the trade?