Insider: A Debugging Story -- Visual Studio Magazine

Insider: A Debugging Story

Peter Vogel looks at debugging from a different specific: One incident whose resolution depended, in part, on understanding who'd already searched for the bug and the history of the bug itself.

By Peter Vogel
11/03/2011

I've been doing a lot of LINQ programming recently, so my next column is probably going to be called something like "Stupid LINQ Tricks". Because LINQ, like SQL, is a declarative language you don't get a lot of code to look at when your code isn't doing what you want. But fortunately, much of your debugging effort shouldn't be obsessively examining code. I've even got a story that demonstrates the importance of understanding context, gathering evidence, doing an IS/is NOT analysis, and (the most important part) getting a good description of the bug.

I had been working at my first company for about a year when I got assigned "the negative inventory problem": For the previous two years, every once in a while an item in the company's inventory would show a negative balance. There was a report that ran every night that checked for these negative balances and it had found one for the first time in a couple of months. Several people had tried to find the problem before -- now it was my turn.

My debugging process is simple: First, get a good description of the problem and generate some preliminary theories; second, stabilize the bug (make it happen any time I want); third, fix the bug.

I started building the description by finding all the places in our applications where the inventory count was updated. I found, as I remember, three routines that set the inventory count: one set from user input and two that set it from other data in the database. I didn't spend much time looking at the code because, if that's where the problem was, I figured one of the smart, clever people who'd looked at the bug before me would have already found the problem. It was, however, input to the description.

After looking at the code, I was ready to generate some preliminary theories. I had three:

I didn't understand how the tools worked and wasn't reading the code correctly. This seemed the least likely because, while I was fresh out of school and didn't know much, the previous investigators probably did.
The inventory count was being updated somewhere else and everyone had forgotten about it. This seemed pretty likely: the system had been around for a while and source control wasn't all it could be.
Some of the data used to update the inventory balance weren't valid, and when used to update the inventory count, generated a negative number. This also seemed possible, but the system inputs made it unlikely there was bad data in the system.

Improving the Description

I needed to improve my description, so I started to do an "IS/is NOT" analysis: The bug occurs when this IS true/the bug does not occur when this is NOT true. For instance, I could have instrumented the three routines to have them write to a log whenever they executed. The next time that an inventory count went negative, I could check the log and see which routine(s) had executed on the inventory item that went negative. If none of them had executed, that would point to option two (the bug is NOT associated with any known routine) and if only one of the routines had executed, that would point to that routine (the bug IS associated with this routine).

Before I did that, I took advantage of a log we already had: the bug reports listing every incident of our negative inventory account. I graphed that data by date to see what associations I could find (for instance: the bug IS happening on Mondays; the bug is NOT happening on holidays). What I discovered was that the bug had appeared out of nowhere about two years ago and had been occurring virtually every day. Gradually, however, the incidences had tapered off until, now, we might have a month without one.

Option 3 (bad data) seemed more probable now. The best explanation I could come up with was that something had been done just before the bugs started appearing that had "seeded" the database with bad data. Initially, the amount of bad data had been high so we had an incident every day. But -- I was extending my theory here -- whatever generated the bad inventory count (or the clean-up process afterwards) also "fixed" the bad data. As a result, the amount of bad data was decreasing over time and it was taking longer to hit bad data. Since the bugs had started on a Monday, the culprit was probably some job job run over the weekend which bypassed the normal inputs.

Stabilize and Fix

As soon as I described the bug this way to the rest of the team, two of the senior programmers who knew enough of the history of the system flagged the most likely cause. In literally two or three minutes, the guys were able to identify the job that would have generated the bad data and the data that the job would have inserted.

The next step was to make the bug happen whenever we wanted: To stabilize it. The guys ran a program that found some of the remaining data inserted by the weekend job two years ago, entered a transaction that would flag some of that data for processing, and waited for the report to show a negative balance the next morning. And the next morning it did. The guys then wrote a new program that found all the remaining bad data and fixed it. We never had another incident and retired the program that checked for the problem a year later.

Interestingly enough, though, I got all the credit for finding the problem even though the two guys who recognized my description did all the real work. I didn't point out that out. I mean, I looked like a genius. Nothing wrong with that.

About the Author

Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.

Printable Format

comments powered by Disqus

Featured

Mastering Blazor Authentication and Authorization

At the Visual Studio Live! @ Microsoft HQ developer conference set for August, Rockford Lhotka will explain the ins and outs of authentication across Blazor Server, WebAssembly, and .NET MAUI Hybrid apps, and show how to use identity and claims to customize application behavior through fine-grained authorization.
Linear Support Vector Regression from Scratch Using C# with Evolutionary Training

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the linear support vector regression (linear SVR) technique, where the goal is to predict a single numeric value. A linear SVR model uses an unusual error/loss function and cannot be trained using standard simple techniques, and so evolutionary optimization training is used.
Low-Code Report Says AI Will Enhance, Not Replace DIY Dev Tools

Along with replacing software developers and possibly killing humanity, advanced AI is seen by many as a death knell for the do-it-yourself, low-code/no-code tooling industry, but a new report belies that notion.
Vibe Coding with Latest Visual Studio Preview

Microsoft's latest Visual Studio preview facilitates "vibe coding," where developers mainly use GitHub Copilot AI to do all the programming in accordance with spoken or typed instructions.
Steve Sanderson Previews AI App Dev: Small Models, Agents and a Blazor Voice Assistant

Blazor creator Steve Sanderson presented a keynote at the recent NDC London 2025 conference where he previewed the future of .NET application development with smaller AI models and autonomous agents, along with showcasing a new Blazor voice assistant project demonstrating cutting-edge functionality.

Subscribe on YouTube

.NET Insight

Email Address*Country*

Please type the letters/numbers you see above.

Upcoming Training Events

0 AM

VSLive! 4-Day Hands-On Training Seminar: Hands-on with Blazor
May 5-8, 2025

Cybersecurity & Ransomware Live! VirtCon 2025
May 13-15, 2025

VSLive! 3-Day Hands-On Training Seminar: Master Modern JavaScript: Unlock the Full Potential of Your Code
June 2-4, 2025

VSLive! 2-Day Hands-On Training Seminar: Asynchronous and Parallel Programming in C#
June 24-25, 2025

VSLive! 4-Day Hands-On Training Seminar: Immersive .NET Full Stack Training: 4-Day Hands-On Experience
July 15-18, 2025

Visual Studio Live! @ Microsoft HQ
August 4-8, 2025

Visual Studio Live! San Diego
September 8-12, 2025

Live! 360 2-Day Hands-On Seminar: Swimming in the Lakes of Microsoft Fabric and AI – A Hands-on Experience
September 18-19, 2025

Live! 360 Orlando
November 16-21, 2025

Artificial Intelligence Live! Orlando
November 16-21, 2025

Cloud & Containers Live! Orlando
November 16-21, 2025

Cybersecurity & Ransomware Live! Orlando
November 16-21, 2025

Data Platform Live! Orlando
November 16-21, 2025

Visual Studio Live! Orlando
November 16-21, 2025

VSLive! 4-Day Hands-On Training Seminar: Immersive .NET Full Stack Training: 4-Day Hands-On Experience
December 16-19, 2025

Free Webcasts

> More Webcasts