The Bugs They Missed – Finding Bugs in Production

I started my working life as a programmer so long ago that I was writing BCPL on a PDP-11 with no Copilot to help me. There was no Internet, so I couldn’t go to StackOverflow to copy and paste code, and there were no unit tests and no CI or regular “bug blitzes” to fix all the bugs in production. After many years of this, I was looking for a change, and a move to testing and quality assurance (QA) appealed to me. I found a book on testing that blew my mind because it described so many ways to test a program that we were not using.

Customers do it better.

I wanted to do some of this but met opposition from higher levels of management who thought agile meant giving the program a quick look over before shipping because “customers are better at finding bugs than we are.”

I’ve posted this quote many times and sometimes get a reply of, “Well, it’s true.”

Yes, customers will use programs in ways never imagined. When email came out, was spam ever considered? Did Samsung or Porsche consider that Doom could be played on their washing machines and cars?

However, customers won’t find the bugs you already found (and fixed) during your own testing, so there is still a valid reason to test before shipping.

Look at the holes.

All of this reminded me of survivorship bias and the story about bombers in World War II. After each mission, the bullet holes and damage on each bomber were painstakingly reviewed and recorded. Analysts looked over the data for vulnerabilities, and it began to show a clear pattern: most damage was to the wings and body of the plane. At first, it seemed logical to reinforce those areas …

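However, more thought showed that the holes in the returning aircraft marked areas where a bomber could take damage and still fly well enough to return safely to base. The bombers hit elsewhere never came back to be counted, so the areas without holes were the ones that needed protecting.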

So, what does this have to do with software and bugs?

Check those known bugs.

If, during testing, you find a bug in production that hasn’t been reported by the customer, this could mean a few things.

  1. They haven’t found it, which could mean they’re not using the area where you found the bug. Maybe add some logging to that area and see if it’s being used. And, if not, why not?
  2. Your monitoring hasn’t picked this up. Add a check for it, then test to see whether it catches the bug now.
  3. Reporting a bug is too hard. Is there a way for a customer to log an issue they found? And even if there is, how likely are they to actually use it?
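
As a rough illustration of the first point, here is a minimal Python sketch of usage logging. Everything in it is an assumption made for the example: the `track_usage` decorator, the `export_report` feature and the in-memory counter are hypothetical, and a real system would probably send these events to whatever logging or analytics service it already uses.

```python
import functools
import logging
from collections import Counter

logger = logging.getLogger("feature_usage")

# Hypothetical in-memory tally; a real system would likely persist this.
usage_counts = Counter()


def track_usage(feature_name):
    """Count and log every call to the decorated function."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            usage_counts[feature_name] += 1
            logger.info("feature used: %s", feature_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@track_usage("export_report")
def export_report(rows):
    # Stand-in for the area where the unreported bug was found.
    return ",".join(str(row) for row in rows)


def unused_features(known_features):
    """Return the features that have never been called."""
    return sorted(name for name in known_features if usage_counts[name] == 0)
```

If a feature still shows up in `unused_features` after a few weeks in production, that is a hint that customers are not reaching the area at all, which would explain why nobody reported the bug.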

What happens when a customer finds a bug in production?

If it is a new bug, then it can point to an area where your testing is lacking. This could be:

  1. The production environment does not match your test environment.
  2. Users aren’t using it the way you’re testing. Maybe they use a mouse and your testers only use the keyboard. So randomize your testing.
  3. They’re using different phone models and/or operating system versions than you’ve been using.
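
One way to randomize testing along the lines of the second point is to generate seeded sessions that mix input devices. The Python sketch below is only an illustration: the `ACTIONS` table and the `perform` callback are hypothetical stand-ins for whatever UI driver you actually use.

```python
import random

# Hypothetical actions; replace with what your application supports.
ACTIONS = {
    "keyboard": ["tab", "enter", "shortcut_save"],
    "mouse": ["click_button", "double_click", "right_click_menu"],
}


def random_session(seed, steps=10):
    """Build a reproducible sequence of (device, action) pairs."""
    rng = random.Random(seed)  # seeded, so a failing run can be replayed
    session = []
    for _ in range(steps):
        device = rng.choice(sorted(ACTIONS))
        session.append((device, rng.choice(ACTIONS[device])))
    return session


def run_session(session, perform):
    """Run each step; return details of the first failure, or None."""
    for step, (device, action) in enumerate(session):
        try:
            perform(device, action)
        except Exception as exc:
            return step, device, action, exc
    return None
```

Recording the seed with each run matters here: when a random session fails, the same seed rebuilds the same sequence of steps so the bug can be reproduced.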

Having bugs in production is never pleasant, but use them as a learning opportunity so that no one can say, “Customers are better at finding bugs than you.” It’s a good mindset to have.
