It’s easy when writing code to see a need for a number and grab the easiest thing to use. In the case where only whole numbers are necessary, say counting something, that’s easy. You can happily grab the integer type available in your language and move on. What if you need more than just whole numbers though? Well, that’s easy too; just grab the floating point type and it’s the same deal! If you need more precision or bigger numbers, use the double type, right? What could go wrong?
The Problem with Floating-Point Numbers
Doubles (double-precision floating-point numbers) are widely available in every modern language. They can store a wide range of numbers, small and large, whole or fractional. They are also easy to work with and have extensive standard library support. That’s because processors are optimized for floating-point arithmetic, so languages can support floats at a very low level; even languages with elaborate type systems almost always expose them as a primitive type. However, they come with a very real, fundamental limitation: precision.
Under the hood, floats store numbers in binary format (which is why processors handle them so efficiently). To be specific, a float stores a binary fraction and a binary exponent: when you save a number, it is represented as x * 2^y, so it stays binary through and through. The issue is that a lot of numbers are simply not representable in this format. Even a simple number like 0.1, when stored as a float, is actually stored as the closest representation it can muster, which ends up as 0.100000001490116119384765625. More precision, like moving to a double or even a quad, will bring this closer to the actual value of 0.1, but it will NEVER be exact. You can play around with a float converter here to see this in action!
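You can also see this straight from Python (just an illustrative check; Python’s float is a 64-bit double, so the approximation is closer than the single-precision value above but still inexact):

```python
from decimal import Decimal

# Decimal(float) captures the exact value the double actually stores.
print(Decimal(0.1))
# 0.1000000000000000055511151231257827021181583404541015625

# The "0.1" you normally see printed is just friendly rounding.
print(f"{0.1:.30f}")
# 0.100000000000000005551115123126
```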
The Use Cases
In many cases, this marginal precision error is not that important. It doesn’t matter if the car sales per hour you’re measuring in your app are slightly off. Even in cases where precision IS important, say you’re measuring latitude and longitude, a double would take you a long way. In the example above, any use case that only needs 8 significant figures gets a value that rounds cleanly back to 0.1 anyway. And that was just a standard float, so a double would almost certainly be enough. Even if it wasn’t, you would notice right away: a few decent test cases or a keen manual test would immediately spot values changing if rounding errors were occurring.
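To make that concrete, here’s a small check that squeezes 0.1 through a 32-bit float using Python’s struct module (plain Python floats are already doubles, so this is just a way to mimic single precision):

```python
import struct

# Round-trip 0.1 through a single-precision (32-bit) float.
single = struct.unpack("f", struct.pack("f", 0.1))[0]

print(f"{single:.30f}")  # 0.100000001490116119384765625000
print(round(single, 8))  # 0.1 -- rounded to 8 decimal places, it's cleanly 0.1 again
```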
However, the scariest part is when precision loss is NOT immediately obvious. This takes us to the true greatest enemy of floats and first graders alike: MATH. If you take a float like 0.1 and multiply it by another float, 0.1, you’ll get a new float that is farther from 0.01 than either original float was from its intended value of 0.1. That’s because you actually multiplied two numbers that WEREN’T 0.1. Every time you do this, the value strays farther and farther from what it should be. Eventually, these cumulative errors will encroach on your targeted precision, and you’ll get a rounding error. You may never even notice that it happened or how much it eventually diverges by, but it’s there.
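A minimal demonstration of that drift in Python (double precision here, but the same pattern holds for any binary float):

```python
# Repeated multiplication compounds the representation error.
value = 0.1
for _ in range(5):
    value *= 0.1          # mathematically this should end at 0.000001

print(f"{value:.25f}")    # slightly off from 0.000001
print(value == 0.000001)  # False

# Simple accumulation drifts too.
print(sum(0.1 for _ in range(10)))  # 0.9999999999999999, not 1.0
```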
When Problems Collide
This takes us to the title of the post. In one common situation, values are very long-lived, lots of math gets performed on them, and precision is extremely important: money. This isn’t the only scenario where this is true, but it is by FAR the most common. It may seem far-fetched to imagine this being a real problem with how small the errors are. With enough users and enough transactions per day, though, it could become a very expensive problem. That’s not even mentioning the legal complications that could result or the costs of failing software audits. The plot of a certain movie comes to mind…
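For a rough sense of scale, here’s an entirely made-up tally of a million ten-cent transactions in Python, comparing a float running total against a Decimal one:

```python
from decimal import Decimal

float_total = 0.0
decimal_total = Decimal("0.00")

# One million hypothetical $0.10 transactions.
for _ in range(1_000_000):
    float_total += 0.10
    decimal_total += Decimal("0.10")

print(float_total)    # not exactly 100000.0 -- the error is tiny, but it's there
print(decimal_total)  # 100000.00, exactly
```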
What to Do
It turns out that the answer is actually quite easy: use a decimal data type. These types typically store an integer value and an integer scale factor (a power of 10 this time, to avoid the binary representation issues floats have). There are a few other ways these types can work, but one way or another, the result is a perfect representation of numbers up to a user-specified precision. In most cases you can even defer rounding to that precision until you decide it’s time, which keeps errors from compounding.
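Python’s standard decimal module is one example of this behavior (a quick sketch; the prices, tax rate, and rounding rule below are just placeholders):

```python
from decimal import Decimal, ROUND_HALF_UP

# Constructing from strings keeps the values exact.
price = Decimal("0.1")
print(price * price)  # 0.01, exactly

# Intermediate results stay exact; rounding happens only when you ask for it.
tax = Decimal("19.99") * Decimal("0.0825")
print(tax)                                                    # 1.649175
print(tax.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))  # 1.65
```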
You also have control over any loss of precision, since you know when and where it’s happening. This is in contrast to floats, where one stored value may be very close (like 0.1, which a single-precision float matches to about 8 significant digits) while another, especially one produced by a chain of math, can have drifted much farther. As a bonus, many decimal implementations can also store enormous values without dropping digits, sometimes with arbitrary precision!
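Here’s a quick sketch of that extra headroom using Python’s decimal module (the 50-digit precision setting is arbitrary):

```python
from decimal import Decimal, getcontext

# Precision is whatever you ask for -- 50 significant digits here.
getcontext().prec = 50

big = Decimal("12345678901234567890.123456789012345678901234567890")
print(big + Decimal("1e-30"))
# 12345678901234567890.123456789012345678901234567891
```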
Because this is such a common issue across so many situations, every modern language has some kind of access to a decimal type. Python has decimal.Decimal, C# has decimal, and Java has BigDecimal. Even JavaScript, which has no native decimal type, has plenty of well-tested and widely used libraries for this, like bignumber.js or decimal.js. These types are easy to use and capable of anything you would need, as long as you’re willing to learn the slight differences between working with them and working with a primitive type.
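Those slight differences are mostly about how you construct and mix the values. In Python, for instance:

```python
from decimal import Decimal

# Build decimals from strings (or ints), not floats -- otherwise you just
# capture the float's error exactly.
print(Decimal("0.1"))  # 0.1
print(Decimal(0.1))    # 0.1000000000000000055511151231257827021181583404541015625

# Mixing decimals and floats in arithmetic raises a TypeError instead of
# silently losing precision.
# Decimal("0.1") + 0.1  # TypeError
```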
The Takeaway
As long as you think about the precision and math requirements of a number, you should be able to pick the right tool for the job. Just remember that there are almost no downsides to using decimal formats. And since it can be hard to predict how an application will grow or change over time, apply a healthy dose of safety first when designing a system. You can’t always get back the precision lost by storing something as a double!