Richard Pryor Stole a lot of Half Cents and Bought a Ferrari
Why Floats and Decimals as Money Make Illegal States Representable
You've heard that one before: never represent money with a float.
Floats are good for science because they can represent humongous numbers in small memory spaces. The trick? Because they store numbers in base 2, most decimal fractions (0.1, for instance) can only be approximated. For anything that requires precision, floats are a recipe for disaster.
And precision is, precisely, all that matters when it comes to money.
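Here is a minimal sketch of that drift in Python, whose floats are standard double-precision values. The payment amounts are made up for illustration:

```python
# Ten payments of 10 cents each should add up to exactly one dollar.
payments = [0.10] * 10
total = sum(payments)

print(total == 1.0)  # False
print(total)         # 0.9999999999999999
print(0.1 + 0.2)     # 0.30000000000000004
```

No single operation is off by much, but a ledger that must add up to the penny can't afford even that.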
Given how often we use software to deal with money, I've always been shocked by the fact that no major programming language has a built-in money type.
You would expect otherwise, right? I guess it has to do with the fact that most applications deal with money in a single currency throughout (the so-called reporting currency).
But commerce is now e-commerce, and dollars aren't as globally relevant as they used to be. Money software nowadays has to deal with many currencies, all while keeping up with the increasing volume and speed. And it all has to add up to the penny!
Implementing your system using a single reference currency is no longer a good idea.
This is why one of the most valuable questions I ask potential payments engineers in interviews is: what is the best data structure to represent monetary values?
A few years ago, if a candidate said "not floats", that would've been enough.
But I don't think that's a good enough answer anymore.
I'm Alvaro Duran, and this is The Payments Engineer Playbook. You've probably watched tons of tutorials online on how to build software that moves money around. That helped you get the job. But now you're tasked with maintaining and scaling one of those systems, with down-to-the-penny exactness, and you're starved for resources on how to do it.
You're not alone. It happened to me as well.
Over the last ten years, I've worked as a software engineer in fintech companies of all sizes, and I've seen what worked and what didn't. I've been part of many conversations about the best tools, the best resources, the best patterns. But these conversations have always been behind closed doors.
And I thought "you know what? It's time we have these conversations in the open".
In The Payments Engineer Playbook, we investigate the technology that transfers money. All to help you become a smarter, more skillful and more successful payments engineer. And we do that by cutting off one sliver of it and extracting tactics from it.
Today's article is about the best data structure to represent money. You may have opinions, but the reality is that only one data structure is consistently good and leaves no room for errors. Literally.
This article focuses on:
Why money isn't part of your business logic
The two most common approaches to represent monetary values, from companies like Stripe, Modern Treasury, Moov Financial, and TigerBeetle (they don't agree on this topic)
Which one is best; facts, not opinions
Enough intro, let's dive in.
Money Is Part Of Your Universe
The trickiest part about your domain logic is that it must represent both the business and its universe.
There's the well-known "business" layer that every speaker at DDD Europe never gets tired of talking about. That's software representing what you want your product to do, irrespective of the fact that it's software.
But there's also a more subtle "universe" or "environment" layer, that constrains what your product can do, ruling out a few of those possibilities as absurd.
A customer may not be allowed to return a purchase after 30 days, but customers must not be able to return something they haven't bought.
I hesitate to call the universe a "layer". It's not like the universe is stacked below business, but above I/O. The constraints of the universe in which a software system exists permeate all the logic in the codebase.
That's not unlike security, after all. In some sense, the security of your software is about making sure that the logic of the universe is applied consistently.
This universe layer is globally coupled. Which sounds bad, because globally coupled code is difficult to change [1]. But that is the whole point: the universe layer is meant to be the bedrock of the domain logic.
And, of course, in money software, how money gets represented belongs to the universe layer.
Which is why nailing this part of the software is crucial, and so difficult to change later.
Half Cents Floating Around
Aside from floats, there are at least two ways in which you can represent monetary amounts.
One is called "decimals as amounts". You're probably using this one.
Programming languages don't have money as a first-class data type, but they have something close enough: a `decimal` type. It ships with Java, Ruby and Python, and is available through third-party libraries in Rust and Go. It is a type that "works in the same way as the arithmetic that people learn at school." [2]
You know, the one where `0.1 + 0.1 + 0.1 - 0.3` is exactly 0, and not `5.5511151231257827e-017`.
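The same expression can be checked with Python's standard `decimal` module. This is only a sketch of the comparison; note that the `Decimal` values are built from strings, because `Decimal(0.1)` would inherit the float's approximation:

```python
from decimal import Decimal

as_float = 0.1 + 0.1 + 0.1 - 0.3
as_decimal = Decimal("0.1") + Decimal("0.1") + Decimal("0.1") - Decimal("0.3")

print(as_float)         # 5.551115123125783e-17
print(as_decimal)       # 0.0
print(as_decimal == 0)  # True
```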
Moov Financial deliberately chose this way to represent monetary amounts in its codebase.
The decimal library gives us some useful tools right out of the box, like performing basic arithmetic on values with different decimal points, for example, 0.001 + 0.0001.
That's the power of decimals. To preserve significance, they do not truncate trailing zeros. Your global configuration is meant to specify the precision of the amount, and the rounding rules to be applied. Which sounds like a good thing.
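A short sketch of those properties using Python's `decimal` module. It uses `localcontext()` rather than the global context so the example doesn't change settings for the rest of a program; the precision of 4 digits is an arbitrary choice for illustration:

```python
from decimal import Decimal, ROUND_HALF_EVEN, localcontext

# Trailing zeros are kept, and operands with different scales just work.
print(Decimal("1.30") + Decimal("1.20"))     # 2.50
print(Decimal("0.001") + Decimal("0.0001"))  # 0.0011

# Precision and rounding rules come from the context, not from the value.
with localcontext() as ctx:
    ctx.prec = 4                    # significant digits, chosen arbitrarily
    ctx.rounding = ROUND_HALF_EVEN  # banker's rounding
    third = Decimal(1) / Decimal(3)
print(third)  # 0.3333

# Rounding to a fixed number of places is an explicit, separate step.
rounded = Decimal("0.125").quantize(Decimal("0.01"), rounding=ROUND_HALF_EVEN)
print(rounded)  # 0.12
```

Notice that nothing in the value itself says how many decimal places a currency allows. That knowledge lives in configuration.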
But it's not.
Moov Financial adopted decimals-as-amounts in a refactor. Its previous design stored amounts as integers of USD cents (implicitly assuming that 100 cents make a full dollar), alongside an ISO 4217 string for the currency code.
This design, however, creates an issue if we change the currency to another one that isn’t based on two decimal places, i.e., we have no way of swapping out how cents are represented.
Yes, there are currencies like the Kuwaiti, Bahraini, Tunisian dinars, and the Omani rial, subdivided into 1,000 fils, fils, milims, and baisa, respectively [3]. But there are other, more crucial issues with assuming 100 cents per full currency:
Gasoline is priced in fractions of a cent, with 3 decimal places in the US (why is that the case? 1930s prices, and a Depression era in which a 1 cent increase was a huge deal). Foreign exchange rates, utility bills, and goods sold wholesale also use sub-cent prices.
Transaction fees, which are the bane of any trader's existence.
Cryptocurrencies have extremely small subunits, like Bitcoin (divided into 100 million satoshis), and those subunits are significant due to their (currently) high exchange rate to USD.
Moving all this rounding into the environment is a really bad idea.
For example, take regulatory fees such as the TAF (FINRA's Trading Activity Fee). Every time you sell your stocks in the US market, the SEC and FINRA charge fees on the sale. Technically, they're applied at the end of the day. But many, many investment platforms simply can't do that, because they can't deal with sub-cent amounts, and are forced to round the amount per transaction.
This is not a bug: rounding up to the nearest cent, times millions of transactions per day, is a lot of money (though some platforms like Alpaca Markets have already moved to "daily consolidation").
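To put numbers on it, here is a worked sketch. The fee of $0.002 per trade and the count of 1,000 trades are hypothetical figures picked for easy arithmetic, not actual regulatory rates:

```python
from decimal import Decimal, ROUND_CEILING

CENT = Decimal("0.01")
fee_per_trade = Decimal("0.002")  # hypothetical sub-cent fee
trades = 1_000                    # hypothetical daily volume

# Rounding up on every transaction: each trade is charged a full cent.
per_transaction = sum(
    fee_per_trade.quantize(CENT, rounding=ROUND_CEILING) for _ in range(trades)
)

# Daily consolidation: add up first, round once at the end of the day.
consolidated = (fee_per_trade * trades).quantize(CENT, rounding=ROUND_CEILING)

print(per_transaction)                 # 10.00
print(consolidated)                    # 2.00
print(per_transaction - consolidated)  # 8.00
```

Under these assumptions the customer pays five times the fee that was actually owed, and the gap grows linearly with the number of trades.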
Decimals help with this problem, which is why many engineers choose decimals to represent money.
The problem, though, is that sub-cent amounts do not exist.
Sub-cents Aren't Part Of Your Universe
Amounts that are more precise than the indivisible unit of a currency don't really exist.
And yet, when you use decimals-as-amounts, you're allowing them into your universe. When you do that, you're introducing the very real chance that some bug may slip into your system.
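A minimal sketch of how that happens, using a made-up bill of 9.99 split between two people. With decimals the half cent appears silently; with integer cents the division forces you to decide what to do with the leftover cent:

```python
from decimal import Decimal

# Decimals-as-amounts: a perfectly legal value that no one can ever pay.
half_bill = Decimal("9.99") / 2
print(half_bill)  # 4.995

# Integers-as-cents: the remainder is explicit and has to be handled.
each, left_over = divmod(999, 2)
print(each, left_over)  # 499 1
```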
Watch out for one of your IT chads driving a Ferrari to work, like Richard Pryor in Superman III.
Half cents floating around? Not on my watch.
That's why you should use integers as cents.
The trick is simple, and it's used by Stripe, Monzo, Modern Treasury, and TigerBeetle. Money, in the integers-as-cents approach, is a data structure that consists of 3 variables:
The integer, which represents the amount of indivisible units
The exponent, or asset scale, which represents how many decimal places separate the indivisible unit from a full unit of currency (USD is divided into 100 cents, that is 10^2, so the exponent is typically 2)
The ISO 4217 currency code
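Those three variables could be sketched in Python as follows. The `Money` class and its field names are hypothetical, not taken from any of the companies mentioned, and the exponent is assumed to be a power of ten (so USD uses 2, meaning 10^2 = 100 cents per dollar):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    amount: int    # number of indivisible units (cents for USD)
    exponent: int  # 10**exponent indivisible units make one full unit
    currency: str  # ISO 4217 code

    def __post_init__(self):
        # Reject floats, Decimals and bools: only whole units can exist.
        if isinstance(self.amount, bool) or not isinstance(self.amount, int):
            raise TypeError("amount must be an integer number of indivisible units")

    def __add__(self, other):
        if not isinstance(other, Money):
            return NotImplemented
        if (self.currency, self.exponent) != (other.currency, other.exponent):
            raise ValueError("cannot add amounts with different currency or exponent")
        return Money(self.amount + other.amount, self.exponent, self.currency)

    def __str__(self):
        # Presentation only: the stored value never stops being an integer.
        sign = "-" if self.amount < 0 else ""
        major, minor = divmod(abs(self.amount), 10 ** self.exponent)
        if self.exponent == 0:
            return f"{sign}{major} {self.currency}"
        return f"{sign}{major}.{minor:0{self.exponent}d} {self.currency}"


print(Money(1999, 2, "USD") + Money(1, 2, "USD"))  # 20.00 USD
print(Money(1500, 3, "KWD"))                       # 1.500 KWD
```

Trying `Money(19.99, 2, "USD")` raises a `TypeError`, and adding USD to KWD raises a `ValueError`, so neither mistake can travel far.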
The problem with this approach is that it is confusing for the client. So, you're adding the extra job of having to represent the amount in a way that makes sense to them, or spending enough time and documentation to make sure they understand.
It’s probably why you didn’t think of it in the first place.
Here's the thing: integers-as-cents doesn't allow accidental sub-cent amounts. You can introduce exceptions (by tweaking the exponent and making the process of adding up explicit), but if you provide a set of defaults for the currencies you support, there is no way that a naive engineer will introduce a rounding mistake that could cause damage.
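One way to sketch such defaults in Python. The table, `to_minor_units` and `split` are hypothetical helpers written for this example, and the table covers only a handful of currencies:

```python
from decimal import Decimal

# Default exponents per currency (an illustrative subset).
DEFAULT_EXPONENTS = {"USD": 2, "EUR": 2, "JPY": 0, "KWD": 3}


def to_minor_units(text: str, currency: str) -> int:
    """Parse a human-readable amount into indivisible units, or fail loudly."""
    scaled = Decimal(text).scaleb(DEFAULT_EXPONENTS[currency])
    if not scaled.is_finite() or scaled != scaled.to_integral_value():
        raise ValueError(f"{text} {currency} is not a whole number of indivisible units")
    return int(scaled)


def split(amount: int, parts: int) -> list[int]:
    """Split an amount so that every unit is accounted for and none is divided."""
    base, remainder = divmod(amount, parts)
    return [base + 1 if i < remainder else base for i in range(parts)]


print(to_minor_units("19.99", "USD"))  # 1999
print(to_minor_units("1.5", "KWD"))    # 1500
print(split(100, 3))                   # [34, 33, 33]
```

A sub-cent input such as `to_minor_units("19.995", "USD")` raises a `ValueError` at the boundary, so the rest of the code only ever sees whole units. Who receives the extra cent in `split` is an arbitrary choice here; the point is that it is made explicitly.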
In other words, integers-as-cents is closer to the principle of Making Illegal States Unrepresentable.
The MISU principle
Integers-as-cents is an example of what Hynek Schlawack called Design Pressure: the data structure that makes the problem solve itself.
This is the underlying idea behind Making Illegal States Unrepresentable. You use the data structures that make it impossible to create entities in your domain that cannot exist.
Customers must not be able to return something they haven't bought, and your data structure should prove that the item was bought by the client before being able to call refund on it.
The key idea is that you do that, not by having an extra layer of code on top, or below your business logic. You do that by choosing the appropriate types as your function parameters.
MISU then gets checked every time your codebase gets compiled. A compile error is a place in your code where the MISU principle gets broken.
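A rough sketch of the refund example in Python. All the names here are hypothetical. Python isn't compiled, so the check described above would come from a static type checker such as mypy rather than from a compiler, and nothing stops a determined caller from building a `PurchasedItem` by hand:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    sku: str


@dataclass(frozen=True)
class PurchasedItem:
    """Proof that a given customer bought a given item."""
    item: Item
    customer_id: str
    receipt_id: str


def purchase(item: Item, customer_id: str, receipt_id: str) -> PurchasedItem:
    # The intended way to obtain a PurchasedItem is to actually buy something.
    return PurchasedItem(item, customer_id, receipt_id)


def refund(purchased: PurchasedItem) -> str:
    # No "was this bought?" check needed: the parameter type already says so.
    return (
        f"refunded {purchased.item.sku} to {purchased.customer_id} "
        f"(receipt {purchased.receipt_id})"
    )


bought = purchase(Item("sku-1"), "cust-1", "r-1")
print(refund(bought))  # refunded sku-1 to cust-1 (receipt r-1)
```

Calling `refund(Item("sku-1"))` is flagged by the type checker before the code runs, because a bare `Item` carries no proof of purchase.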
This principle can be applied everywhere, not just money. Alexis King has a great intro on it called Parse, Don't Validate.
Integers-as-cents is the better approach because decimals-as-amounts violate the MISU principle. Ignore it, and Richard Pryor will steal a lot of half cents from you.
And will buy a Ferrari.
That’s it for this article of The Payments Engineer Playbook. See you next week.
[1] Coupled code is prone to change amplification: if you change it somewhere, you have to change it everywhere.
[3] There are also the Mauritanian ouguiya, which is divided into 5 khoums, and the Malagasy ariary, divided into 5 iraimbilanja. But I've never seen any payments system even consider them.
Great post on how to not mess up your ledger!
Even though it's mentioned implicitly, part of this has been a problem before computers. It's an intrinsic arithmetic problem.
You can see how you need to allocate in "arbitrary fashion" when paying installments for a loan.
Regardless, banker's rounding is a fun topic in itself.
https://en.wikipedia.org/wiki/Rounding#Round_half_to_even
Keep these posts coming!