5 Common Antipatterns in Payment Systems Design
They're sensible approaches that end up becoming major obstacles for scale and maintenance down the road.
Good payment engineers don’t focus on performance or clean code. They focus on the business logic.
So, if your payment system is buggy or too complex to modify, it’s probably not the code itself—it’s a reckless architecture.
In this article, we’re going to discuss the 5 Antipatterns I’ve seen happen over and over in payment systems, put in place by well-intentioned but naive engineers.
This is The Payments Engineers Playbook. Let’s dive in.
Antipattern #1: Dealing with Sync-only responses from PSPs
I made the point last week, but it’s worth making it again: PSPs are in the business of making sure you get their communications, not in the business of handling them for you.
Every time a message is sent over the network, the PSP system must face an impossible problem.
On the one hand, it must ensure that no message is lost. An incomplete communication almost always leads to a failed payment, one that may very well be a perfectly valid purchase from a perfectly valid customer who won’t try again.
On the other hand, it has no way to confirm that the message was received. This is known as the Two Generals’ Problem, and it’s the reason why, after a TCP handshake, both parties have acknowledged the connection, but neither is certain.
PSPs can only increase the likelihood that the message was received, and they do that by sending it multiple times.
Payment systems often assume that the outcome of an authentication, capture, or refund will be right there in the response sent by the PSP.
Well, yeah, usually. But not always.
Sometimes, the PSP will send an event to your callback URL anyway. Just in case.
Think about it: most providers’ systems are distributed to achieve high availability. That’s a good thing. But it comes with tradeoffs. Such as nondeterminism.
That means that two services inside their complex web may have decided to communicate the outcome of a payment to you, independent of each other.
And you end up getting two confirmations for the same payment.
If you’re not careful, you may process these two messages as a double charge.
So, if you’re assuming synchronous responses, make sure each payment is locked for changes at retrieval (e.g. SELECT ... FOR UPDATE) to rule out two processes running in parallel and wreaking havoc on your system.
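A minimal sketch of that idea, assuming an in-memory payment and a per-payment lock standing in for the database row lock. The names Payment, apply_confirmation and the "pending"/"captured" statuses are hypothetical, made up for illustration; a real system would take the lock in the database rather than in process memory.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Payment:
    payment_id: str
    status: str = "pending"     # hypothetical statuses: pending -> captured
    captures_applied: int = 0   # counts how many times we acted on a confirmation
    lock: threading.Lock = field(default_factory=threading.Lock, repr=False)


def apply_confirmation(payment: Payment) -> bool:
    """Apply a 'captured' confirmation at most once.

    Returns True if this call changed the payment, False if it was a duplicate.
    """
    # The per-payment lock stands in for a row lock (SELECT ... FOR UPDATE):
    # only one handler at a time may read and then update this payment.
    with payment.lock:
        if payment.status == "captured":
            return False        # duplicate confirmation: nothing left to do
        payment.status = "captured"
        payment.captures_applied += 1
        return True


if __name__ == "__main__":
    payment = Payment("pay_123")
    results = []
    # Simulate the sync response and the callback being handled in parallel.
    threads = [
        threading.Thread(target=lambda: results.append(apply_confirmation(payment)))
        for _ in range(2)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(results, payment.captures_applied)
```

Whichever handler gets the lock first applies the confirmation; the second one sees the payment already captured and does nothing, so the payment is acted on once no matter how many confirmations arrive.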
Antipattern #2: Believing that a payment is the movement of money
Accepting payments doesn’t mean being responsible for the actual movement of money.
If you went to Starbucks this morning to order a delicious Mocha Cookie Crumble Frappuccino (size Venti), you may have paid with a credit card. When you swiped it on their POS terminal, you may have keyed in your secret PIN, and the terminal ended up showing a big green tick, or the equivalent of a “payment successful” message.
Believe me when I tell you that no money changed hands at that moment.
When you see the green tick, your bank has agreed to eventually send the funds to the cafe’s bank, because it has made sure it can get those funds from you.
A payment is related to a money movement. But it isn’t one. A payment is a promise.
More precisely, a payment is a promise made by an authorized party about a money movement.
This is why people hold strange beliefs, such as the idea that payments, once completed, are final, or that Bitcoin is a payment method.
The movement of money is called a Transfer. Payments are a wrapper around a Transfer, so to speak. A contract, though a very standard and very frequent one.
Most payment systems, however, pretend that Payments are actual movements of money, with amount, source and destination.
No wonder they end up bundling payment objects with the items paid for (what I call Orders) and the way the payer paid for those items (what I call Methods).
If you’re about to make that mistake, take a step back.
If you already made it, it’s time to go back to the whiteboard and devise a plan to untangle the mess before your life gets more difficult down the line.
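The distinction between a Payment and a Transfer drawn in this section can be sketched roughly as follows. This is only an illustration under my own assumptions: the field names, the "authorized" and "settled" statuses, and the settle method are hypothetical, not taken from any particular PSP.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Transfer:
    """The actual movement of money between two accounts."""
    amount_minor: int   # amount in minor units, e.g. cents
    currency: str
    source: str
    destination: str


@dataclass
class Payment:
    """A promise, made by an authorized party, about a future Transfer."""
    promised_by: str                      # e.g. the payer's bank
    amount_minor: int
    currency: str
    status: str = "authorized"            # the "green tick" moment
    transfer: Optional[Transfer] = None   # no money has moved yet

    def settle(self, source: str, destination: str) -> Transfer:
        """Record the money movement that fulfils the promise."""
        self.transfer = Transfer(self.amount_minor, self.currency, source, destination)
        self.status = "settled"
        return self.transfer


if __name__ == "__main__":
    payment = Payment(promised_by="payer_bank", amount_minor=695, currency="USD")
    print(payment.status, payment.transfer)   # authorized, and no Transfer yet
    payment.settle(source="payer_bank", destination="cafe_bank")
    print(payment.status, payment.transfer)
```

The point of the sketch is that source and destination live on the Transfer, not on the Payment: a Payment can exist, and look successful to the customer, before any Transfer does.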
Antipattern #3: Building on top of a Card-first design
If you only accept credit cards at checkout, you’re losing customers.
But if you think that accepting more payment methods will only require sprinkling just a little bit of code on top of your card-only payment system, you’re in for a surprise.
Stripe was one of the first companies that realized this:
We built support for new payment methods on top of a set of abstractions that were designed for the simplest payment method of them all: cards. Naturally, abstractions designed for cards were not going to be great at representing these more complex payment flows. [...] It’s as if we were trying to build a spaceship by adding parts to a car until it had the functionality of a spaceship
— Michelle Bu, Stripe’s payments APIs: The first 10 years
I’ve covered the biggest findings of Stripe’s redesign in a few articles already.
Stripe’s engineers ended up noticing that the transaction-specific data is in a one-to-many relationship with how the payment is processed, and created two objects, PaymentIntent and PaymentMethod, to make that relation explicit.
The what has many hows, so to speak.
Best if you separate at least 3 objects in every payment:
The Order, which contains the items being purchased, including things like discounts, taxes, and every item at checkout in a consistent manner.
The Intent, which tracks “what” is being paid, specifying amount, currency, and the status of the payment as a whole.
The Method, which contains the interaction with the payer (what you probably call “token”), with its own status and payer fraud data.
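As a rough sketch, the three objects above could be modeled like this. Every field name, status value and identifier here is a hypothetical choice for illustration; it is not Stripe’s schema, only one way to keep the three concerns apart.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LineItem:
    description: str
    amount_minor: int   # minor units; a discount is a negative amount


@dataclass
class Order:
    """What is being purchased: items, discounts, taxes."""
    order_id: str
    items: List[LineItem] = field(default_factory=list)

    def total_minor(self) -> int:
        return sum(item.amount_minor for item in self.items)


@dataclass
class Method:
    """How the payer pays: the interaction with the payer."""
    method_id: str
    kind: str                # e.g. "card" or "bank_transfer"
    token: str
    status: str = "created"  # the Method has its own status


@dataclass
class Intent:
    """What is paid: amount, currency and overall payment status."""
    intent_id: str
    order_id: str            # points at the Order instead of embedding it
    amount_minor: int
    currency: str
    status: str = "requires_method"
    methods: List[Method] = field(default_factory=list)  # one what, many hows


if __name__ == "__main__":
    order = Order("ord_1", [
        LineItem("Frappuccino", 695),
        LineItem("Loyalty discount", -100),
        LineItem("Tax", 48),
    ])
    intent = Intent("int_1", order.order_id, order.total_minor(), "USD")
    intent.methods.append(Method("mth_1", "card", "tok_abc"))
    intent.methods.append(Method("mth_2", "bank_transfer", "tok_def"))
    print(intent.amount_minor, [m.kind for m in intent.methods])
```

Note that the Intent only references the Order by ID and holds a list of Methods, so adding a new payment method means adding a new kind of Method rather than changing the Order or the Intent.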
Each one has its own lifecycle, but I’m getting ahead of myself. Which brings me to the next antipattern.
Antipattern #4: Stretching State Machines
Yeah, I know. Everyone agrees that state machines are The Way to do things.
But maybe the issue is that no one builds correct state machines in the first place?
Like, without loops?
A cycle in a state machine may well be two independent state machines.
Multiple baseball players step onto the same field. Multiple scientific hypotheses are discarded before a valid theory is found. And the same payment can be attempted multiple times until it goes through.
The key is that each player, each hypothesis and each attempt are not the same. In each case, a state machine that looks cyclical can be seen, as an improved alternative, as multiple state machines transitioning from one state to another, without ever reaching the same step again.
Just like that quote from Heraclitus: not the same river, not the same man.
— Payment State Machines Are Not Cyclical
I’m not surprised that Stripe’s API redesign was finally given the green light once they realized that they had force-fed circular state machines to their users.
You and I, however, don’t have the luxury of redoing things. We have to get them right the first time.
Noticing this antipattern only reinforces what I said earlier: you may be using a single Payment object with a circular state machine, when a much cleaner design with Intent and multiple Methods works better.
If you go the Intent-Methods route, you’ll quickly realize that you’re doing away with circular state machines.
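Here is a minimal sketch of what that can look like, assuming one Intent that owns several Method attempts. The state names, the transition table and the `succeeds` flag (which stands in for the PSP’s answer) are all hypothetical; the only thing the sketch is meant to show is that a retry creates a new Method instead of sending an old one back to an earlier state.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Each Method only moves forward: no state can be reached twice.
METHOD_TRANSITIONS: Dict[str, Set[str]] = {
    "created": {"processing"},
    "processing": {"succeeded", "failed"},
    "succeeded": set(),   # terminal
    "failed": set(),      # terminal: a retry is a NEW Method, not a loop
}


class InvalidTransition(Exception):
    pass


@dataclass
class Method:
    token: str
    status: str = "created"

    def transition(self, new_status: str) -> None:
        if new_status not in METHOD_TRANSITIONS[self.status]:
            raise InvalidTransition(f"{self.status} -> {new_status}")
        self.status = new_status


@dataclass
class Intent:
    amount_minor: int
    currency: str
    status: str = "open"   # open -> paid, and nothing else
    methods: List[Method] = field(default_factory=list)

    def attempt(self, token: str, succeeds: bool) -> Method:
        """Run one attempt; `succeeds` stands in for the PSP outcome."""
        if self.status != "open":
            raise InvalidTransition("intent is already paid")
        method = Method(token)
        self.methods.append(method)
        method.transition("processing")
        method.transition("succeeded" if succeeds else "failed")
        if succeeds:
            self.status = "paid"
        return method


if __name__ == "__main__":
    intent = Intent(1000, "EUR")
    intent.attempt("tok_declined", succeeds=False)
    intent.attempt("tok_ok", succeeds=True)
    print(intent.status, [m.status for m in intent.methods])
```

What would have been a "failed, back to processing" loop on a single Payment object becomes two short, linear histories: one Method that failed and another that succeeded, both attached to the same Intent.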
Antipattern #5: Sticking to Early Database Decisions
Most of us are fine running on top of a PostgreSQL database.
But maybe, at a particular scale, that’s not the right choice anymore. What if, in the future, you consider using a finance-specific database like TigerBeetle?
What if you end up growing like crazy, and your database is stretched beyond what it can handle?
I’ve talked a lot about this problem in payments.
Look, this is the kind of advice that I think deserves some nuance: as things are, currently, you’re probably OK.
And it’s a hell of a lot easier to say that You Are Not Gonna Need It, and leave things the way they are.
But if you’re aiming higher than where you are now in terms of scale, or better yet, if you’re growing fast and the pace is accelerating, you will end up rewriting your architecture to accommodate that growth eventually.
Why not try to extend the usable life of your architecture as it currently is?
Oh, by the way, if you want to see this particular trick in action, give this note a like.
That’s it
So, the next time you sit down to code, ask yourself:
Is there an asynchronous case I’m not considering?
Am I bundling Order, Intent and Method?
Am I using the wrong abstraction to process payments at this stage?
Am I using a circular state machine?
Are the data systems I’m using the right choice for this particular scale?
You’ll start writing more robust payment systems, with fewer bugs and simpler code.
This has been The Payments Engineers Playbook. I’ll see you next week.
P. S.: Don’t forget to share this article with someone on your team!