35 Comments
Aug 7 · Liked by Alvaro Duran

Found this article through Hacker News and followed instantly, since it covers a field I’ve always been fearful to engage in. Handling payments sounds terrifying: you have to deal with both legacy software and behemoth institutions when money is so visibly on the line. This article helped taper my fears, though. If we design with failure as an expectation, we can fail more gracefully. There is no need to fear because there is no risk. The stakes are only high if a failure has no backup plan.

author

Hi Nate, I appreciate your comment, especially the last sentence. It captures the zeitgeist of what I'm going for with this post beautifully!

I believe this post resonated with a lot of people precisely because of its emotional component. We're just afraid to test stuff in prod. It's trite, but the solution is precisely leaning into that fear.

Aug 7 · Liked by Alvaro Duran

Loved it. Especially the emphasis on resilience vs attempting to "never fail".


I’m not in payments and yet, after reading this brilliant article, I subscribed to your channel. You touch on bedrock engineering principles which transcend your vital niche, methodically demonstrating how those principles apply to payments, but also hinting at how they apply throughout well-engineered systems. Well done.

author

Stephen, this made my day. Thank you so much for your kind words. I'll do my best to live up to the expectations in future articles :D

Aug 7 · Liked by Alvaro Duran

This is one of the better eng articles I’ve read lately. Keep it up!

author

Hope you stay tuned for more!

Aug 12 · Liked by Alvaro Duran

It is an amazing article. Thanks!

Aug 9 · Liked by Alvaro Duran

A colleague shared this article with me and I loved reading it. Your work on writing it was much appreciated. Thanks for that!


Great article! Super insightful! Keep it going!

Aug 8 · Liked by Alvaro Duran

Great article, super informative!

Aug 8 · Liked by Alvaro Duran

We have a similar situation at my work. Our system handles card processing. We can gather real-time data and use it in a staging environment, but nothing, I mean nothing, is the same as real-time. We found a bug and have stressed over how best to test the fix. I keep saying we just need to release it, watch the results, and roll back quickly if it fails. Your article was very timely. Thank you!

author

Good luck and have fun with your rollout!

Aug 8 · Liked by Alvaro Duran

super interesting

Aug 8 · Liked by Alvaro Duran

Love your work. I think Charity Majors is a legend, but the impact of you making these points from your position within Uber is huge. Even more so given you're speaking about payments, which is where s**t gets real. Never underestimate that impact.

I was pivoting to public cloud back when Netflix was scaling. People like Adrian Cockcroft talking about their very real problems and unique solutions has been instrumental to the way I think about resilient, scalable distributed systems. You continue that fine tradition & it's awesome. Thanks :)

author

Hey Andrew, thanks for this! Watch out for next week's post on Airbnb, I'm sure you'll like it :)

Aug 8 · Liked by Alvaro Duran

This was a great read, subscribed!

Aug 7 · Liked by Alvaro Duran

Awesome article. It is helpful and it increases my curiosity around getting that playbook.

Aug 7 · Liked by Alvaro Duran

This is total vindication for all the times I argued against spending vast engineering effort on the 'perfect' staging system. This approach makes more sense, especially when dealing with 3rd party APIs which often have garbage dev/staging environments.

author

I would add some nuance to what I said about third party sandboxes. They're not going to improve, but that's because they're built so that you can do API integration testing.

Which is a great first step, and much needed. But there's more to it than just making sure that schemas conform to the API.

Aug 7 · Liked by Alvaro Duran

I've worked on financial transaction systems that are way lower traffic than Uber, and it was still essential to test in production. It almost seems self-evident - the more critical the software is, the more important it is to test the final, live product properly.

I think the message of how your operations support this is the real takeaway here.

Thanks for writing!

author

I couldn't agree more, Dave. And yet, testing in prod makes most people cringe. What do you think is the reason?

Aug 7 · Liked by Alvaro Duran

I've worked in payments (not in this depth, but similarly, for an online game) about 15 years ago. Now I am working on developing a college course about the difference between prototyping and production. Thank you so much for providing such a useful reference on this topic as it pertains to payments!

author

Let me shamelessly plug my recent talk on the topic of prototype vs production: https://news.alvaroduran.com/p/enterprise-python

Hope it's useful!
