Why Has Figma Reinvented the Wheel with PostgreSQL?
The data revealed that some of our tables, containing several terabytes and billions of rows, were becoming too large for a single database. At this size, we began to see reliability impact during Postgres vacuums, which are essential background operations that keep Postgres from running out of transaction IDs and breaking down. Our highest write tables were growing so quickly that we would soon exceed the maximum IO operations per second (IOPS) supported by Amazon’s Relational Database Service (RDS). Vertical partitioning couldn’t save us here because the smallest unit of partitioning is a single table. To keep our databases from toppling, we needed a bigger lever.
[…]
Horizontal sharding was an order of magnitude more complex than our previous scaling efforts. When a table is split across multiple physical databases, we lose many of the reliability and consistency properties that we take for granted in ACID SQL databases.
[…]
We built a DBProxy service that intercepts SQL queries generated by our application layer, and dynamically routes queries to various Postgres databases. DBProxy includes a query engine capable of parsing and executing complex horizontally sharded queries. DBProxy also allowed us to implement features like dynamic load-shedding and request hedging.
[…]
We avoided having to implement “filtered logical replication” (where only a subset of data is copied to each shard). Instead, we copied over the entire dataset and then only allowed reads/writes to the subset of data belonging to a given shard.
Denis Magda (via Hacker News):
Figma doesn’t use the open-source distribution of PostgreSQL. Instead, they utilize PostgreSQL as a service by subscribing to Amazon RDS. There’s an interesting, often overlooked fact about PostgreSQL managed services provided by large cloud providers and smaller vendors. While these services usually offer all the core PostgreSQL capabilities, the list of supported extensions is at the mercy of the service provider.
Now, we have CitusData, a mature PostgreSQL extension for sharding, and we know that Figma uses RDS, a fully-managed PostgreSQL service by Amazon. However, if you check the list of PostgreSQL extensions supported by RDS, CitusData isn’t included[…]
So, now, let me speculate. The real reason why Figma reinvented the wheel by creating their own custom solution for sharding might be as straightforward as this — Figma wanted to stay on RDS, and since Amazon had decided not to support the CitusData extension in the past, the Figma team had no choice but to develop their own sharding solution from scratch.