Intellectually Curious

Multigres: A Scalable Operating System for Postgres

Mike Breault

Multigres is an open-source project designed to provide Vitess-grade scalability and high availability for Postgres databases. Recently released in its v0.1 alpha stage, it functions as a comprehensive management system that handles connection pooling, automatic failovers, and backup orchestration. The platform utilizes a specialized Kubernetes operator to simplify cluster deployment and uses a unique consensus protocol to ensure data integrity during hardware failures. Its sophisticated architecture includes a two-service pooling solution that transparently routes traffic while maintaining connection state without requiring manual mode selection.


Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.

Sponsored by Embersilk LLC

SPEAKER_00

So I was hosting this big family dinner last weekend, right? And I had this grand illusion that I could roast a chicken, whisk together a complex reduction sauce, and you know, actively entertain my cousins all at the exact same time.

SPEAKER_01

Uh yeah, that sounds like a recipe for definitely dropping something.

SPEAKER_00

Oh, it absolutely was. I totally dropped an entire tray of roasted vegetables on the floor while I was trying to pour wine. You can really only spin so many plates before one just inevitably shatters.

SPEAKER_01

Right, exactly. There's a hard limit to human bandwidth.

SPEAKER_00

Totally. Well, anyway, welcome to Intellectually Curious. We are so glad you are here with us today for another highly optimistic deep dive. And today's mission is all about exploring a tool that basically stops developers from dropping plates.

SPEAKER_01

Yeah, and it's a really exciting one. We are looking at Multigres version 0.1 alpha today.

SPEAKER_00

Right, which is being pitched as this sort of operating system for Postgres. And honestly, the phrase that really caught my eye in the docs was "Vitess for Postgres."

SPEAKER_01

I mean, it's a bold claim, for sure, but it makes a lot of sense when you look at the core problem. You know, Postgres is this incredibly mature, reliable database, but once your traffic hits a certain scale, managing it stops being about the database itself.

SPEAKER_00

Right, because you aren't just writing SQL anymore at that point. You're suddenly manually managing read replicas and failovers and all that stuff.

SPEAKER_01

Exactly. And connection limits. It becomes a massive operational chore, just tons of manual duct tape. But Multigres, which just released this open-source alpha, by the way, and I think a Supabase version is actually coming soon, uses a Kubernetes operator to just orchestrate all of that.

SPEAKER_00

So wait, is this kind of like hiring a dedicated kitchen manager? Like, so the head chef can finally just focus on actually cooking the food instead of, um, organizing the pantry?

SPEAKER_01

That is a great way to look at it. Yeah. Instead of a DevOps team constantly scrambling to manually reroute traffic when a node goes down, Multigres natively handles the logistics. So the engine just processes data.

SPEAKER_00

Okay. But a kitchen manager is great until the power goes out. Yeah. So how does Multigres keep the database alive when things actually break?

SPEAKER_01

Well, so it treats high availability, or HA, as a consensus problem. But the really cool thing is it allows for user-defined durability policies.

SPEAKER_00

Oh, interesting. Like what kind of policies?

SPEAKER_01

So you aren't boxed into strict majority quorums like you are with traditional distributed systems. You can just configure it to say, hey, I just need my data to survive a single availability zone failure, and it optimizes for exactly that policy.
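
Sketch (not from the episode): the difference between a strict majority quorum and a "survive one availability zone failure" policy can be illustrated in a few lines of Python. The Node type, the zone names, and both functions are hypothetical and are not Multigres's API; the only assumption is that a write counts as durable once the nodes that acknowledged it satisfy the chosen policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    zone: str  # availability zone label (hypothetical)


def majority_ack(acked, cluster):
    """Classic rule: more than half of all nodes must acknowledge the write."""
    return len(acked) > len(cluster) // 2


def survives_one_zone_failure(acked):
    """Policy rule: copies must exist in at least two distinct zones,
    so losing any single zone still leaves one copy."""
    return len({node.zone for node in acked}) >= 2


cluster = [
    Node("pg-0", "az-a"), Node("pg-1", "az-a"), Node("pg-2", "az-a"),
    Node("pg-3", "az-b"), Node("pg-4", "az-c"),
]

same_zone_acks = cluster[:3]               # three acks, all in az-a
cross_zone_acks = [cluster[0], cluster[3]]  # two acks, az-a and az-b
```

Here same_zone_acks is a majority but would be wiped out by a failure of az-a, while cross_zone_acks is not a majority yet satisfies the zone policy. That gap is what a user-defined durability policy is meant to close.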

SPEAKER_00

Wow. Okay.

SPEAKER_01

And the most impressive part, it does all of this using standard unmodified Postgres replication.

SPEAKER_00

Let me push back on that for a second. Unmodified? Because standard Postgres doesn't have native distributed consensus protocols built in, like Raft or Paxos, right?

SPEAKER_01

Right. No, you are spot on. It doesn't have that natively.

SPEAKER_00

So if it's not modifying the core engine, how is it suddenly resolving complex HA failures then?

SPEAKER_01

Ah. So that is where the architecture really shines. Multigres intelligently layers generalized consensus protocols on top of the database. It basically builds a management layer that watches the replication state from the outside.

SPEAKER_00

Oh, I get it. So if there is a split-brain scenario where two nodes think they are the primary, this external layer just steps in.

SPEAKER_01

Exactly. It resolves it by routing traffic and reconciling commits. So you get consensus without needing a proprietary rewritten version of Postgres.

SPEAKER_00

That makes total sense. You're getting the distributed logic without sacrificing the pure core. Now, keeping the state safe is only half the battle, right? But managing the flood of customer traffic is the other half. And actually, speaking of managing tech complexity, I need to mention today's sponsor real quick.

SPEAKER_01

Oh, right, Embersilk.

SPEAKER_00

Yeah. This podcast is sponsored by Embersilk. If you need help with AI training or automation integration or software development, they are fantastic. If you're trying to uncover where agents can make the most impact for your business or even your personal life, definitely check out EmberSilk.com for your AI needs.

SPEAKER_01

Yeah, they do really great work.

SPEAKER_00

So jumping back to the traffic flood, how does Multigres handle connection pooling? Because usually, you know, you just slap PgBouncer on there and just kind of hope for the best.

SPEAKER_01

Right. Well, Multigres takes a much more integrated approach. It uses a two-service pooling architecture. There is a multigateway that routes incoming queries and a multipooler for the back end. And the real secret sauce here is this context-aware pooling.

SPEAKER_00

Context-aware? Wait, so it's actually parsing the SQL queries? It's not just blindly forwarding network packets?

SPEAKER_01

Yes, exactly. It parses the queries to understand the intent. So if a request needs a stateful connection, like it's opening a transaction, the gateway seamlessly pins that specific connection to the backend.
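
Sketch (not from the episode): the pinning idea can be shown with a toy router that looks only at the first keyword of each statement. The SessionRouter class is hypothetical. A real gateway would use a full SQL parser and track more kinds of session state; this version only recognizes transaction boundaries.

```python
class SessionRouter:
    """Toy per-client router: pins the backend while a transaction is open."""

    def __init__(self):
        self.pinned = False

    def route(self, sql: str) -> str:
        tokens = sql.strip().rstrip(";").split()
        keyword = tokens[0].upper() if tokens else ""

        # BEGIN / START TRANSACTION open a transaction: pin from here on.
        if keyword in ("BEGIN", "START"):
            self.pinned = True
            return "pinned"

        # ROLLBACK TO SAVEPOINT stays inside the transaction, so it does not unpin.
        rollback_to = (
            keyword == "ROLLBACK" and len(tokens) > 1 and tokens[1].upper() == "TO"
        )
        if keyword in ("COMMIT", "END", "ABORT", "ROLLBACK") and not rollback_to:
            was_pinned = self.pinned
            self.pinned = False
            # The closing statement itself still runs on the pinned connection.
            return "pinned" if was_pinned else "pooled"

        return "pinned" if self.pinned else "pooled"


router = SessionRouter()
decisions = [
    router.route(sql)
    for sql in (
        "SELECT 1",
        "BEGIN",
        "UPDATE accounts SET balance = balance - 10 WHERE id = 1",
        "COMMIT;",
        "SELECT 2",
    )
]
```

Statements outside a transaction can be served by any pooled connection, while everything between BEGIN and COMMIT stays on one backend connection. The client never has to choose a pooling mode.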

SPEAKER_00

Oh, that is so smart.

SPEAKER_01

And it goes a step further by actually deduplicating prepared statements across gateways, which massively reduces overhead on the database.
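
Sketch (not from the episode): deduplicating prepared statements can be pictured as a registry keyed by the exact SQL text, so identical statements from many clients map to a single backend statement. The PreparedStatementRegistry class and the stmt_ naming scheme are hypothetical, and the sketch models one shared registry rather than coordination across gateways.

```python
import hashlib


class PreparedStatementRegistry:
    """Maps identical SQL text to one shared backend statement name."""

    def __init__(self):
        self._by_sql = {}
        self.backend_prepares = 0  # how many times the backend had to PREPARE

    def prepare(self, sql: str) -> str:
        name = self._by_sql.get(sql)
        if name is None:
            # First time this exact text is seen: prepare it once on the backend.
            name = "stmt_" + hashlib.sha256(sql.encode("utf-8")).hexdigest()[:12]
            self._by_sql[sql] = name
            self.backend_prepares += 1
        return name


registry = PreparedStatementRegistry()
first = registry.prepare("SELECT * FROM users WHERE id = $1")
second = registry.prepare("SELECT * FROM users WHERE id = $1")  # another client
other = registry.prepare("SELECT * FROM orders WHERE user_id = $1")
```

The key is the exact statement text on purpose: normalizing whitespace would risk merging statements that differ inside string literals.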

SPEAKER_00

Okay, so it is basically like a brilliant maître d', like someone who knows exactly what the customers are going to order before they sit down and just groups them perfectly at the right tables.

SPEAKER_01

I love that analogy. Yes, it's exactly like that. And once your traffic is flowing perfectly, you need an insurance policy. So Multigres tightly integrates pgBackRest for automated backups and bootstrapping.

SPEAKER_00

Which means basically no human intervention if you have to bring up a new cluster, right?

SPEAKER_01

Exactly. It just identifies a primary and runs the backup. Now, it is important to note that version 0.1 is currently single-shard, but horizontal sharding is on the horizon. And that is really the flagship feature here.

SPEAKER_00

Right. And that is exactly where the "Vitess for Postgres" comparison comes in. Horizontal sharding means taking a massive database and transparently splitting it across servers.

SPEAKER_01

Precisely. And the application still queries it like it's just one giant infinite database. Once sharding drops, the capacity for scale becomes practically limitless.
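
Sketch (not from the episode): since sharding has not shipped in version 0.1, this shows only the general technique of hash-based shard routing, not how Multigres will implement it. The shard names and the shard_for function are hypothetical.

```python
import hashlib

SHARDS = ["shard-0", "shard-1", "shard-2", "shard-3"]  # hypothetical shard names


def shard_for(sharding_key: str, shards=SHARDS) -> str:
    """Pick a shard deterministically from a hash of the row's sharding key."""
    digest = hashlib.sha256(sharding_key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


# The application sends one logical query; a routing layer picks the shard.
target = shard_for("customer-42")
```

A plain modulo over the shard count forces most keys to move when shards are added. Production systems typically use range-based or consistent hashing schemes to limit that movement.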

SPEAKER_00

Which is just incredible. And actually, that leads me to a thought I want to leave you with today. You know, if tools like Multigres can completely automate database state, routing, and eventually sharding, imagine what other digital bottlenecks are about to just disappear entirely.

SPEAKER_01

It is a really, really exciting time to be building software for sure.

SPEAKER_00

It really is. The future of tech is just so bright. And we are finally freeing human creativity from all this manual maintenance. Hey, if you enjoyed this podcast, please subscribe to the show and leave us a five-star review if you can. It really does help get the word out. Thanks for tuning in.

SPEAKER_01

Thanks for hanging out with us, everyone.

SPEAKER_00

We will catch you on the next deep dive.