The use case is local user DuckDB talking to MotherDuck for $.
Commercially, this is not a terrible idea. Why keep paying Snowflake for a bog-standard SQL query workload when Snowflake itself makes it easy to migrate to Iceberg and commodity engines like MotherDuck?
Hello, DuckDB DevRel here. Quack is independent from MotherDuck. MotherDuck has its own proprietary protocol, which has been around for years and supports things like dual execution – see more here: https://duckdb.org/quack/faq#what-is-the-relationship-betwee...
Of course, in the future MotherDuck can also support Quack, but this is not the only interesting use case for Quack.
Sure! Not knocking the architecture: building out peer-to-peer federation in place of client/server makes perfect sense for DuckDB. And I’m a big fan of owning the protocol so you can optimize it for internal structures.
Just making the point that DuckDB is disruptive technology & what it’s most likely to disrupt.
Uh, doing analytics-type queries on large datasets that Postgres would choke on, as an RPC? I'm using it (DuckLake specifically) to build a lakehouse RPC server that can scale horizontally based on resource utilization in k8s.
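To make that concrete, here is a minimal sketch of what such a read-only lakehouse RPC endpoint could look like in Python, assuming the documented DuckLake ATTACH syntax. The /query route, catalog file, data path, and port are made up for illustration; a real deployment would add auth, connection handling, and the horizontal-scaling machinery mentioned above.

    import json
    from http.server import BaseHTTPRequestHandler, HTTPServer

    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL ducklake")
    con.execute("LOAD ducklake")
    # Catalog file and data path are placeholders; in the setup described above the
    # data path would be an S3 prefix and the pod would need credentials for it.
    con.execute("ATTACH 'ducklake:catalog.ducklake' AS lake (DATA_PATH 'lake_files/')")

    class QueryHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/query":
                self.send_error(404, "POST SQL text to /query")
                return
            sql = self.rfile.read(int(self.headers.get("Content-Length", 0))).decode()
            try:
                rows = con.execute(sql).fetchall()   # runs against the attached DuckLake catalog
                body, status = json.dumps(rows, default=str).encode(), 200
            except Exception as exc:                 # surface SQL errors to the caller
                body, status = str(exc).encode(), 400
            self.send_response(status)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("0.0.0.0", 8080), QueryHandler).serve_forever()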
Our data pipeline produces .duckdb files that our app downloads (it watches the asset in S3 and pulls when the ETag changes). Makes it easy to get BigQuery/ClickHouse-like performance without running or paying for that infrastructure. Not perfect for all cases, but it handles a lot more than you would expect.
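A rough sketch of that watch-the-ETag-and-pull loop, assuming boto3; the bucket, key, local path, and 60-second poll interval are all placeholders.

    import time

    import boto3
    import duckdb

    s3 = boto3.client("s3")
    BUCKET, KEY, LOCAL = "my-bucket", "exports/latest.duckdb", "/tmp/latest.duckdb"

    con, last_etag = None, None
    while True:
        etag = s3.head_object(Bucket=BUCKET, Key=KEY)["ETag"]
        if etag != last_etag:
            if con is not None:
                con.close()                           # drop the handle on the stale copy
            s3.download_file(BUCKET, KEY, LOCAL)      # pull the new build of the file
            con = duckdb.connect(LOCAL, read_only=True)
            last_etag = etag
        time.sleep(60)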
I think that Quack will become the primary option for a DuckLake catalog in the future, for several reasons. To list a few:
1. No type mismatches for inlining. If you use a non-DuckDB catalog, many types do not have a 1:1 mapping, which introduces additional overhead when operating on those data types.
2. You get the raw performance of DuckDB analytics (and now transactions) over the catalog. DuckDB reading DuckDB is simply faster than any of our Postgres/SQLite scanners.
3. No round-trip for retries. We can easily(tm) run the full retry logic on the DuckDB server side. Right now, these retries trigger multiple round trips for Postgres, making it a performance bottleneck for high-contention workloads.
Disclaimer: I'm a duckdb/ducklake developer.
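For reference, the two catalog flavours being compared look roughly like this, using the documented DuckLake ATTACH syntax; the catalog file, data path, and Postgres host/dbname are placeholders, and the Postgres variant also needs the postgres extension and a reachable server.

    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL ducklake")
    con.execute("LOAD ducklake")

    # Catalog in a plain DuckDB file: no type-mapping layer between catalog and engine.
    con.execute("ATTACH 'ducklake:catalog.ducklake' AS lake_duck (DATA_PATH 'lake_files/')")

    # Catalog in Postgres: every metadata lookup (and retry) is a network round trip.
    con.execute("""
        ATTACH 'ducklake:postgres:dbname=ducklake host=catalog.internal' AS lake_pg
            (DATA_PATH 'lake_files/')
    """)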
Does this mean I can finally connect to a DuckLake instance hosted remotely? I.e., DuckLake is writing to disk on the remote server and my client is just a client.
Because right now, even with Postgres as the catalog, I need access to the underlying storage to use DuckLake.
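A small sketch of what that looks like today, assuming the documented DuckLake and DuckDB secrets syntax (all hosts, names, and credentials below are placeholders): the client has to carry its own object-storage credentials because it scans the data files directly, even though the catalog lives in Postgres.

    import duckdb

    con = duckdb.connect()
    con.execute("INSTALL ducklake")
    con.execute("LOAD ducklake")

    # The client must be able to read the Parquet files directly...
    con.execute("""
        CREATE SECRET lake_s3 (TYPE s3, KEY_ID 'AKIA...', SECRET '...', REGION 'eu-west-1')
    """)

    # ...because attaching via the Postgres catalog only resolves metadata; scans go
    # straight from this process to the DATA_PATH, not through any server.
    con.execute("""
        ATTACH 'ducklake:postgres:dbname=ducklake host=catalog.internal' AS lake
            (DATA_PATH 's3://my-bucket/lake/')
    """)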
With DuckLake this scales well to multi-terabyte data sets. The big benefit of this server protocol is sharing a high-memory server and taking advantage of a shared cache for recent data.
> It would be rather misguided not to build a database protocol on top of HTTP in 2026
This is wrong: HTTP is bad for transferring large amounts of data, and it is also bad for streaming.
It is bad for large amounts of data because you have timeout issues on some clients, you hit request/response size limits, etc.
It is obviously bad for streaming as there is no concept of streaming in it.
It is comical to take the path of least resistance so lazy people can put a reverse proxy on top of it, and then say HTTP is the only relevant way to do it in 2026.
The benchmark doesn't seem to mean much, as TCP can max out at 50 GB/s on a single thread. Pretty sure it can do even more than that. So you could be using anything that isn't terrible and still get max performance out of this.
Also, the protocol is separate from the format. For example, if you transfer the same MP4 over FTP and over HTTP, you can compare that.
If you are transferring different things over different protocols, then the comparison means nothing.
The benchmark graph for bulk transfer should show more granularity so it is possible to understand what percentage of the hardware limit it is reaching, similar to how BLAS GEMM routines are benchmarked as a percentage of the hardware's theoretical peak FLOPS.
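One crude way to get that kind of number: measure the raw loopback TCP ceiling on the same machine and report the protocol's bulk-transfer result as a percentage of it. Everything below (block size, duration, port, and the example protocol figure) is arbitrary.

    import socket
    import threading
    import time

    BLOCK = b"\0" * (16 * 1024 * 1024)    # 16 MiB per send, arbitrary
    PORT = 50007

    def sink():
        # Accept one connection and discard everything it sends.
        srv = socket.create_server(("127.0.0.1", PORT))
        conn, _ = srv.accept()
        while conn.recv(1 << 20):
            pass

    threading.Thread(target=sink, daemon=True).start()
    time.sleep(0.2)                        # give the listener a moment to start

    c = socket.create_connection(("127.0.0.1", PORT))
    sent, start = 0, time.time()
    while time.time() - start < 5:         # pump data for ~5 seconds
        c.sendall(BLOCK)
        sent += len(BLOCK)
    elapsed = time.time() - start
    c.close()

    tcp_gbps = sent / elapsed / 1e9
    protocol_gbps = 3.0                    # plug in the measured protocol throughput here
    print(f"loopback TCP ceiling: {tcp_gbps:.1f} GB/s")
    print(f"protocol reaches {100 * protocol_gbps / tcp_gbps:.0f}% of that ceiling")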
> 60 million rows (76 GB in CSV format!)
This reads a bit disingenuous.
It is disappointing to see this instead of something like the PostgreSQL protocol with support for a columnar format.
I can't think of many use cases for this and Arrow Flight, other than moving data around.
> Not yet, but we are working on it!
Seems like a niche use case, but it's the one I'm most interested in.
Our lakehouse uses DuckLake with Postgres as the catalog. Seems like a DuckDB / Quack catalog would be an excellent alternative.
So you'll be able to test it in a few days.
I can definitely see exploring this for some homelab use.