Request coalescing

When a dashboard refreshes, ten users open the same report at 9am, or a CI matrix fans out identical dbt queries, Snowflake sees N copies of the same query — each one a candidate to resume a warehouse and burn credits. chukei collapses them into one execution and fans the single result back out.

One execution, many waiters

time --->

Client A --Q(F)-->+
                  │  chukei runs Q once -> Snowflake
Client B --Q(F)-->+  (B and C join the in-flight request,
                  │   so the warehouse runs Q just once)
Client C --Q(F)-->+

                  + <------- single result for Q

                  + --> result to A
                  + --> result to B (no extra cost)
                  + --> result to C (no extra cost)

Coalescing keys on the hard fingerprint: while a query with fingerprint F is in flight to Snowflake, any further request with the same F attaches to the pending execution instead of issuing its own. When the upstream result returns, every waiter is served from it.
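
The attach-to-pending behaviour described above can be sketched in a few lines of Python. This is a minimal illustration of the general pattern (often called "single flight"), not chukei's actual implementation: the `Coalescer` class, its `run` method, and the `fake_snowflake_query` stand-in are all hypothetical names, and the fingerprint is assumed to be an already-computed string.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class Coalescer:
    """Hypothetical sketch: one execution per in-flight fingerprint."""

    def __init__(self) -> None:
        # fingerprint -> task for the execution currently in flight
        self._in_flight: Dict[str, asyncio.Task] = {}

    async def run(self, fingerprint: str,
                  execute: Callable[[], Awaitable[Any]]) -> Any:
        task = self._in_flight.get(fingerprint)
        if task is None:
            # First request with this fingerprint: start the upstream execution.
            task = asyncio.ensure_future(execute())
            self._in_flight[fingerprint] = task
            # Drop the entry when the execution finishes, so later requests
            # start a fresh one (the window is the in-flight duration only).
            task.add_done_callback(
                lambda _t: self._in_flight.pop(fingerprint, None))
        # shield() keeps one cancelled waiter from cancelling the shared task.
        return await asyncio.shield(task)


async def demo() -> tuple:
    calls = 0

    async def fake_snowflake_query() -> list:
        # Stand-in for the real upstream call (assumption for illustration).
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.05)
        return [("row", 1)]

    coalescer = Coalescer()
    # Clients A, B and C send the same fingerprint F at the same time.
    results = await asyncio.gather(
        *(coalescer.run("F", fake_snowflake_query) for _ in range(3)))
    return calls, results


if __name__ == "__main__":
    calls, results = asyncio.run(demo())
    print("upstream executions:", calls)
    print("results delivered:", len(results))
```

In this sketch, an upstream error propagates to every waiter, since they all await the same task. Whether a real proxy should instead retry per waiter is a design choice the sketch does not settle.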

Coalescing vs caching

Coalescing and caching are complementary, and both are subject to the determinism gate:

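|                       | Caching                    | Coalescing                              |
|-----------------------|----------------------------|-----------------------------------------|
| Targets               | repeated queries over time | identical queries at the same instant   |
| Saves                 | the whole execution        | the duplicate concurrent executions     |
| Window                | until invalidation         | the in-flight duration only             |
| Needs a prior result? | yes                        | no (the first one is still running)     |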

Coalescing captures the savings that caching can't: the first time a herd of identical queries arrives, there is no cache entry yet, but there is still only one real query worth running. This is a major lever for BI dashboard costs, where many viewers hit the same tiles simultaneously.
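
To make the division of labour concrete, here is a rough sketch of a lookup path that tries the cache first, then an in-flight execution, and only then runs the query. The ordering, the `CachedCoalescer` class, and its `stats` counters are assumptions made for illustration; invalidation is left out entirely, and none of this is taken from chukei's code.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class CachedCoalescer:
    """Hypothetical sketch: result cache plus in-flight coalescing."""

    def __init__(self) -> None:
        self.cache: Dict[str, Any] = {}               # finished results
        self.in_flight: Dict[str, asyncio.Task] = {}  # running executions
        self.stats = {"cache_hit": 0, "coalesced": 0, "executed": 0}

    async def get(self, fingerprint: str,
                  execute: Callable[[], Awaitable[Any]]) -> Any:
        if fingerprint in self.cache:
            # Caching: a prior result exists, so nothing runs at all.
            self.stats["cache_hit"] += 1
            return self.cache[fingerprint]

        task = self.in_flight.get(fingerprint)
        if task is not None:
            # Coalescing: no result yet, but an execution is already running.
            self.stats["coalesced"] += 1
            return await asyncio.shield(task)

        # Neither applies: this request pays for the one real execution.
        self.stats["executed"] += 1
        task = asyncio.ensure_future(execute())
        self.in_flight[fingerprint] = task

        def on_done(done: asyncio.Task) -> None:
            self.in_flight.pop(fingerprint, None)
            # Only successful results are cached in this sketch.
            if not done.cancelled() and done.exception() is None:
                self.cache[fingerprint] = done.result()

        task.add_done_callback(on_done)
        return await asyncio.shield(task)


async def herd_then_repeat() -> dict:
    async def slow_query() -> str:
        await asyncio.sleep(0.05)  # stand-in for warehouse latency
        return "result"

    cc = CachedCoalescer()
    # A herd of three identical queries arrives before any result exists.
    await asyncio.gather(*(cc.get("F", slow_query) for _ in range(3)))
    # A later repeat of the same query finds the finished result.
    await cc.get("F", slow_query)
    return cc.stats


if __name__ == "__main__":
    print(asyncio.run(herd_then_repeat()))
```

In this run, the herd costs one execution with two coalesced waiters, and the later repeat is a cache hit. Without the in-flight map, the herd would have cost three executions, because the cache was still empty when all three arrived.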

Safety

Only determinism-gate-clean reads are coalesced — writes and non-deterministic queries each execute independently, and if anything is uncertain the requests fail open and run separately. Coalescing never changes a result; it only removes redundant executions of an identical one.
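
The fail-open rule can be illustrated with a deliberately conservative eligibility check. The source does not describe how the determinism gate works, so everything here is an assumption: `is_coalescable` is a hypothetical helper, the keyword list is a small illustrative sample of Snowflake functions whose output varies between calls, and a real gate would presumably parse the SQL rather than match strings.

```python
import re

# Illustrative sample only; not an exhaustive or authoritative list.
NON_DETERMINISTIC = ("CURRENT_TIMESTAMP", "CURRENT_DATE", "RANDOM", "UUID_STRING")


def is_coalescable(sql: str) -> bool:
    """Return True only when the statement is clearly a deterministic read.

    Any doubt, including an unexpected error, yields False, which means
    the request simply runs on its own (fail open).
    """
    try:
        text = sql.strip().upper()
        # Anything that is not a plain SELECT is treated as not coalescable.
        # This also rejects CTEs ("WITH ... SELECT"), which is conservative.
        if not text.startswith("SELECT"):
            return False
        # A known non-deterministic function disqualifies the query.
        if any(re.search(rf"\b{name}\b", text) for name in NON_DETERMINISTIC):
            return False
        return True
    except Exception:
        # Uncertain input: do not coalesce.
        return False


if __name__ == "__main__":
    for statement in (
        "SELECT region, SUM(amount) FROM sales GROUP BY region",
        "INSERT INTO sales VALUES (1, 'emea', 10)",
        "SELECT CURRENT_TIMESTAMP()",
    ):
        print(is_coalescable(statement), "<-", statement)
```

The important property is the direction of the error: a false negative only costs a duplicate execution, while a false positive could hand one client another execution's result, so every uncertain branch returns `False`.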