Snowflake query caching
Snowflake query caching reduces compute cost when repeated reads can reuse previous work. Native Snowflake result caching helps, but dashboard and analytics workloads often miss it because SQL text, session state, bind values, timing, or invalidation rules differ across tools.
chukei adds verified proxy-side query caching for deterministic read workloads. It sits between clients and Snowflake, fingerprints safe reads, invalidates on writes, and samples cache hits against live Snowflake in blame mode.
Where native Snowflake result caching helps
Snowflake's result cache is strongest when the same user or workload repeats the same deterministic SQL shortly after the first run. It is useful, built in, and should stay enabled.
The limits show up with BI traffic:
- dashboards generate near-identical but not always exact SQL;
- different tools and users create different session context;
- query tags and bind values change;
- writes to the underlying tables invalidate cached results;
- cache observability is separated from team-level cost attribution.
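To illustrate the first limit, here is a minimal sketch of how two dashboard queries that differ only in case and whitespace can be mapped to one fingerprint. It is not chukei's actual fingerprinting scheme: the function names `normalize_sql` and `fingerprint` are hypothetical, and the normalization ignores SQL comments, session state, and bind values.

```python
import hashlib
import re

# Spans whose contents must be preserved byte for byte.
STRING_LITERAL = r"'(?:[^']|'')*'"
QUOTED_IDENTIFIER = r'"(?:[^"]|"")*"'
PROTECTED = re.compile(f"({STRING_LITERAL}|{QUOTED_IDENTIFIER})")


def normalize_sql(sql: str) -> str:
    """Collapse whitespace and upper-case everything outside quoted spans."""
    parts = PROTECTED.split(sql.strip().rstrip(";"))
    out = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # Odd indexes are the captured quoted spans: keep them unchanged.
            out.append(part)
        else:
            out.append(re.sub(r"\s+", " ", part).upper())
    return "".join(out).strip()


def fingerprint(sql: str) -> str:
    """Hash the normalized text so equivalent spellings share one key."""
    return hashlib.sha256(normalize_sql(sql).encode("utf-8")).hexdigest()


tool_a = "select region, sum(amount) from sales where region = 'emea' group by region"
tool_b = "SELECT region,\n  SUM(amount)\nFROM sales WHERE region = 'emea'\nGROUP BY region;"

print(tool_a == tool_b)                            # raw text differs
print(fingerprint(tool_a) == fingerprint(tool_b))  # normalized text matches
```

String literals stay untouched, so a filter on 'emea' and a filter on 'EMEA' still produce different fingerprints.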
What verified proxy caching adds
chukei does not blindly trust a cache key. A cacheable query must pass the determinism gate:
- read-only SQL only;
- no non-deterministic functions such as CURRENT_TIMESTAMP() or RANDOM();
- no chunked large-result responses;
- no table write since the cached result;
- no plugin uncertainty.
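The checks above can be sketched as a single function that returns a decision and a reason. This is an illustrative sketch under stated assumptions, not chukei's implementation: `QueryContext`, `determinism_gate`, the keyword sets, and the reason strings are all hypothetical, and a real gate would parse the SQL rather than scan tokens.

```python
import re
from dataclasses import dataclass

# Illustrative subset of functions whose output changes between runs.
NON_DETERMINISTIC = frozenset(
    {"CURRENT_TIMESTAMP", "CURRENT_DATE", "CURRENT_TIME", "RANDOM", "UUID_STRING"}
)
# Assumption: only statements starting with these keywords count as reads.
READ_ONLY_LEADING_KEYWORDS = frozenset({"SELECT", "WITH"})


@dataclass(frozen=True)
class QueryContext:
    sql: str
    tables_read: frozenset
    tables_written_since_cached: frozenset = frozenset()
    chunked_response: bool = False
    plugin_uncertain: bool = False


def determinism_gate(ctx: QueryContext) -> tuple:
    """Return (cacheable, reason). Any failed check rejects the query."""
    # Token scan is deliberately conservative: a banned name inside a string
    # literal also rejects the query, which costs a miss but never a bad hit.
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_$]*", ctx.sql.upper())
    if not tokens or tokens[0] not in READ_ONLY_LEADING_KEYWORDS:
        return (False, "not read-only")
    banned = sorted(NON_DETERMINISTIC.intersection(tokens))
    if banned:
        return (False, "non-deterministic function: " + banned[0])
    if ctx.chunked_response:
        return (False, "chunked large-result response")
    stale = sorted(ctx.tables_read & ctx.tables_written_since_cached)
    if stale:
        return (False, "table written since cached result: " + stale[0])
    if ctx.plugin_uncertain:
        return (False, "plugin uncertainty")
    return (True, "cacheable")


orders = frozenset({"ORDERS"})
print(determinism_gate(QueryContext("select count(*) from orders", orders)))
print(determinism_gate(QueryContext("select current_timestamp()", frozenset())))
```

Every check fails closed: when the gate cannot prove a query is safe, the query goes to Snowflake as usual.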
When the query is safe, chukei can serve a cached result and record avoided warehouse compute in the savings ledger. Blame mode continuously replays a sample of cache hits against live Snowflake; any mismatch evicts the entry and should be treated as a bug.
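The blame-mode loop can be sketched as follows. This is a hypothetical illustration rather than chukei's code: `BlameSampler`, `run_live`, and `sample_rate` are assumed names, the cache is a plain dictionary, and results are compared exactly, including row order, which is a simplification for queries without ORDER BY.

```python
import random
from typing import Callable, Dict, List, Optional, Tuple

Rows = List[Tuple]


class BlameSampler:
    """Replays a sample of cache hits against a live backend."""

    def __init__(
        self,
        cache: Dict[str, Rows],
        run_live: Callable[[str], Rows],  # hypothetical: executes SQL on Snowflake
        sample_rate: float,
        rng: Optional[random.Random] = None,
    ) -> None:
        if not 0.0 <= sample_rate <= 1.0:
            raise ValueError("sample_rate must be between 0 and 1")
        self.cache = cache
        self.run_live = run_live
        self.sample_rate = sample_rate
        self.rng = rng or random.Random()
        self.mismatches: List[str] = []

    def on_cache_hit(self, key: str, sql: str) -> str:
        """Return 'skipped', 'verified', or 'evicted' for this hit."""
        if self.rng.random() >= self.sample_rate:
            return "skipped"
        live_rows = self.run_live(sql)
        if live_rows == self.cache[key]:
            return "verified"
        # A mismatch means the cache served a wrong answer: evict and record it.
        del self.cache[key]
        self.mismatches.append(key)
        return "evicted"


cache = {"k1": [(1,)], "k2": [(2,)]}
sampler = BlameSampler(cache, run_live=lambda sql: [(1,)], sample_rate=1.0)
print(sampler.on_cache_hit("k1", "select 1"))
print(sampler.on_cache_hit("k2", "select 1"))
```

The recorded mismatches are what a team would investigate as bugs; the sample rate trades verification coverage against the extra live queries it costs.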
Best-fit workloads
| Workload | Fit | Reason |
|---|---|---|
| BI dashboards | High | Repeated reads and parameterized filters often create avoidable compute |
| Reporting jobs | High | Scheduled reports tend to repeat stable query shapes |
| dbt model reads | Medium | Useful when downstream reads repeat; writes still invalidate cache |
| Ad-hoc analysis | Medium | Helps repeated exploration, but misses on genuinely novel SQL |
| Large exports | Low | Chunked result transfers bypass the proxy and are not cached |
How to measure cache value
Start with a QUERY_HISTORY replay before deploying:
chukei replay --query-history queries.csv --output projection.json --evidence
Then compare:
- cache hit count;
- avoided warehouse credits;
- false-positive count (cache hits whose result differs from live Snowflake), which should stay zero;
- p99 proxy overhead;
- Snowflake warehouse metering before and after deployment.
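As a rough sketch of how the first four numbers could be computed from per-query records, the example below uses a hypothetical `QueryRecord` shape. It does not reflect the actual schema of projection.json, which is not described here; the field names and the nearest-rank percentile method are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class QueryRecord:
    # Hypothetical per-query fields, not the real projection.json schema.
    cache_hit: bool
    avoided_credits: float
    blame_mismatch: bool
    proxy_overhead_ms: float


def nearest_rank_percentile(values: Sequence[float], percent: int) -> float:
    """Nearest-rank percentile: the value at rank ceil(percent * n / 100)."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (percent * len(ordered) + 99) // 100  # integer ceiling division
    return ordered[max(rank, 1) - 1]


def summarize(records: List[QueryRecord]) -> Dict[str, float]:
    hits = [r for r in records if r.cache_hit]
    return {
        "cache_hits": len(hits),
        "hit_rate": len(hits) / len(records),
        "avoided_credits": sum(r.avoided_credits for r in hits),
        "false_positives": sum(1 for r in hits if r.blame_mismatch),
        "p99_overhead_ms": nearest_rank_percentile(
            [r.proxy_overhead_ms for r in records], 99
        ),
    }


# 100 synthetic records: the first 40 are hits worth 0.25 credits each.
records = [
    QueryRecord(i <= 40, 0.25 if i <= 40 else 0.0, False, float(i))
    for i in range(1, 101)
]
print(summarize(records))
```

Warehouse metering before and after deployment still has to come from Snowflake itself, since it is the independent check on the proxy's own savings figures.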