
Partition Pruning in Snowflake (a.k.a. micro-partition elimination)

What it is

Snowflake automatically splits tables into micro-partitions (~50–500MB uncompressed) and stores rich metadata per partition (min/max per column, null counts, etc.). When your query has selective filters, Snowflake can skip entire micro-partitions whose metadata shows they can’t match—this is partition pruning.

Less data scanned → faster queries → lower cost.
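As a minimal sketch (the `orders` table and its columns are hypothetical), a query with a selective filter on a well-clustered column is exactly the shape that lets Snowflake prune:

```sql
-- Only micro-partitions whose [min, max] metadata range for
-- order_date can contain this value are scanned; all others
-- are skipped without being read.
SELECT COUNT(*), SUM(amount)
FROM orders
WHERE order_date = '2024-06-01';
```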

Natural clustering vs. CLUSTER BY

  • By default, data lands with a natural clustering pattern (based on ingest order).
  • If your filters/join keys align with that natural order, pruning will be effective.
  • If not, you can define a CLUSTER BY key to improve data locality and pruning on specific columns.
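For illustration (table and column names are assumptions, not from the source), defining or changing a clustering key looks like this:

```sql
-- New large fact table, filtered mostly by order_date.
CREATE TABLE orders (
  order_id   NUMBER,
  order_date DATE,
  amount     NUMBER(12,2)
)
CLUSTER BY (order_date);

-- An existing table can be given (or re-given) a clustering key:
ALTER TABLE orders CLUSTER BY (order_date);
```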
How it works

  • Micro-partitions & metadata are created on load/ingest.
  • Query analysis: Snowflake matches your predicates (and some join conditions) against min/max ranges in metadata.
  • Pruning: Partitions whose ranges cannot satisfy the predicate are entirely skipped.
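To make the metadata matching concrete (partition boundaries and the table are invented for the example): suppose partition P1 holds `order_date` values in [2024-01-01, 2024-03-31] and partition P2 holds [2024-04-01, 2024-06-30], per the stored min/max metadata. Then:

```sql
-- P1's max (2024-03-31) is below the predicate's lower bound,
-- so its range cannot satisfy the filter: P1 is pruned entirely.
-- Only P2 is scanned.
SELECT *
FROM orders
WHERE order_date >= '2024-05-01';
```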
Benefits

  • Performance: fewer partitions scanned → faster queries.
  • Cost: less compute to process irrelevant data.
  • Scale: effective even with millions of micro-partitions.
Pro tips

  • Put selective predicates on columns that are naturally (or explicitly) clustered.
  • Consider CLUSTER BY (order_date) (or your main filter keys) for long-lived, large tables with predictable filters.
  • Use Query Profile to inspect partitions scanned vs. pruned.
  • Check clustering quality periodically (e.g., with SYSTEM$CLUSTERING_INFORMATION) and re-cluster when it degrades.
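Checking clustering quality is a one-liner with the built-in function mentioned above (the table and column here are placeholders):

```sql
-- Returns a JSON summary of clustering quality for the given
-- column(s): average clustering depth, partition-overlap
-- histogram, and notes. Higher depth/overlap = worse pruning.
SELECT SYSTEM$CLUSTERING_INFORMATION('orders', '(order_date)');
```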
TL;DR: Good predicates + good clustering = maximum pruning — less data scanned, faster queries, lower cost.