build-your-own SUMMARIZE with columns(*), unpivot, and struct.*SQL
Editor's note: Although SUMMARIZE() is fully supported in DuckDB, Hamilton shows you how to build this capability yourself using more primitive functions.
Execute this SQL
-- C = number of columns in TBL, S = number of aggregates -- ------------------------------------------------------ -- (1) use columns(*) to produce S aggregates -- for each column as a single struct, -- producing 1 struct row and C columns with aggregate as ( from sf311 select { name: first(alias(columns(*))), type: first(typeof(columns(*))), max: max(columns(*))::varchar, min: min(columns(*))::varchar, approx_unique: approx_count_distinct(columns(*)), nulls: count(*) - count(columns(*)) } ), -- (2) unpivot to get a result with C rows and 1 column of structs columns as (unpivot aggregate on columns(*)) -- (3) explode the struct to get a result with C rows and S columns select value.* from columns;
Copy code
Hamilton Ulmer
Expand
Share link
Execute this SQL
-- First, transform TBL (or any relation) -- into a single column of structs. with structs (from TBL select TBL) -- Then pack those structs into a list. from structs select list(TBL) as ready_to_plot;
Copy code
Hamilton Ulmer
Copy code
Expand
Share link
Execute this SQL
with bits as ( select -- add 8 bits to the end to account for the delimiter bit_length(columns(*)) + 8 from <TABLE> ), -- aggregate all columns bits_agg as ( select sum(columns(*)) from bits ), -- unpivot a wide single row of aggs to single column bits_col as ( unpivot bits_agg on columns(*) ) -- add them all up & convert to mb select sum(value) / (8 * 1024 ** 2) as mb from bits_col
Copy code
Hamilton Ulmer
Copy code
Expand
Share link