StackOverflow Analytics
Editor's note: Michael shares Stackoverflow data in MotherDuck as part of this snippet as well as typical aggregate analytics on the data. There are additional sample data sets attached by default in MotherDuck as the 'sample_data' share.
Which 5 questions have the most comments, what is the post title and comment countSQL
SELECT Title, CommentCount FROM posts WHERE PostTypeId = 1 ORDER BY CommentCount DESC LIMIT 5;
Copy code
User reputation and reputation rate per daySQL
SELECT name, reputation, round(reputation/day(today()-createdAt)) as rate, day(today()-createdAt) as days, createdAt FROM users WHERE reputation > 1000000 ORDER BY rate DESC;
Copy code
Reputation rate as bar chart with CTESQL
WITH top_users as ( SELECT name, reputation, round(reputation/day(today()-createdAt)) as rate, day(today()-createdAt) as days, createdAt FROM users WHERE reputation > 1000000 ) SELECT name, reputation, rate, bar(rate,150,300,35) AS bar FROM top_users;
Copy code
Post statistics per yearSQL
SELECT year(CreationDate) AS year, count(*), round(avg(ViewCount)), max(AnswerCount) FROM posts GROUP BY year ORDER BY year DESC LIMIT 10;
Copy code
Posting Frequency with bar chart on Weekdays for "sql" tagSQL
SELECT count(*) as freq, dayname(CreationDate) AS day, bar(freq, 0, 150000,20) AS plot FROM posts WHERE posttypeid = 1 AND tags LIKE '%<sql>%' GROUP BY all ORDER BY freq DESC;
Copy code
Posting Frequency with bar chart on Weekdays for "rust" tagSQL
SELECT count(*) as freq, dayname(CreationDate) AS day, bar(freq, 0, 10000,20) AS plot FROM posts WHERE posttypeid = 1 AND tags LIKE '%<rust>%' GROUP BY all ORDER BY freq DESC;
Copy code
Show less
Michael Hunger
Expand
Share link
Execute this SQL
with bits as ( select -- add 8 bits to the end to account for the delimiter bit_length(columns(*)) + 8 from <TABLE> ), -- aggregate all columns bits_agg as ( select sum(columns(*)) from bits ), -- unpivot a wide single row of aggs to single column bits_col as ( unpivot bits_agg on columns(*) ) -- add them all up & convert to mb select sum(value) / (8 * 1024 ** 2) as mb from bits_col
Copy code
Hamilton Ulmer
Copy code
Expand
Share link
Execute this SQL
CALL load_aws_credentials(); --- duckdb >= v0.9.0, AWS and httpfs extension loaded on the fly
Copy code
Carlo Piovesan
Copy code
Expand
Share link
Execute this SQL
with nested_json_list as ( select '["a", "b", "c"]'::json as foo ) select json(foo)::varchar[] from nested_json_list
Copy code
winnie
Copy code
Expand
Share link
Execute this SQL
-- Here we make use of list_transform and -- the fact that DuckDB knows lambda expressions SELECT list_transform(split('01:08:22', ':'), x -> CAST (x AS INTEGER));
Copy code
Michael Simons
Copy code
Expand
Share link
Execute this SQL
-- * EXCLUDE plus any additional expression does the trick SELECT * EXCLUDE (valid_from), -- valid_from used in computation year(valid_from) AS year -- A new expression FROM prices -- Note that the new expression can be referred to in the WHERE clause WHERE year BETWEEN 2019 AND 2020;
Copy code
Michael Simons
Copy code
Expand
Share link
Execute this SQL
-- bar(x, min, max[, width]) --- Draw a band whose width is proportional to (x - min) and equal to width characters when x = max. width defaults to 80. with data as (select unnest(range(1,10)) as v order by random()) select bar(v,minv,maxv,30) from data, (select min(v) as minv, max(v) as maxv from data); -- ┌────────────────────────────────┐ -- │ bar(v, minv, maxv, 30) │ -- │ varchar │ -- ├────────────────────────────────┤ -- │ ██████████████████████▌ │ -- │ ███▊ │ -- │ ██████████████████████████████ │ -- │ │ -- │ ███████████████ │ -- │ ███████████▎ │ -- │ ██████████████████▊ │ -- │ ██████████████████████████▎ │ -- │ ███████▌ │ -- └────────────────────────────────┘
Copy code
Michael Hunger
Copy code
Expand
Share link