Open the HTML version of an EXPLAIN statementBash
DuckDB can output the plan in different ways, e.g. HTML or JSON. It's hard to easily view this when in the DuckDB CLI, but in non-interactive mode we can pass the result to a HTML file directly and open it.
Execute this Bash
duckdb my_complex_db.duckdb -c " EXPLAIN (ANALYZE, FORMAT html) INSERT INTO big_data.complex_db (key_to_update, values, event_time) SELECT key_to_update, values, event_time FROM big_data.new_data WHERE event_time >= '2026-01-01'::date " > output.html && open output.htmlCopy code
Expand
Share link
Fetch HackerNews Stories with read_json()
You can read JSON from any endpoint into DuckDB. This uses read_json() to fetch all HackerNews 'Who is hiring' stories and create a table with all stories and the comment IDs. It works by: - creating a list of URLs - reading the list and creating the table - querying the table
Create JSON URLsSQL
-- https://hn.algolia.com/api with urls as ( select 'https://hn.algolia.com/api/v1/search_by_date?tags=author_whoishiring&page=' || page_num || '&query=hiring?&attributesToRetrieve=title,tags,comment_text,created_at,object_id,story_id,story_title,children,story_text' as url from unnest(range(0,10)) as t(page_num) ) select array_to_json(list(url)) from urlsCopy code
Read JSON from API endpointSQL
create or replace table hn_stories as ( select hit.title as story_title, hit.story_text, hit.created_at, hit.objectId, hit.children, from read_json( -- your_url_list_from_above -- ), unnest(hits) as t(hit) )Copy code
Expand
Share link
Multiple String ReplacementSQL
A macro to easily run multiple string replacements iteratively without function nesting.
Execute this SQL
CREATE MACRO replace_many (string, replacements) AS list_reduce( replacements, lambda acc, replacement: [ -- We wrap this in a list for type safety replace(acc[1], replacement[1], replacement[2]) ], [string] )[1]; -- Usage example SELECT replace_many('Foo Bar', [['Foo', 'foofoo'], ['Bar', 'BarBar']]);Copy code
Expand
Share link
List Hive PartitionsSQL
Given Hive partitioned data, list all the keys, values and files. Handy to get a summary of the structure of Hive data.
Execute this SQL
WITH hive_cols AS ( PIVOT ( SELECT file.split('/').list_filter(lambda x: x.contains('=')).list_transform(lambda x: x.split('=')).unnest() AS kv, file FROM glob('/path/to/hive/**/*.parquet') ) ON kv[1] USING first(kv[2]) ) FROM hive_cols;Copy code
Expand
Share link
Top 50 Most Queried Domains from DNS A Records in PCAP AnalysisSQL
This query parses a PCAP file to identify the top 50 most requested domains from DNS A record queries. It filters DNS packets, extracts domains from the _ws.col.info field, normalizes them to lowercase, counts occurrences, and sorts the results by frequency in descending order.
Execute this SQL
SELECT LOWER(REGEXP_EXTRACT("_ws.col.info", ' A ([^ ]+)', 1)) AS domain, COUNT(*) AS total FROM read_pcap('dump.pcap') WHERE "frame.protocols" LIKE '%dns%' AND "_ws.col.info" LIKE '% A %' GROUP BY domain ORDER BY total DESC LIMIT 50;Copy code
Expand
Share link
Use method chaining to filter tables
This is just a simple example of how to implement method chaining using `list` and `unnest` without any other extensions to provide a functional interface.
ExampleSQL
SELECT list(t) .list_filter(lambda c: c.registration_date >= '2023-02-01') .list_transform(lambda c: struct_insert(c, is_north_american:=c.country in ('USA', 'Mexico', 'Canada'))) .list_filter(lambda c: c.is_north_american) .unnest(recursive:=true) FROM 'customers.csv' AS t;Copy code
customers.csvBash
customer_id,name,email,age,tier,country,registration_date,lifetime_value 1,Alice Johnson,alice.johnson@email.com,29,Premium,USA,2023-01-15,2847.50 2,Bob Smith,bob.smith@email.com,34,Standard,Canada,2023-02-03,1256.80 3,Charlie Brown,charlie.brown@email.com,42,Premium,UK,2023-01-22,3421.90 4,Diana Ross,diana.ross@email.com,38,Gold,USA,2023-03-10,5678.25 5,Eve Wilson,eve.wilson@email.com,26,Standard,Australia,2023-02-18,845.60 6,Frank Miller,frank.miller@email.com,31,Premium,Germany,2023-01-08,2156.75 7,Grace Lee,grace.lee@email.com,28,Standard,Canada,2023-03-05,967.40 8,Henry Davis,henry.davis@email.com,45,Gold,USA,2023-01-29,4532.80 9,Ivy Chen,ivy.chen@email.com,33,Premium,Singapore,2023-02-14,2890.45 10,Jack Wilson,jack.wilson@email.com,39,Standard,UK,2023-03-12,1134.20 11,Kate Brown,kate.brown@email.com,27,Premium,Australia,2023-01-18,2234.85 12,Leo Garcia,leo.garcia@email.com,36,Gold,Spain,2023-02-27,3789.60 13,Mia Taylor,mia.taylor@email.com,30,Standard,USA,2023-03-08,1089.95 14,Noah Kim,noah.kim@email.com,41,Premium,South Korea,2023-01-25,2567.30 15,Olivia Jones,olivia.jones@email.com,35,Gold,Canada,2023-02-11,4123.75 16,Paul Anderson,paul.anderson@email.com,32,Standard,UK,2023-03-15,876.40 17,Quinn Lee,quinn.lee@email.com,29,Premium,Singapore,2023-01-12,2678.95 18,Rachel Green,rachel.green@email.com,37,Gold,USA,2023-02-06,3956.80 19,Sam Wilson,sam.wilson@email.com,28,Standard,Australia,2023-03-01,1245.30 20,Tina Rodriguez,tina.rodriguez@email.com,44,Premium,Mexico,2023-01-31,2345.65Copy code
Expand
Share link