Working with public REST API's
Editor's note: by combining the read_json capability of DuckDB with the ability to retrieve arbitrary https URLs, you can easily use DuckDB to read data from public and unprotected APIs. Note that you can't currently pass arbitrary HTTP headers to access protected APIs.
Query JSON from a public REST APISQL
-- DuckDB can directly query a public REST API returning a JSON -- Query the "tvmaze.com" API for the TV show "South Park" SELECT name, type, summary FROM read_json('https://api.tvmaze.com/singlesearch/shows?q=South%20Park', auto_detect=true, format='newline_delimited');Copy code
Format the response from a public REST API SQL
-- Format the response from a public REST API -- Retrieve a list of temperature readings for Sydney, Australia -- The returned payload has hourly temps for 24 hours -- We can use JSONPath to extract data from a JSON documemnt -- to find the most recent temperature SELECT json_extract(hourly, '$.temperature_2m[1]') from read_json('https://api.open-meteo.com/v1/forecast?latitude=-33.8678&longitude=151.2073&hourly=temperature_2m&forecast_days=1', auto_detect=true, format='newline_delimited');Copy code
Expand
Share link
Discretize numeric column following specified thresholdsSQL
discretize() converts a numeric column into discrete ordered ids, taking into account a list of thresholds.
Execute this SQL
CREATE OR REPLACE MACRO discretize(v, l) AS ( WITH t1 AS ( SELECT unnest(list_distinct(l)) as j ), t2 AS ( SELECT COUNT(*) + 1 c FROM t1 WHERE try_cast(j AS float) <= v ) FROM t2 SELECT IF(v IS NULL, NULL, c) ) ; --Usage FROM 'https://raw.githubusercontent.com/thewiremonkey/factbook.csv/master/data/c2127.csv' SELECT name, value, discretize(value, [2,3,4,5]) AS class ;Copy code
Expand
Share link
Execute this SQL
-- First, transform TBL (or any relation) -- into a single column of structs. with structs (from TBL select TBL) -- Then pack those structs into a list. from structs select list(TBL) as ready_to_plot;Copy code
Copy code
Expand
Share link
Execute this SQL
/* removes duplicate rows at the order_id level */ SELECT * FROM orders QUALIFY row_number() over (partition by order_id order by created_at) = 1Copy code
Copy code
Expand
Share link
Show supported DuckDB extensionsSQL
Editor's note: DuckDB has many supported extensions for everything from data formats (JSON, parquet, excel, iceberg) to specific types (IP addresses, time zones) to indexing (full-text-search) and more. This table function will tell you which extensions are supported in your local DuckDB install.
Execute this SQL
-- show supported duckdb extensions FROM duckdb_extensions();Copy code
Expand
Share link
ATTACH 'other.db';SQL
Editor's note: DuckDB allows you to attach multiple databases at once. For example, you can attach a local file, an in-memory database and a database from MotherDuck and work with all of them simultaneously. The ATTACH statement is executed for each database to be attached.
Execute this SQL
-- attach another database, alias inferred from the name ("other") ATTACH 'other.db'; SELECT * FROM other.some_table;Copy code
Expand
Share link
Query Parquet data in S3Bash
Editor's note: DuckDB users often work with files in Parquet format, which has become a standard for representing data in data lakes. While DuckDB lets you work with local Parquet files, you can also use files stored in blob storage such as Amazon AWS S3, Azure Blob Storage and Google Cloud Storage.
Execute this Bash
# Assuming you have the following environment variables defined: # AWS_ACCESS_KEY_ID # AWS_SECRET_ACCESS_KEY # AWS_DEFAULT_REGION duckdb -c 'LOAD httpfs; SELECT count(*) FROM read_parquet("s3://<bucket>/<prefix>/*.parquet");'Copy code
Expand
Share link
No more error on on end of line commasSQL
Editor's note: this query demonstrates default behavior for DuckDB, but boy does it make it easier to comment your SQL lines out without fail.
Execute this SQL
# π€ Courtesy of Michael Simons (aka. @rotnroll666) # π¦ https://twitter.com/rotnroll666/status/1671066790368010241 SELECT foo, bar, # hello, world, # dummy, FROM bazbar;Copy code
Expand
Share link