β€Œ

Working with public REST API's

Editor's note: by combining the read_json capability of DuckDB with the ability to retrieve arbitrary https URLs, you can easily use DuckDB to read data from public and unprotected APIs. Note that you can't currently pass arbitrary HTTP headers to access protected APIs.

Query JSON from a public REST APISQL

-- DuckDB can directly query a public REST API returning a JSON 
-- Query the "tvmaze.com" API for the TV show "South Park"

SELECT name, type, summary
FROM read_json('https://api.tvmaze.com/singlesearch/shows?q=South%20Park', 
auto_detect=true, format='newline_delimited');

Copy code

Format the response from a public REST API SQL

-- Format the response from a public REST API 
-- Retrieve a list of temperature readings for Sydney, Australia 
-- The returned payload has hourly temps for 24 hours
-- We can use JSONPath to extract data from a JSON documemnt 
-- to find the most recent temperature

SELECT json_extract(hourly, '$.temperature_2m[1]')
from read_json('https://api.open-meteo.com/v1/forecast?latitude=-33.8678&longitude=151.2073&hourly=temperature_2m&forecast_days=1', auto_detect=true, format='newline_delimited');

Copy code

Simon Aubury

Expand

Share link

β€Œ

Discretize numeric column following specified thresholdsSQL

discretize() converts a numeric column into discrete ordered ids, taking into account a list of thresholds.

Execute this SQL

CREATE OR REPLACE MACRO discretize(v, l) AS (
	WITH t1 AS (
		SELECT unnest(list_distinct(l)) as j
	), t2 AS (
		SELECT COUNT(*) + 1 c FROM t1 
	  WHERE try_cast(j AS float) <= v
	) FROM t2 
	SELECT IF(v IS NULL, NULL, c) 
) ;

--Usage
FROM 'https://raw.githubusercontent.com/thewiremonkey/factbook.csv/master/data/c2127.csv'
SELECT name, value, discretize(value, [2,3,4,5]) AS class ;

Copy code

Éric Mauvière

Expand

Share link

Execute this SQL

-- First, transform TBL (or any relation) 
-- into a single column of structs.
with structs (from TBL select TBL)
-- Then pack those structs into a list.
from structs select list(TBL) as ready_to_plot;

Copy code

Hamilton Ulmer

Copy code

Expand

Share link

Execute this SQL

/* removes duplicate rows at the order_id level */
SELECT * FROM orders
QUALIFY row_number() over (partition by order_id order by created_at) = 1

Copy code

Octavian Zarzu

Copy code

Expand

Share link

β€Œ

Show supported DuckDB extensionsSQL

Editor's note: DuckDB has many supported extensions for everything from data formats (JSON, parquet, excel, iceberg) to specific types (IP addresses, time zones) to indexing (full-text-search) and more. This table function will tell you which extensions are supported in your local DuckDB install.

Execute this SQL

-- show supported duckdb extensions
FROM duckdb_extensions();

Copy code

Mehdi Ouazza

Expand

Share link

β€Œ

ATTACH 'other.db';SQL

Editor's note: DuckDB allows you to attach multiple databases at once. For example, you can attach a local file, an in-memory database and a database from MotherDuck and work with all of them simultaneously. The ATTACH statement is executed for each database to be attached.

Execute this SQL

-- attach another database, alias inferred from the name ("other")
ATTACH 'other.db';
SELECT * FROM other.some_table;

Copy code

Carlo Piovesan

Expand

Share link

β€Œ

Query Parquet data in S3Bash

Editor's note: DuckDB users often work with files in Parquet format, which has become a standard for representing data in data lakes. While DuckDB lets you work with local Parquet files, you can also use files stored in blob storage such as Amazon AWS S3, Azure Blob Storage and Google Cloud Storage.

Execute this Bash

# Assuming you have the following environment variables defined:
# AWS_ACCESS_KEY_ID
# AWS_SECRET_ACCESS_KEY
# AWS_DEFAULT_REGION
duckdb -c 'LOAD httpfs; SELECT count(*) FROM read_parquet("s3://<bucket>/<prefix>/*.parquet");'

Copy code

Damon

Expand

Share link

β€Œ

No more error on on end of line commasSQL

Editor's note: this query demonstrates default behavior for DuckDB, but boy does it make it easier to comment your SQL lines out without fail.

Execute this SQL

# πŸ€“ Courtesy of Michael Simons (aka. @rotnroll666)
# 🐦 https://twitter.com/rotnroll666/status/1671066790368010241

SELECT foo,
    bar,
    # hello,
    world,
    # dummy,
FROM bazbar;

Copy code

SALES

Expand

Share link