Working with public REST APIs
Editor's note: by combining DuckDB's read_json function with its ability to retrieve arbitrary HTTPS URLs, you can easily read data from public, unprotected APIs. Note that you can't currently pass arbitrary HTTP headers to access protected APIs.
Query JSON from a public REST API (SQL)
-- DuckDB can directly query a public REST API that returns JSON
-- Query the "tvmaze.com" API for the TV show "South Park"
SELECT name, type, summary
FROM read_json('https://api.tvmaze.com/singlesearch/shows?q=South%20Park', auto_detect=true, format='newline_delimited');
Format the response from a public REST API (SQL)
-- Format the response from a public REST API
-- Retrieve a list of temperature readings for Sydney, Australia
-- The returned payload has hourly temperatures for 24 hours
-- We can use JSONPath to extract data from a JSON document
-- Array indexes are zero-based, so [1] selects the second hourly reading
SELECT json_extract(hourly, '$.temperature_2m[1]')
FROM read_json('https://api.open-meteo.com/v1/forecast?latitude=-33.8678&longitude=151.2073&hourly=temperature_2m&forecast_days=1', auto_detect=true, format='newline_delimited');
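The JSONPath indexing used in the temperature query can be tried without a network call. The following is a minimal sketch that assumes a made-up payload (the three values are illustrative, not real Open-Meteo data) and shows that array positions in the path start at 0:

```sql
-- Hypothetical payload, invented for illustration only
SELECT
    json_extract(payload, '$.temperature_2m[0]') AS first_reading,   -- 18.2
    json_extract(payload, '$.temperature_2m[1]') AS second_reading   -- 17.9
FROM (
    SELECT '{"temperature_2m": [18.2, 17.9, 17.5]}'::JSON AS payload
);
```

Because the index is zero-based, a path ending in [1] returns the second element of the array rather than the first.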
City air quality insights based on WHO data
Editor's note: Mehdi's database share includes air quality data from the World Health Organization (WHO). Use his example queries to understand pollution in particular areas. You might even try combining them with the spatial extension discussed in other snippets.
Annual city air quality rating based on WHO data (SQL)
SELECT
    city,
    year,
    CASE
        WHEN AVG(pm25_concentration) <= 10
            AND AVG(pm10_concentration) <= 20
            AND AVG(no2_concentration) <= 40 THEN 'Good'
        WHEN AVG(pm25_concentration) > 10
            AND AVG(pm10_concentration) > 20
            AND AVG(no2_concentration) > 40 THEN 'Poor'
        ELSE 'Moderate'
    END AS airqualityrating
FROM sample_data.who.ambient_air_quality
GROUP BY city, year
ORDER BY city, year;
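To see how the rating rule behaves without attaching the share, here is a small sketch that applies the same CASE expression to an inline table. The city names and readings are invented for illustration and are not WHO data; only the column names are taken from the query above.

```sql
-- Hypothetical readings, made up purely for illustration (not WHO data)
SELECT
    city,
    year,
    CASE
        WHEN AVG(pm25_concentration) <= 10
            AND AVG(pm10_concentration) <= 20
            AND AVG(no2_concentration) <= 40 THEN 'Good'
        WHEN AVG(pm25_concentration) > 10
            AND AVG(pm10_concentration) > 20
            AND AVG(no2_concentration) > 40 THEN 'Poor'
        ELSE 'Moderate'
    END AS airqualityrating
FROM (
    VALUES
        ('Cleanville', 2020,  8.0, 15.0, 30.0),
        ('Cleanville', 2020,  9.0, 18.0, 35.0),
        ('Mixedtown',  2020, 12.0, 18.0, 30.0),
        ('Smogburg',   2020, 25.0, 40.0, 55.0)
) AS readings(city, year, pm25_concentration, pm10_concentration, no2_concentration)
GROUP BY city, year
ORDER BY city, year;
```

With these made-up rows, Cleanville is rated 'Good', Smogburg 'Poor', and Mixedtown 'Moderate': a city is 'Poor' only when all three averages exceed their thresholds, so any mixed result falls through to the ELSE branch.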
Yearly average pollutant concentrations of a city (Berlin) (SQL)
SELECT
    year,
    AVG(pm25_concentration) AS avg_pm25,
    AVG(pm10_concentration) AS avg_pm10,
    AVG(no2_concentration) AS avg_no2
FROM sample_data.who.ambient_air_quality
WHERE city = 'Berlin'
GROUP BY year
ORDER BY year DESC;
Top 25 Songs at Number 1, Billboard Hot 100
Editor's note: do you like looking at data around the top music, including when an album debuted and when it appeared on the Billboard Hot 100 list? David has you covered with some simple analytical SQL, along with a MotherDuck share of the data.
Execute this SQL
-- Switch to the attached music database
USE music;

-- Top 25 songs at number 1, Billboard Hot 100
SELECT
    song,
    performer,
    COUNT(*) AS weeks_at_no_1,
    chart_debut,
    MIN(chart_date) AS first_chart_date_as_no_1,
    MAX(chart_date) AS last_chart_date_as_no_1
FROM billboard_hot_100
WHERE chart_position = 1
GROUP BY ALL
ORDER BY weeks_at_no_1 DESC, chart_debut
LIMIT 25;
DuckDB in Action: Some neat DuckDB-specific SQL extensions
Editor's note: DuckDB strives to make it easy to write SQL, even when it requires introducing non-standard syntax. See the great blog posts by Alex Monahan or explore the Manning "DuckDB in Action" book by the author of this snippet.
DuckDB-specific extensions: Project all columns matching a pattern (SQL)
SELECT COLUMNS('valid.*') FROM prices LIMIT 3;
DuckDB-specific extensions: Apply an aggregation to several columns (SQL)
SELECT max(COLUMNS('valid.*')) FROM prices;
DuckDB-specific extensions: Apply one condition to many columns (SQL)
FROM prices
WHERE COLUMNS('valid.*') BETWEEN '2020-01-01' AND '2021-01-01';
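The COLUMNS snippets assume a prices table that is not defined here. The sketch below creates a stand-in so the queries can be run locally; the schema and rows are assumptions chosen only so that two column names match the 'valid.*' pattern, and they are not the book's actual data.

```sql
-- Hypothetical stand-in table; only the valid_* column names matter here
CREATE TABLE prices (
    valid_from  DATE,
    valid_until DATE,
    value       DECIMAL(5, 2)
);

INSERT INTO prices VALUES
    (DATE '2020-01-01', DATE '2020-06-30', 10.00),
    (DATE '2020-07-01', DATE '2020-12-31', 12.50),
    (DATE '2021-01-01', DATE '2021-06-30', 11.75);

-- Projects valid_from and valid_until, but not value
SELECT COLUMNS('valid.*') FROM prices LIMIT 3;

-- Computes one max() per matching column
SELECT max(COLUMNS('valid.*')) FROM prices;

-- The condition is applied to every matching column and combined with AND,
-- so a row qualifies only if both dates fall inside the range
FROM prices
WHERE COLUMNS('valid.*') BETWEEN '2020-01-01' AND '2021-01-01';
```

With these sample rows, the last query keeps the first two rows and drops the third, because its valid_until date lies outside the range even though its valid_from date is on the boundary.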