SQL, Python & More for DuckDB - Page 5

result set → single row array of structs (SQL)

-- First, transform TBL (or any relation) 
-- into a single column of structs.
with structs as (from TBL select TBL)
-- Then pack those structs into a list.
from structs select list(TBL) as ready_to_plot;

Hamilton Ulmer

Created 01/09/24
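
To try the struct-packing query above without an existing table, here is a minimal sketch. The two-row `TBL` relation and its `id` and `label` columns are made up for illustration; any relation should work in its place.

```sql
-- Hypothetical two-row relation standing in for TBL.
with TBL as (
    select * from (values (1, 'a'), (2, 'b')) t(id, label)
),
-- One struct per input row; the struct fields mirror the columns of TBL.
structs as (from TBL select TBL)
-- Collapse all structs into a single row holding one list.
from structs select list(TBL) as ready_to_plot;
```

The result should be a single row whose `ready_to_plot` column is a list of two structs. Element order inside the list is not guaranteed unless you add an ordering.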

Remove duplicates (SQL)

/* removes duplicate rows at the order_id level */
SELECT * FROM orders
QUALIFY row_number() over (partition by order_id order by created_at) = 1

Octavian Zarzu

Created 04/27/23
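
A self-contained sketch of the deduplication pattern above, assuming a small temporary `orders` table. The sample rows and the `note` column are invented for illustration.

```sql
-- Hypothetical sample data: order_id 1 appears twice.
CREATE OR REPLACE TEMP TABLE orders AS
SELECT * FROM (VALUES
    (1, TIMESTAMP '2023-04-01 10:00:00', 'first'),
    (1, TIMESTAMP '2023-04-02 10:00:00', 'second'),
    (2, TIMESTAMP '2023-04-03 10:00:00', 'only')
) AS t(order_id, created_at, note);

-- Number the rows within each order_id by created_at and keep row 1,
-- which is the earliest row: 'first' for order 1 and 'only' for order 2.
SELECT * FROM orders
QUALIFY row_number() OVER (PARTITION BY order_id ORDER BY created_at) = 1
ORDER BY order_id;
```

Switching the window ordering to `ORDER BY created_at DESC` would keep the most recent row per `order_id` instead.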

Show supported DuckDB extensions (SQL)

Editor's note: DuckDB has many supported extensions for everything from data formats (JSON, Parquet, Excel, Iceberg) to specific types (IP addresses, time zones) to indexing (full-text search) and more. This table function tells you which extensions are available in your local DuckDB install, and whether each one is installed and loaded.

-- show supported duckdb extensions
FROM duckdb_extensions();

Mehdi Ouazza

Edited 02/23/24
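
As a follow-up to the listing above, this sketch installs and loads one extension and then checks its state. `httpfs` is used only as an example name, and `INSTALL` assumes the machine can reach the extension repository.

```sql
-- httpfs is just an example; substitute any extension from the listing.
INSTALL httpfs;
LOAD httpfs;

-- Confirm that the extension is now installed and loaded.
SELECT extension_name, installed, loaded
FROM duckdb_extensions()
WHERE extension_name = 'httpfs';
```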

ATTACH 'other.db'; (SQL)

Editor's note: DuckDB allows you to attach multiple databases at once. For example, you can attach a local file, an in-memory database and a database from MotherDuck and work with all of them simultaneously. The ATTACH statement is executed for each database to be attached.

-- attach another database, alias inferred from the name ("other")
ATTACH 'other.db';
SELECT * FROM other.some_table;

Carlo Piovesan

Edited 02/23/24
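
The editor's note mentions working with several databases at once. A minimal sketch of that, assuming `other.db` exists and contains `some_table`; the aliases `other_db` and `scratch` and the table `some_table_copy` are hypothetical names chosen for this example.

```sql
-- Attach a file with an explicit alias instead of the inferred one.
ATTACH 'other.db' AS other_db;

-- Attach a second, in-memory database alongside it.
ATTACH ':memory:' AS scratch;

-- List every database attached to this session.
SHOW DATABASES;

-- Copy a table from the file-backed database into the in-memory one.
CREATE TABLE scratch.some_table_copy AS
SELECT * FROM other_db.some_table;

-- Detach when finished; the in-memory contents are discarded.
DETACH scratch;
```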

Query Parquet data in S3 (Bash)

Editor's note: DuckDB users often work with files in Parquet format, which has become a standard for representing data in data lakes. While DuckDB lets you work with local Parquet files, you can also query files stored in blob storage such as Amazon S3, Azure Blob Storage and Google Cloud Storage.

# Assuming you have the following environment variables defined:
# AWS_ACCESS_KEY_ID
# AWS_SECRET_ACCESS_KEY
# AWS_DEFAULT_REGION
duckdb -c 'LOAD httpfs; SELECT count(*) FROM read_parquet("s3://<bucket>/<prefix>/*.parquet");'

Damon

Edited 02/23/24
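
The same count can also be run from inside a DuckDB session rather than from the shell. The sketch below assumes a DuckDB version that includes the secrets manager (0.10 or later); the secret name `s3_creds` is hypothetical, and every value in angle brackets is a placeholder to fill in yourself.

```sql
INSTALL httpfs;
LOAD httpfs;

-- Placeholder values: supply your own credentials and region.
CREATE SECRET s3_creds (
    TYPE S3,
    KEY_ID '<access key id>',
    SECRET '<secret access key>',
    REGION '<region>'
);

-- Count rows across every Parquet file under the prefix.
SELECT count(*)
FROM read_parquet('s3://<bucket>/<prefix>/*.parquet');
```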

No more errors on end-of-line commas (SQL)

Editor's note: DuckDB accepts a trailing comma at the end of a SELECT list by default, and boy does it make it easier to comment out lines of your SQL without breaking the query.

-- 🤓 Courtesy of Michael Simons (aka. @rotnroll666)
-- 🐦 https://twitter.com/rotnroll666/status/1671066790368010241

SELECT foo,
    bar,
    -- hello,
    world,
    -- dummy,
FROM bazbar;

SALES

Edited 02/23/24
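
The snippet above needs a `bazbar` table to run. As a sketch, the same trailing-comma behavior can be tried against an inline relation; the values are made up, and only the table and column names come from the snippet.

```sql
SELECT foo,
    bar,
    -- hello,
    world,
    -- dummy,
FROM (VALUES (1, 2, 3, 4, 5)) AS bazbar(foo, bar, hello, world, dummy);
```

With the last column commented out, the SELECT list ends in `world,` and DuckDB should still parse the query and return the three remaining columns.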

City air quality insights based on WHO data

Editor's note: Mehdi's database share includes air quality data from the World Health Organization (WHO). Use his example queries to understand pollution in particular areas. You might even try combining it with the spatial extension discussed in other snippets.

Attach and select MotherDuck database
Data shared/available on MotherDuck

ATTACH 'md:_share/sample_data/23b0d623-1361-421d-ae77-62d701d471e6' AS sample_data;
USE sample_data;

Annual city air quality rating based on WHO data (SQL)

SELECT
    city,
    year,
    CASE
        WHEN
            AVG(pm25_concentration) <= 10
            AND AVG(pm10_concentration) <= 20
            AND AVG(no2_concentration) <= 40
            THEN 'Good'
        WHEN
            AVG(pm25_concentration) > 10
            AND AVG(pm10_concentration) > 20
            AND AVG(no2_concentration) > 40
            THEN 'Poor'
        ELSE 'Moderate'
    END AS airqualityrating
FROM
    sample_data.who.ambient_air_quality
GROUP BY
    city,
    year
ORDER BY
    city,
    year;

Yearly average pollutant concentrations of a city (Berlin) (SQL)

SELECT
    year,
    AVG(pm25_concentration) AS avg_pm25,
    AVG(pm10_concentration) AS avg_pm10,
    AVG(no2_concentration) AS avg_no2
FROM sample_data.who.ambient_air_quality 
WHERE city = 'Berlin'
GROUP BY year
ORDER BY year DESC;

Mehdi Ouazza

Edited 02/23/24
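
One more way to explore the same share, offered as a sketch rather than as part of Mehdi's original set: rank cities by average PM2.5 in the most recent year present in the data. It uses only the table and columns that appear in the queries above; the choice of PM2.5 and the limit of 10 are arbitrary.

```sql
SELECT
    city,
    AVG(pm25_concentration) AS avg_pm25
FROM sample_data.who.ambient_air_quality
-- Restrict to the latest year available in the table.
WHERE year = (
    SELECT max(year) FROM sample_data.who.ambient_air_quality
)
GROUP BY city
-- Cities with no PM2.5 measurements sort last.
ORDER BY avg_pm25 DESC NULLS LAST
LIMIT 10;
```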
