Back to Code Snippets

Mehdi Ouazza

@mehdio


City air quality insights based on WHO data

Editor's note: Mehdi's database share includes air quality data from the World Health Organization (WHO). Use his example queries to understand pollution in particular areas. You might even try combining with the spatial extension discussed in other snippets.

Attach and select MotherDuck database

Data shared/available on MotherDuck

ATTACH 'md:_share/sample_data/23b0d623-1361-421d-ae77-62d701d471e6' AS sample_data;
USE sample_data;

Copy code

Annual city air quality rating based on WHO dataSQL

SELECT
    city,
    year,
    CASE
        WHEN
            AVG(pm25_concentration) <= 10
            AND AVG(pm10_concentration) <= 20
            AND AVG(no2_concentration) <= 40
            THEN 'Good'
        WHEN
            AVG(pm25_concentration) > 10
            AND AVG(pm10_concentration) > 20
            AND AVG(no2_concentration) > 40
            THEN 'Poor'
        ELSE 'Moderate'
    END AS airqualityrating
FROM
    sample_data.who.ambient_air_quality
GROUP BY
    city,
    year
ORDER BY
    city,
    year;

Copy code

Yearly average pollutant concentrations of a city (Berlin)SQL

SELECT
    year,
    AVG(pm25_concentration) AS avg_pm25,
    AVG(pm10_concentration) AS avg_pm10,
    AVG(no2_concentration) AS avg_no2
FROM sample_data.who.ambient_air_quality 
WHERE city = 'Berlin'
GROUP BY year
ORDER BY year DESC;

Copy code

Mehdi Ouazza

Expand

Share link


Query Yahoo Finance data with Scrooge extensionSQL

Editor's note: The Scrooge McDuck extension provides common aggregation functions and data scanners used for financial data. This example grabs stock quotes (or the S&P 500 in this case) on specific dates for analysis in DuckDB.

Execute this SQL

-- Install httpfs extension
INSTALL httpfs;
LOAD httpfs;
-- Install Scrooge extension https://github.com/pdet/Scrooge-McDuck
-- NOTE: You need to start DuckDB with `-unsigned` flag to authorize to install & load of 3rd party extension
SET custom_extension_repository='scrooge-duckdb.s3.us-west-2.amazonaws.com/scrooge/s3_deploy';
INSTALL scrooge;
LOAD scrooge;
-- Example of query 
FROM yahoo_finance("^GSPC", "2023-02-01", "2023-02-04", "1d");

Copy code

Mehdi Ouazza

Expand

Share link


Quickly convert a CSV to Parquet, bash functionBash

Editor's note: DuckDB makes it easy to convert between a variety of popular data formats (CSV, JSON, Parquet, and more) using simple SQL statements. It's also easy to execute these statements from a bash shell so you have them ready to go.

Execute this Bash

#!/bin/bash 
function csv_to_parquet() {     
    file_path="$1"     
    duckdb -c "COPY (SELECT * FROM read_csv_auto('$file_path')) TO '${file_path%.*}.parquet' (FORMAT PARQUET);" }

Copy code

Mehdi Ouazza

Expand

Share link


Allow unsigned extensions (own extensions or untrusted third-parties)Bash

Editor's note: While DuckDB has 20+ official extensions, it also allows you to write your own extensions. In order load extensions from unofficial repositories, you'll need to let DuckDB know that it's safe to do so. This can be done using the allow_unsigned_extensions flag or -unsigned for the CLI.

Execute this Bash

duckdb -unsigned

Copy code

Mehdi Ouazza

Expand

Share link


Show supported DuckDB extensionsSQL

Editor's note: DuckDB has many supported extensions for everything from data formats (JSON, parquet, excel, iceberg) to specific types (IP addresses, time zones) to indexing (full-text-search) and more. This table function will tell you which extensions are supported in your local DuckDB install.

Execute this SQL

-- show supported duckdb extensions
FROM duckdb_extensions();

Copy code

Mehdi Ouazza

Expand

Share link