Pretty-printing floating-point numbers
By casting a floating-point value to a `DECIMAL` with a fixed number of digits after the decimal point, we can pretty-print it while keeping it as a numeric value.
Execute this SQL
COPY (FROM (VALUES ('foo', 10/9), ('bar', 50/7), ('qux', 9/4)) t(s, x)) TO 'example.csv';
SELECT x::DECIMAL(15, 3) AS x FROM 'example.csv';
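As a quick variation (not part of the original snippet), we can select the raw float next to the cast to compare the two representations; the alias `x_pretty` is just an arbitrary name:

-- compare the raw float with its rounded DECIMAL representation
SELECT x, x::DECIMAL(15, 3) AS x_pretty FROM 'example.csv';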
Copying the schema of a table
We can use `LIMIT 0` to copy the schema of an existing table without repeating its column names and types.
Execute this SQL
COPY (FROM (VALUES ('foo', 10/9), ('bar', 50/7), ('qux', 9/4)) t(s, x)) TO 'example.csv';
CREATE OR REPLACE TABLE example AS FROM 'example.csv';
CREATE TABLE tbl AS FROM example LIMIT 0;
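To confirm the copy worked (a small check we added, using DuckDB's built-in `DESCRIBE`), the new table shows the same columns and types but contains zero rows:

DESCRIBE tbl;
SELECT count(*) FROM tbl;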
Deterministically shuffling data
Sometimes it's beneficial to shuffle data sets in a deterministic, repeatable way. We can achieve this by using the `hash` function on the `rowid` pseudocolumn.
Execute this SQL
COPY (FROM (VALUES ('foo', 10/9), ('bar', 50/7), ('qux', 9/4)) t(s, x)) TO 'example.csv';
CREATE OR REPLACE TABLE example AS FROM 'example.csv';
FROM example ORDER BY hash(rowid + 42);
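Changing the constant added to `rowid` (42 above, 7 below; any value works) produces a different but equally repeatable ordering. A sketch building on the example table above:

-- a different seed constant gives a different, still deterministic shuffle
FROM example ORDER BY hash(rowid + 7);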
Updating CSV files in-place
DuckDB can read from and write to the same CSV file within a single `COPY` statement, which effectively updates the file in-place. In the example below, the file is rewritten to keep only the `s` column.
Execute this SQL
COPY (FROM (VALUES ('foo', 10/9), ('bar', 50/7), ('qux', 9/4)) t(s, x)) TO 'example.csv';
COPY (SELECT s FROM 'example.csv') TO 'example.csv';
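Reading the file back (a quick check we added) shows that only the `s` column remains:

FROM 'example.csv';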
Query from Google Sheets
Sometimes you have data over in Google Sheets and you want to quickly and easily load it into DuckDB. With `read_csv` and the Google Sheets 'export to csv' endpoint, it's just one line of code.
Execute this SQL
-- you will need to create a share link for your sheet
-- note the tab id (gid) and the sheet id (in the URL)
FROM read_csv('https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv&gid={tab_id}');
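If you want a local copy instead of re-fetching the sheet on every query, you can materialize it into a table. `sheet_data` is just a placeholder name we chose, and `{sheet_id}`/`{tab_id}` are the same placeholders as above:

-- materialize the sheet into a local table
CREATE TABLE sheet_data AS
    FROM read_csv('https://docs.google.com/spreadsheets/d/{sheet_id}/export?format=csv&gid={tab_id}');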
Split a string and cast all elements to a desired datatype
Execute this SQL
-- Here we make use of list_transform and
-- the fact that DuckDB knows lambda expressions
SELECT list_transform(split('01:08:22', ':'), x -> CAST(x AS INTEGER));
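As a possible follow-up (our addition, not part of the original snippet), the resulting list of integers can be combined into a total number of seconds using DuckDB's 1-based list indexing:

-- turn the parts of an HH:MM:SS string into a total number of seconds
SELECT parts[1] * 3600 + parts[2] * 60 + parts[3] AS total_seconds
FROM (SELECT list_transform(split('01:08:22', ':'), x -> CAST(x AS INTEGER)) AS parts);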
Convert CSV to Parquet and amend the schema
Execute this SQL
-- This selects only specific fields from the CSV
-- It also renames one of the fields in the target schema
COPY (SELECT col1, col2, col3 AS new_col_name
      FROM read_csv('~/data/source.csv', AUTO_DETECT=TRUE))
    TO '~/data/target.parquet' (FORMAT 'PARQUET', CODEC 'ZSTD');
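To verify the amended schema (a check we added; the path is the target file from the example), `DESCRIBE` works on any query, including one over a Parquet file:

-- inspect the column names and types of the resulting Parquet file
DESCRIBE SELECT * FROM '~/data/target.parquet';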
Query Parquet files and their metadata
Execute this SQL
-- listing files
FROM glob('dataset/*');

-- reading from files
FROM 'dataset/*.parquet' LIMIT 100;

-- reading Parquet file metadata (min_value, max_value, null_count for each field in each file)
FROM parquet_metadata('dataset/*.parquet');

-- convert files or export tables to Parquet
COPY (FROM tbl) TO 'file.parquet' (FORMAT 'parquet');
COPY 'data.csv' TO 'data.parquet' (FORMAT 'parquet');
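A related helper worth mentioning here (our addition) is `parquet_schema`, which lists the column names and types stored in each Parquet file of the dataset:

-- inspect the schema stored in each Parquet file
FROM parquet_schema('dataset/*.parquet');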