Back to Code Snippets


Query Parquet data in S3

Editor's note: DuckDB users often work with files in Parquet format, which has become a standard for representing data in data lakes. While DuckDB lets you work with local Parquet files, you can also use files stored in blob storage such as Amazon AWS S3, Azure Blob Storage and Google Cloud Storage.

Execute this Bash

# Assuming you have the following environment variables defined:
# AWS_ACCESS_KEY_ID
# AWS_SECRET_ACCESS_KEY
# AWS_DEFAULT_REGION
duckdb -c 'LOAD httpfs; SELECT count(*) FROM read_parquet("s3://<bucket>/<prefix>/*.parquet");'

Copy code

Damon

Expand

Share link