Query Parquet data in S3
Editor's note: DuckDB users often work with files in Parquet format, which has become a standard for storing data in data lakes. While DuckDB lets you work with local Parquet files, you can also query files stored in blob storage such as Amazon S3, Azure Blob Storage, and Google Cloud Storage.
Execute this Bash snippet:
# Assuming you have the following environment variables defined:
#   AWS_ACCESS_KEY_ID
#   AWS_SECRET_ACCESS_KEY
#   AWS_DEFAULT_REGION
duckdb -c "INSTALL httpfs; LOAD httpfs; SELECT count(*) FROM read_parquet('s3://<bucket>/<prefix>/*.parquet');"
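Depending on your DuckDB version, the httpfs extension may not pick up the AWS environment variables on its own. A minimal sketch of one alternative, assuming DuckDB 0.10 or later where secrets and the credential_chain provider (from the aws extension) are available: create a secret that resolves credentials through the standard AWS credential chain, which reads the same environment variables, then run the same query.

# Sketch: create an S3 secret that pulls credentials from the AWS credential
# chain (environment variables, ~/.aws/credentials, IAM roles, ...), then query.
duckdb -c "INSTALL httpfs; LOAD httpfs;
CREATE SECRET (TYPE S3, PROVIDER CREDENTIAL_CHAIN);
SELECT count(*) FROM read_parquet('s3://<bucket>/<prefix>/*.parquet');"

Because this runs in an in-memory session, the secret lives only for that single invocation, which is all a one-liner needs; for a longer-lived setup you could instead create a persistent secret or set the s3_* configuration options explicitly.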
Damon