Back to Code Snippets
Alejandro Wainzinger
@xevix
List Hive PartitionsSQL
Given Hive partitioned data, list all the keys, values and files. Handy to get a summary of the structure of Hive data.
Execute this SQL
WITH hive_cols AS ( PIVOT ( SELECT file.split('/').list_filter(lambda x: x.contains('=')).list_transform(lambda x: x.split('=')).unnest() AS kv, file FROM glob('/path/to/hive/**/*.parquet') ) ON kv[1] USING first(kv[2]) ) FROM hive_cols;
Copy code
Alejandro Wainzinger
Expand
Share link