Back to Code Snippets

Alejandro Wainzinger

@xevix


List Hive PartitionsSQL

Given Hive partitioned data, list all the keys, values and files. Handy to get a summary of the structure of Hive data.

Execute this SQL

WITH hive_cols AS (
  PIVOT (
  SELECT
    file.split('/').list_filter(lambda x: x.contains('=')).list_transform(lambda x: x.split('=')).unnest() AS kv,
    file
  FROM glob('/path/to/hive/**/*.parquet')
  )
  ON kv[1]
  USING first(kv[2])
)
FROM hive_cols;

Copy code

Alejandro Wainzinger

Expand

Share link