Discretize a numeric column following specified thresholds (SQL)
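discretize() converts a numeric column into discrete, ordered class ids based on a list of thresholds. Each value gets 1 plus the number of distinct thresholds that are less than or equal to it, so four thresholds yield classes 1 to 5. NULL values stay NULL.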
Execute this SQL
CREATE OR REPLACE MACRO discretize(v, l) AS (
    WITH t1 AS (
        SELECT unnest(list_distinct(l)) AS j
    ),
    t2 AS (
        SELECT COUNT(*) + 1 c
        FROM t1
        WHERE try_cast(j AS float) <= v
    )
    FROM t2
    SELECT IF(v IS NULL, NULL, c)
);

-- Usage
FROM 'https://raw.githubusercontent.com/thewiremonkey/factbook.csv/master/data/c2127.csv'
SELECT name, value, discretize(value, [2,3,4,5]) AS class;
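To make the bucketing rule of the macro easier to check by hand, here is a minimal sketch of the same logic in plain Python rather than DuckDB. It is an illustration only, not part of the snippet: the Python function is a hypothetical stand-in for the macro, it assumes every threshold is numeric (the macro's try_cast would instead skip values that cannot be cast), and it ignores the 32-bit rounding implied by the cast to float.

```python
from bisect import bisect_right


def discretize(value, thresholds):
    """Return 1 + the number of distinct thresholds <= value, or None for None."""
    if value is None:
        # Mirrors IF(v IS NULL, NULL, c) in the macro.
        return None
    # Mirrors list_distinct(l): duplicates do not create extra classes.
    cuts = sorted(set(thresholds))
    # bisect_right counts the elements of cuts that are <= value.
    return bisect_right(cuts, value) + 1


if __name__ == "__main__":
    for v in [1.5, 2, 3.5, 5, 10, None]:
        print(v, discretize(v, [2, 3, 4, 5]))
```

A value equal to a threshold falls into the upper class, because the comparison is "less than or equal". Values above the last threshold all share the highest class.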
Éric Mauvière