Decreasing Dataset Dimensionality


I have a nominal (unordered) categorical predictor variable in my dataset that has too many levels. I’d like to bin or group those levels with respect to my interval-scaled dependent (output) variable.

There are quite a few ways to discretize interval-scaled inputs with respect to a categorical output, such as the great CAIM algorithm found in KNIME, but I’m having trouble finding a procedure that does the opposite (binning a categorical input with respect to an interval output).

Any tips would be greatly appreciated!

Hi Paul,
Not really, but there is a node called String Replace (Dictionary) that might help to bin categorical data; give it a try…
Regards, Thomas
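
For anyone landing on this thread: a dictionary-based replacement needs a level-to-bin mapping first, and one simple way to derive it from the interval output might look like the Python sketch below. This is only a heuristic, not CAIM or any built-in KNIME procedure: it ranks the levels by their mean output value and cuts the ranked list into contiguous groups. The function name `bin_levels_by_target_mean`, the `n_bins` parameter, and the sample data are all made up for illustration.

```python
from collections import defaultdict
from statistics import mean


def bin_levels_by_target_mean(levels, targets, n_bins=3):
    """Map each categorical level to a bin label based on its mean target value.

    Heuristic sketch (an assumption, not a KNIME algorithm): levels are ranked
    by mean target, then the ranked list is cut into n_bins contiguous chunks
    holding a nearly equal number of levels.
    """
    # Collect the target values observed for each level.
    values_by_level = defaultdict(list)
    for level, target in zip(levels, targets):
        values_by_level[level].append(target)

    # Rank levels by mean target value; ties are broken by level name.
    ranked = sorted(values_by_level, key=lambda lv: (mean(values_by_level[lv]), lv))

    # Never ask for more bins than there are levels.
    n_bins = min(n_bins, len(ranked))

    # Assign each ranked level to a contiguous chunk.
    mapping = {}
    for position, level in enumerate(ranked):
        bin_index = position * n_bins // len(ranked)
        mapping[level] = f"bin_{bin_index + 1}"
    return mapping


# Made-up example data: a categorical column and a numeric outcome.
cities = ["A", "A", "B", "B", "C", "C", "D", "E", "F"]
outcome = [10.0, 12.0, 11.0, 13.0, 30.0, 32.0, 31.0, 55.0, 60.0]

mapping = bin_levels_by_target_mean(cities, outcome, n_bins=3)

# Print the mapping as "level,bin" pairs.
for level, label in sorted(mapping.items()):
    print(f"{level},{label}")
```

The printed pairs could then serve as the basis for the dictionary used by String Replace (Dictionary); the node description should be checked for the exact dictionary format it expects. Cutting by number of levels ignores how many rows each level has, so rare levels with noisy means may deserve separate handling.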