I have a categorical feature that I want to one-hot encode, and I tried using the One to Many node. It works fine for training and evaluation, but I couldn’t figure out how to “apply” the same encoding to new data.
Let’s say my feature has 4 distinct values in the training set: A, B, C and D. I use the One to Many node to generate 4 columns, then I train a model on them.
My pipeline for new data should perform the same transformation, so I have the same 4 columns to feed my trained model.
But suppose I use One to Many on new data that consists of a single row with value C for my feature. The node only creates column C, so columns A, B and D are missing. How can I make sure the missing columns are also created, so that I can feed the data to the model?
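To illustrate the behavior I am after outside of KNIME, here is a minimal sketch in plain Python. It is only an illustration, not how the One to Many node works internally, and `fit_categories` and `one_hot` are made-up helper names. The idea is that the list of categories is learned once from the training set and then reused for new data:

```python
def fit_categories(values):
    """Collect the sorted distinct values seen in the training column."""
    return sorted(set(values))


def one_hot(values, categories):
    """Encode each value as one 0/1 column per known category.

    Assumption: a value that was not seen during training
    gets 0 in every column.
    """
    return [{c: int(v == c) for c in categories} for v in values]


# Training data contains all four values.
train = ["A", "B", "C", "D", "C"]
categories = fit_categories(train)

# New data has a single row with value C, but still gets all four columns.
new_rows = one_hot(["C"], categories)
print(new_rows)  # [{'A': 0, 'B': 0, 'C': 1, 'D': 0}]
```

So what I am looking for is the KNIME equivalent of reusing `categories` when I process new data.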
I am relatively new to KNIME, so maybe I am missing something…