“Category to Number” currently stops if it finds a column that has more categorical values than specified in the “max categories” field.
(1) I would prefer a way to list all such columns in one go.
(2) Alternatively, the node could skip all such columns, carry on with the execution, and then give a report (is this included in PMML?).
I have hundreds of columns, and it is getting difficult to go over each of them for conversion.
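For request (1), one possible stopgap outside the node itself is to count distinct values per column up front, for example in a Python scripting step. This is only a rough sketch in pandas: the function name `columns_over_limit`, the sample table, and the threshold of 3 are made up for illustration, and it assumes every non-numeric column is one you would want to convert.

```python
import pandas as pd


def columns_over_limit(df, max_categories):
    """Return {column name: distinct count} for non-numeric columns
    whose number of distinct values exceeds max_categories."""
    over = {}
    for name in df.columns:
        column = df[name]
        # Assumption: only non-numeric columns are candidates for conversion.
        if pd.api.types.is_numeric_dtype(column):
            continue
        distinct = int(column.nunique())
        if distinct > max_categories:
            over[name] = distinct
    return over


# Hypothetical sample data, not taken from the thread.
sample = pd.DataFrame({
    "color": ["red", "green", "blue", "red"],
    "size": ["S", "M", "S", "M"],
    "id": ["a1", "b2", "c3", "d4"],
    "weight": [1.0, 2.5, 3.0, 4.2],
})

over_limit = columns_over_limit(sample, max_categories=3)
print(over_limit)
```

The result lists every offending column at once, so they can be excluded or handled separately before the conversion runs.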
I haven’t tried it myself yet as I don’t have a dataset that would be handy for testing in this particular case. If you have one, maybe upload a bit of it and we can find a workaround?
I am still running into errors. I tried to upload the workflow with sample data, but could not “remove properties and personal information” from the project file using the Windows 10 properties dialog.
I tried to document the nodes so that it’s clear what they do. I did use a few flow variables that may not be as clear, so let me know if you have questions about how it works.
Basically, we fall back to a dummy table whenever the Category to Number node fails on a column. We then take the name of the failed column and tag it with a “failed_” signifier, so that we can split our columns at the end and easily identify which ones didn’t work.
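For anyone who finds the idea easier to follow in code, here is a rough pandas analogue of the per-column approach described above. It is not the workflow itself: the names `encode_or_flag` and `FAILED_PREFIX` are hypothetical, the “failed_” tag is assumed to go in front of the column name, `pd.factorize` stands in for the node's own mapping (which may number categories differently), and every column is assumed to be categorical.

```python
import pandas as pd

FAILED_PREFIX = "failed_"  # assumed to be a prefix, as in "failed_<column>"


def encode_or_flag(df, max_categories):
    """Convert each column to integer codes, or keep it unchanged under a
    "failed_" name when it has more distinct values than max_categories."""
    out = pd.DataFrame(index=df.index)
    for name in df.columns:
        column = df[name]
        if column.nunique() > max_categories:
            # Too many categories: keep the original values, tag the name.
            out[FAILED_PREFIX + name] = column
        else:
            # factorize assigns 0, 1, 2, ... in order of first appearance.
            codes, _ = pd.factorize(column)
            out[name] = codes
    return out


# Hypothetical sample data, not taken from the thread.
sample = pd.DataFrame({
    "color": ["red", "green", "blue", "red"],
    "size": ["S", "M", "S", "M"],
    "id": ["a1", "b2", "c3", "d4"],
})

result = encode_or_flag(sample, max_categories=3)

# Split the columns at the end, as in the workflow.
failed = [c for c in result.columns if c.startswith(FAILED_PREFIX)]
converted = [c for c in result.columns if not c.startswith(FAILED_PREFIX)]
print("converted:", converted)
print("failed:", failed)
```

Keeping the failed columns in the output under a recognisable name means nothing is silently dropped, and the final split doubles as the report of what didn’t convert.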
OK, that sounds like a feature request then if I’m understanding you right. As it’s currently implemented, I don’t think there’s a way for the node itself to do what you’re looking for.