"Category to Number": automatically skip columns that exceed "max categories"

“Category to Number” currently stops if it finds a column that has more categorical values than specified in the “max categories” field.

(1) I would like a way to list all such columns in one go.
(2) I would also like the node to skip all such columns, continue executing, and then report which ones were skipped (is this included in the PMML?).

I have hundreds of columns, and it is getting difficult to go over each of them for conversion.
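For request (1), the check itself is easy to sketch outside KNIME. The Python below is only an illustration of the idea, not how the node works internally: the table is assumed to be a plain dict of column name to list of values, and `columns_over_limit` is a hypothetical helper name.

```python
def columns_over_limit(table, max_categories):
    """Return {column name: distinct count} for columns exceeding the limit."""
    over = {}
    for name, values in table.items():
        distinct = len(set(values))
        if distinct > max_categories:
            over[name] = distinct
    return over


# Hypothetical sample: a table as a dict of column name -> list of values.
table = {
    "column1": ["a", "b", "c", "a"],
    "column2": ["x", "y", "x", "y"],
}

# Collect every offending column in one pass instead of stopping at the first.
report = columns_over_limit(table, max_categories=2)
print(report)
```

Because the whole table is scanned before anything is reported, all columns above the limit show up together.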

Hi @tims -

This is probably achievable using a combination of Column List Loop Start, to force a single column at a time, and the Try/Catch construct to prevent the workflow from stopping (Try (Data Ports) & Catch Errors (Data Ports)).

Maybe something like this gets you started?

I haven’t tried it myself yet as I don’t have a dataset that would be handy for testing in this particular case. If you have one, maybe upload a bit of it and we can find a workaround?
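The loop-plus-try/catch idea can be sketched in Python to show the control flow: handle one column at a time and catch the failure so the remaining columns still get processed. This is a rough analogy under stated assumptions, not the KNIME implementation. `category_to_number` and `encode_each_column` are hypothetical names, and numbering categories from 0 in order of first appearance is an assumption.

```python
def category_to_number(values, max_categories):
    """Map each distinct value to an integer, in order of first appearance.

    Raises ValueError once more than max_categories distinct values are seen
    (assumed stand-in for the node's "maximum number of categories" error).
    """
    mapping = {}
    for value in values:
        if value not in mapping:
            if len(mapping) >= max_categories:
                raise ValueError("maximum number of categories reached")
            mapping[value] = len(mapping)
    return [mapping[v] for v in values]


def encode_each_column(table, max_categories):
    """Try every column on its own; collect failures instead of stopping."""
    encoded, errors = {}, {}
    for name, values in table.items():  # plays the role of the column loop
        try:
            encoded[name] = category_to_number(values, max_categories)
        except ValueError as exc:  # plays the role of the Try/Catch block
            errors[name] = str(exc)
    return encoded, errors


# Hypothetical sample data.
table = {"column1": ["a", "b", "c"], "column2": ["x", "y", "x"]}
encoded, errors = encode_each_column(table, max_categories=2)
print(encoded)
print(errors)
```

The point of the per-column loop is that one failing column only affects its own iteration.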

@ScottF … thanks for the first version

I am still running into errors. I tried to upload the workflow with sample data, but could not “remove properties and personal information” from the project file using the Windows 10 properties dialog.

Here’s some data:
image

The data has 3 categories (a, b, c), but I set max categories to 2.

image

ERROR Category To Number 4:51 Execution failed in Try-Catch block: Maximum number of categories reached for column: column1 (to number)

Your support is appreciated!

Let’s give this a shot:

TryCatchColumnAppendExample.knwf (34.0 KB)

I tried to document the nodes so that it’s clear what they do. I did use a few flow variables that may be less clear, so let me know if you have questions about how it works.

Basically, when the Category to Number node fails, we fall back to a dummy table. We then take the name of the failed column and tag it with a “failed_” signifier, so that we can split the columns at the end and easily identify which ones didn’t work.
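The tagging-and-splitting step might look roughly like this in Python. This is a simplified sketch and not a translation of the attached workflow: it assumes “failed_” is used as a prefix, keeps the failed column's original values rather than a dummy table, and uses hypothetical names (`FAILED_PREFIX`, `encode_or_flag`, `split_by_prefix`).

```python
FAILED_PREFIX = "failed_"  # assumed to be a prefix on the column name


def encode_or_flag(table, max_categories):
    """Encode columns within the limit; keep the others unchanged but renamed."""
    out = {}
    for name, values in table.items():
        distinct = list(dict.fromkeys(values))  # distinct values, first-seen order
        if len(distinct) > max_categories:
            # Too many categories: pass the data through under a tagged name.
            out[FAILED_PREFIX + name] = list(values)
        else:
            codes = {value: i for i, value in enumerate(distinct)}
            out[name] = [codes[v] for v in values]
    return out


def split_by_prefix(table):
    """Separate encoded columns from tagged ones, restoring original names."""
    ok, failed = {}, {}
    for name, values in table.items():
        if name.startswith(FAILED_PREFIX):
            failed[name[len(FAILED_PREFIX):]] = values
        else:
            ok[name] = values
    return ok, failed


# Hypothetical sample data.
table = {"column1": ["a", "b", "c"], "column2": ["x", "y", "x"]}
combined = encode_or_flag(table, max_categories=2)
ok, failed = split_by_prefix(combined)
print(sorted(failed))
```

Tagging by name keeps everything in one table during the loop, so the split at the end is a simple name filter.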

@ScottF

Still struggling to adapt this to my needs.

I would like the output of “Category to Number” to code whatever columns it can and leave the rest of the data, which cannot be coded, as it is.

OK, that sounds like a feature request then if I’m understanding you right. As it’s currently implemented, I don’t think there’s a way for the node itself to do what you’re looking for.
