I have an existing workflow. I insert a new dataset at the beginning, say for model scoring, and the new dataset has a few new variables in it that were not present in model training.
I find that these new variables are automatically inserted into some dialog boxes, causing all sorts of mystery mistakes.
Is there a way to set a preference so that new fields are not placed in dialog function lists unless I place them there?
Thank you for the suggestion. I see the option in the dialog boxes, but there is no explanation of this functionality in the dialog “?”. I searched for it in Help but also came up empty: “Local Help (0 hits)”.
Now I wonder if there is a way to change the dialog box default to Enforce Inclusion. Not having a definition until now, I just went with the defaults. Now that I know the meaning of the feature, I would like to change it once, rather than every time I place the dialog box in the workflow. Probably wishful thinking, but do you have any hints on that?
There should be an explanation in the node description, I believe. Which node are you talking about? However, if you place a Column Filter node right after the new dataset is read, configure it with the Enforce Inclusion option, and select the columns your workflow was trained on, you won’t need to configure it every time.
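To make the difference between the two options concrete, here is a rough Python sketch of how I understand them. This is not KNIME code: the function name, the mode strings and the column names are all made up for illustration. The assumption is that Enforce Inclusion keeps only the columns you explicitly selected, while Enforce Exclusion drops only the columns you explicitly excluded, so anything new passes through.

```python
def filter_columns(incoming, selected, mode):
    """Illustrative model of a column filter (hypothetical helper, not a KNIME API).

    incoming: column names arriving at the node, in order
    selected: the columns listed in the dialog (include list or exclude list)
    mode:     "enforce_inclusion" or "enforce_exclusion"
    """
    chosen = set(selected)
    if mode == "enforce_inclusion":
        # Only explicitly included columns survive; new columns are dropped.
        return [c for c in incoming if c in chosen]
    if mode == "enforce_exclusion":
        # Only explicitly excluded columns are dropped; new columns pass through.
        return [c for c in incoming if c not in chosen]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical column names: the scoring data has one extra column.
trained_on = ["age", "income", "region"]
scoring_columns = ["age", "income", "region", "campaign_id"]

print(filter_columns(scoring_columns, trained_on, "enforce_inclusion"))
print(filter_columns(scoring_columns, [], "enforce_exclusion"))
```

With Enforce Inclusion the extra column never reaches the downstream nodes, which is why a single Column Filter right after the reader is enough.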
If the data used for model training is in the same workflow as the new dataset, you can use the Reference Column Filter node to ensure the new variables (columns) are not taken into account.
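The idea behind filtering against a reference table can be sketched in a few lines of Python. Again, this is only an illustration of the concept with invented names and columns, not how the node is implemented: the new table keeps just the columns that also appear in the training table.

```python
def reference_column_filter(new_columns, reference_columns):
    """Keep the columns of the new table that also exist in the reference table.

    Hypothetical helper for illustration; the order of the new table is preserved.
    """
    reference = set(reference_columns)
    return [c for c in new_columns if c in reference]


# Hypothetical column names for the training and scoring tables.
training_columns = ["age", "income", "region"]
scoring_columns = ["customer_note", "age", "income", "region", "campaign_id"]

print(reference_column_filter(scoring_columns, training_columns))
```

Because the reference is the training table itself, there is no column list to maintain by hand if the training data changes later.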
Now that you mention it, I can see that Enforce Inclusion is documented in the Column Filter help button. However, it is not documented every place that it is available. For example, in String To Number, Enforce Inclusion is available, but the help button makes no mention of it.
It is possible that it isn’t documented properly everywhere, as this Column Selection dialog with the Enforce exclusion/inclusion options is used in many nodes. However, you can also hover over the option to get info about it.