Flow variable as input for Feature Selection Filter

Dear all,

I would like to control the Feature Selection Filter with a flow variable, i.e. the flow variable (holding the best-fit model selected earlier) should determine which model the Feature Selection Filter selects.
So, the Feature Selection Loop calculates several models, the GroupBy selects the model with the minimum RMSE, and the Table Row to Variable turns this model's row into flow variables (Selected features, RowID etc.). The Feature Selection Filter should then select exactly this model.
My problem is that when I configure the Feature Selection Filter via the Flow Variables tab and choose Selected features or RowID as the flow variable, I always get the error message “Errors when overwriting node settings with flow variables: Unable to parse…”

Where is the flaw in my reasoning?

Thanks and kind regards
Ute

This is the relevant part of the workflow, maybe it helps?
[image: screenshot of the workflow]

Hi @UteDavid,

My suspicion is that you’re trying to overwrite a boolean node setting with a string flow variable whose value is neither “true” nor “false”. However, to verify or falsify that suspicion, I’d need some more context.

To this end, could you provide me with details about

  1. the full error message you’re seeing,
  2. the type and value of the flow variable(s) you’re using (e.g., a screenshot of the flow variables tab in the output of the Table Row to Variable node) and
  3. the node settings you’re trying to overwrite (e.g., a screenshot of the flow variables tab in the configuration dialog of the Feature Selection Filter node).

Best,

Marc

Hi Marc,

thanks for your reply!

What you suspect could well be true.

Here’s the information:

  • error message: “Feature Selection Filter 2:849:0:626:0:661 Errors overwriting node settings with flow variables: Unable to parse “dach_comp_pmi+value movavg4 -3,dach_economic_sentiment+value movavg4 -2,dach_sts_inpr_m+value -2,dach_vdma_production_expect_vdma+value -3,dach_vdma_standing_orders+value movavg4 -3” (variable “Selected features”) as boolean expression (settings parameter “includeTargetColumn”)”.
  • Flow variables in Table Row to Var output:
  • Node settings of the Feature Selection Filter:
    [image: Flow Variables tab of the Feature Selection Filter]
    It also offers RowID.
    When I choose Nr of features, it works, but it does not necessarily choose the correct model.

Do I need a completely different approach?

Thank you!

Ute

Hi Ute,

Looks like the Feature Selection Filter node does not expose its selected columns in the flow variables tab. Thus, I don’t think you can control which model(s) to select/filter in the node via a flow variable.

The includeTargetColumn setting that you’re attempting to control via a String flow variable is indeed a boolean setting that corresponds to the “Include static columns” setting in the Column Selection tab.
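
To illustrate why the overwrite fails, here is a minimal Python sketch of a strict boolean parser. It is not KNIME's actual implementation; the function name `parse_boolean_setting` and the case-insensitive matching are assumptions made purely for illustration. The point is that a comma-separated feature list can never be read as a boolean value.

```python
def parse_boolean_setting(value: str) -> bool:
    """Hypothetical strict parser: accepts only 'true' or 'false'.

    Assumption: matching is case-insensitive and ignores surrounding
    whitespace. The real node may behave differently in these details.
    """
    normalized = value.strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise ValueError(f'Unable to parse "{value}" as boolean expression')


# A feature list such as the "Selected features" variable is rejected:
try:
    parse_boolean_setting("dach_comp_pmi+value movavg4 -3,dach_sts_inpr_m+value -2")
except ValueError as err:
    print(err)
```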

Best,

Marc

Hello Ute,

unfortunately, what you are trying to do is not possible with the Feature Selection Filter node.
However, there are two alternative solutions that should work:

  1. Expose the smallest RMSE as a flow variable and use it to overwrite the errorThreshold setting. If your node is in “Select features automatically by score threshold” mode, it should then automatically select the feature set you are after.

  2. With KNIME 4.1.0 we introduced a number of new flow variable types, including boolean and array types. By splitting the comma-separated string that represents the best feature set and reaggregating it into a list of strings, you could convert it into a string array flow variable that can be used to configure our Column Filter node.
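
The string handling behind the second approach can be sketched in Python as follows. This only illustrates the split-and-collect step; inside KNIME you would do this with nodes rather than code. The helper name `split_feature_string` is hypothetical, and the sketch assumes that column names never contain a comma themselves.

```python
from typing import List


def split_feature_string(selected_features: str, separator: str = ",") -> List[str]:
    """Split a comma-separated feature string into a list of column names.

    Assumption: column names do not contain the separator. Whitespace
    around each name is trimmed and empty entries are dropped.
    """
    parts = (name.strip() for name in selected_features.split(separator))
    return [name for name in parts if name]


# Value of the "Selected features" flow variable quoted in the error message.
selected = (
    "dach_comp_pmi+value movavg4 -3,"
    "dach_economic_sentiment+value movavg4 -2,"
    "dach_sts_inpr_m+value -2,"
    "dach_vdma_production_expect_vdma+value -3,"
    "dach_vdma_standing_orders+value movavg4 -3"
)
columns = split_feature_string(selected)
print(len(columns))  # 5 column names
```

Note that the column names here contain spaces, so splitting on whitespace would not work; only the comma is a safe separator.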

I believe the first approach is much easier, and I’d highly recommend trying it before diving into the second one.
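
As a rough picture of why the first approach works, here is a Python sketch of threshold-based selection. It is a sketch under stated assumptions, not the node's actual logic: the `FeatureSet` type, the function names, the `<=` comparison, and the tie-break on the fewest features are all hypothetical.

```python
from typing import List, NamedTuple, Optional


class FeatureSet(NamedTuple):
    """One row of the feature selection model table (hypothetical layout)."""
    nr_features: int
    rmse: float
    features: List[str]


def smallest_rmse(models: List[FeatureSet]) -> float:
    """The value that would be passed on as the threshold flow variable."""
    return min(model.rmse for model in models)


def select_by_threshold(models: List[FeatureSet], threshold: float) -> Optional[FeatureSet]:
    """Pick a feature set whose error does not exceed the threshold.

    Assumption: among qualifying sets, the one with the fewest features wins.
    """
    candidates = [model for model in models if model.rmse <= threshold]
    if not candidates:
        return None
    return min(candidates, key=lambda model: model.nr_features)


# Made-up example scores, for illustration only.
models = [
    FeatureSet(5, 0.42, ["a", "b", "c", "d", "e"]),
    FeatureSet(4, 0.37, ["a", "b", "c", "d"]),
    FeatureSet(3, 0.51, ["a", "b", "c"]),
]
best = select_by_threshold(models, smallest_rmse(models))
print(best.features)  # the 4-feature set with the lowest RMSE
```

Because the threshold equals the minimum RMSE, only the best-scoring set (or sets tied with it) can qualify, which is why this mode ends up picking the intended model.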

Cheers,

Adrian

Hello you two,

thanks for your replies, that was so fast!

@nemad: I went for the first option, and that seems to work! Great help :)

Regards
Ute
