Correlation filter

Good evening,

I want to filter out highly correlated variables before using them in a random forest model. After the Reader and Normalizer nodes, I am using the Correlation Filter node to exclude variables with an absolute correlation above 0.80, and then connecting its output to the Partitioner node. However, the Correlation Filter states “The dialog cannot be opened for the following reasons: no input available” (attached).

Could it be a problem with the version? I am using KNIME 4.1.3 on a Windows 10 PC.

Thank you,
Marc

Hi @MarcB -

You also need a Linear Correlation node to feed into the Correlation Filter. See the following workflow for an example:

https://kni.me/w/IrShWACT3g-uBOak


Thank you, Scott. However, a new message now states that “some columns in the model are not contained in the input table or incompatible: SPON2”. For the Linear Correlation node I selected only the quantitative variables (“D”), excluding strings and integers (one variable of each type).

That’s because you have no input data going into the Correlation Filter node.

As stated in the node description, that node requires 2 inputs (the correlation model from the Linear Correlation node and the data table to be filtered):
image

The example workflow that @ScottF posted illustrates this.

Here’s a simpler arrangement, just to get the point across.
image
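
If it helps to see the same two-step idea outside KNIME, here is a minimal Python sketch. It is not KNIME's implementation: the function names, the toy table, and the simple "keep the first column of each correlated pair" rule are all assumptions made for illustration. The point is only that the filter step needs both the correlation model and the data table, and that it complains when the model mentions a column the table does not have.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # assumption: treat constant columns as uncorrelated
    return cov / sqrt(vx * vy)


def correlation_model(table):
    """Step 1 (role of a Linear Correlation step): pairwise correlations."""
    cols = list(table)
    return {
        (a, b): pearson(table[a], table[b])
        for i, a in enumerate(cols)
        for b in cols[i + 1:]
    }


def correlation_filter(table, model, threshold=0.8):
    """Step 2 (role of a Correlation Filter step): needs model AND table."""
    model_cols = {name for pair in model for name in pair}
    missing = sorted(model_cols - set(table))
    if missing:
        raise ValueError(f"columns in the model but not in the table: {missing}")
    dropped = set()
    for (a, b), r in model.items():
        # Simplified rule (an assumption): keep the first column of each
        # highly correlated pair and drop the second.
        if abs(r) > threshold and a not in dropped and b not in dropped:
            dropped.add(b)
    return {name: vals for name, vals in table.items() if name not in dropped}


# Hypothetical toy data: x2 is almost exactly 2 * x1.
table = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.1, 3.9, 6.2, 8.0, 9.9],
    "x3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
model = correlation_model(table)
filtered = correlation_filter(table, model, threshold=0.8)
print(list(filtered))  # ['x1', 'x3']
```

Calling `correlation_filter` with a table that lacks one of the model's columns raises the `ValueError`, which is roughly the situation behind the message about SPON2.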


Ok, thank you!
Best regards,
Marc
