How does the KNIME Correlation Filter work?

Hi, I have just started using KNIME and tried the Correlation Filter node to streamline my data.

I am curious how the node actually interprets the threshold value given to it when filtering out data.

Thanks

Hi @Tseraga,

The Correlation Filter node works iteratively: in each round it looks for the column that has the most correlated columns. The threshold you set in the configuration dialog determines what counts as a “correlated column”. That is, two columns are said to be correlated if their correlation coefficient is greater than the threshold.
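
To make the iterative procedure concrete, here is a rough sketch in Python. It is not the node's actual implementation, only one reading of the description above. The function name `correlation_filter` is made up for illustration, and three details are assumptions: the absolute value of the coefficient is compared against the threshold, the selected column is kept while its correlated partners are dropped, and ties are broken by column order.

```python
import numpy as np

def correlation_filter(corr, names, threshold):
    """Greedy sketch: repeatedly pick the column with the most correlated
    partners, keep it, and drop those partners."""
    corr = np.asarray(corr, dtype=float)
    remaining = list(range(len(names)))
    kept = []
    while remaining:
        # Recount partners among the columns that are still in play.
        # Assumption: the absolute correlation is compared to the threshold.
        partners = {
            i: [j for j in remaining
                if j != i and abs(corr[i, j]) > threshold]
            for i in remaining
        }
        # Assumption: ties go to the first column in the original order.
        best = max(remaining, key=lambda i: len(partners[i]))
        if not partners[best]:
            # No correlated pairs are left, so keep everything that remains.
            kept.extend(remaining)
            break
        kept.append(best)
        dropped = set(partners[best])
        remaining = [i for i in remaining if i != best and i not in dropped]
    return [names[i] for i in sorted(kept)]

names = ["A", "B", "C", "D"]
corr = [
    [1.00, 0.90, 0.85, 0.10],
    [0.90, 1.00, 0.50, 0.10],
    [0.85, 0.50, 1.00, 0.10],
    [0.10, 0.10, 0.10, 1.00],
]
result = correlation_filter(corr, names, threshold=0.8)
print(result)  # ['A', 'D']
```

In this toy matrix, A is correlated with both B and C, so it is picked first and B and C are removed. D has no correlated partner and stays. Note that the sketch counts correlated columns rather than summing coefficients, and it recounts after every round, so columns removed earlier no longer influence later rounds.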

If you are interested in an example, take a look at https://workflows.knime.com/knime/hub/workflows/50_Applications%3A21_Model_Selection_and_Management%3A01_Model_Selection_Sampled.

Best,
Stefan

Hi,
What are the filtering criteria besides the threshold? For example, does the node sum up all correlations for each row (that is, for each property) and, for two properties with a correlation above the threshold, eliminate the one with the higher number of correlations (the sum for its row), since it is more correlated with the others? And if it eliminates properties that way, does this affect the subsequent rows, so that properties that were correlated with the eliminated property are kept because that property is gone?