Feature selection visualization

Nilou · September 29, 2022, 9:25am

Hi,

Is there a way to show global feature importance for regression models as well, like the component that is provided for classifiers?
If not, a related question:
Can we add a visualization node after the “Feature Selection Filter” node to see how much each feature changes the score?

rsrudd · October 5, 2022, 8:48pm

Hello @Nilou,

I would suggest using a feature selection loop. I’m also curious which dataset you are using and whether you have done any preprocessing to identify and remove correlated or constant features. Feature selection loops can sometimes take a long time when used properly, so another recommendation is to use a Regression Tree model in KNIME, as it will automatically compute feature importance for you. The attached workflow illustrates how one can use a regression tree to find feature importance:
Feature_Selection_Using_Regression_Forest.knar.knwf (51.8 KB)
This workflow uses the following strategies:

Using a try-catch paradigm to deal with the issue you have of loops breaking because there isn’t sufficient data (it is better to check for this beforehand and remove such cases).
Using a noise column to assess which columns perform no better than chance (i.e., don’t add much to our predictions).
Collecting the results of each loop in a group labeled by its drug. Notice that the drug name and dose come first to mark each group, then the R^2 and other scores, and finally a list of features ranked by importance.
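
For readers who want to try the noise-column idea outside of KNIME, here is a minimal sketch in Python. It is not the attached workflow: it assumes scikit-learn's RandomForestRegressor and its impurity-based feature_importances_ as a stand-in for the importance scores the KNIME learner reports, and the synthetic data, the column names and the helper rank_against_noise are all made up for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def rank_against_noise(X, y, names, seed=0):
    """Hypothetical helper: rank features and keep those that beat a noise column."""
    rng = np.random.default_rng(seed)
    # Append one column of pure Gaussian noise; it carries no information about y.
    noise = rng.normal(size=(X.shape[0], 1))
    X_aug = np.hstack([X, noise])

    forest = RandomForestRegressor(n_estimators=200, random_state=seed)
    forest.fit(X_aug, y)

    # Impurity-based importances, one per column, summing to 1.
    importances = dict(zip(list(names) + ["noise"], forest.feature_importances_))
    threshold = importances["noise"]

    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    # A feature that scores no higher than the noise column adds little to the predictions.
    keep = [name for name, imp in ranked if name != "noise" and imp > threshold]
    return ranked, keep


# Synthetic example (assumed data): y depends on the first two columns only.
data_rng = np.random.default_rng(42)
X = data_rng.normal(size=(300, 3))
y = 3.0 * X[:, 0] + 1.5 * X[:, 1] + data_rng.normal(scale=0.1, size=300)

ranked, keep = rank_against_noise(X, y, ["strong", "weak", "irrelevant"])
for name, imp in ranked:
    print(f"{name:12s} {imp:.3f}")
print("features above the noise column:", keep)
```

A single noise column gives a fairly rough threshold, so a feature that lands close to it may fall on either side from run to run. Repeating the fit with several seeds and comparing the rankings is one way to make the cut-off less arbitrary.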

I don’t believe we have any dedicated components for feature importance for regression models.

Hopefully this is helpful,
Regards,
Ryan
