How to use Feature Selection

sireeshapulipati · February 21, 2012, 7:20am

I'm a KNIME beginner and can't figure out how to use the Feature Selection meta nodes.

Can anyone please explain the structure: how to use those nodes (in what order, how to configure them, etc.)?

I'm trying to use it for classification using Logistic Regression.

thor · February 21, 2012, 12:29pm

If you drag and drop the feature elimination metanode from the repository into your workflow, it is already configured for Naive Bayes. First you need to replace the learner and predictor with the ones for logistic regression. Then connect the training data to the first input port of the meta node, and connect the dataset from which you want to filter out features in the end to the second input port. When you execute the loop inside the metanode, it creates a feature elimination model. Once the loop is finished, you can select a set of features in the filter node.
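As a rough illustration of what such an elimination loop does conceptually, here is a minimal Python sketch of backward feature elimination. It is not KNIME's implementation: the nearest-centroid learner and predictor are stand-ins for whatever learner/predictor pair sits inside the loop (Naive Bayes or logistic regression), and all function names and the toy data are made up for this example.

```python
from statistics import mean

def train_centroids(rows, labels, features):
    """Stand-in learner: one mean vector per class over the chosen features."""
    model = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        model[cls] = [mean(r[f] for r in members) for f in features]
    return model

def predict(model, row, features):
    """Stand-in predictor: pick the class with the closest centroid."""
    def dist(centroid):
        return sum((row[f] - c) ** 2 for f, c in zip(features, centroid))
    return min(sorted(model), key=lambda cls: dist(model[cls]))

def error_rate(train, train_y, test, test_y, features):
    """Train on the given features and return the fraction of wrong predictions."""
    model = train_centroids(train, train_y, features)
    wrong = sum(predict(model, r, features) != y for r, y in zip(test, test_y))
    return wrong / len(test)

def backward_elimination(train, train_y, test, test_y, features):
    """Return [(feature_list, error), ...], one entry per feature-set size."""
    current = list(features)
    results = [(list(current), error_rate(train, train_y, test, test_y, current))]
    while len(current) > 1:
        # Try dropping each remaining feature; keep the drop that hurts least.
        candidates = []
        for f in current:
            subset = [g for g in current if g != f]
            err = error_rate(train, train_y, test, test_y, subset)
            candidates.append((err, f, subset))
        err, _dropped, current = min(candidates, key=lambda c: (c[0], c[1]))
        results.append((list(current), err))
    return results

# Toy data (invented): "a" separates the classes, "b" misleads, "c" is constant.
train = [{"a": 0.0, "b": 0.0, "c": 1.0}, {"a": 0.2, "b": 2.0, "c": 1.0},
         {"a": 0.8, "b": 2.0, "c": 1.0}, {"a": 1.0, "b": 4.0, "c": 1.0}]
train_y = [0, 0, 1, 1]
test = [{"a": 0.1, "b": 3.0, "c": 1.0}, {"a": 0.9, "b": 3.0, "c": 1.0},
        {"a": 0.0, "b": 1.0, "c": 1.0}, {"a": 1.0, "b": 1.0, "c": 1.0}]
test_y = [0, 1, 0, 1]

results = backward_elimination(train, train_y, test, test_y, ["a", "b", "c"])
for feats, err in results:
    print(len(feats), feats, err)
```

Each pass retrains once per remaining feature, so the cost grows roughly quadratically with the number of columns. That is why the loop can take a while on wide tables.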

richards99 · February 21, 2012, 12:43pm

There is a workflow to show you how to use this.

Go to the KNIME Example Workflow Server panel on the right and connect to the server. The workflow in 002_DataMining called Feature Elimination may help.

Simon.

sireeshapulipati · February 21, 2012, 3:19pm

Thanks Thor and Richards.

Excuse my ignorance, but is this meta-node used for dimension reduction to improve the accuracy of the predictions?

I'm trying to do the following:

I've classified a dataset using Logistic Regression with all the columns and got some accuracy. Now I want to perform dimension reduction on the dataset (so that I predict using only a subset of columns) using PCA or something similar.

Does the Feature Elimination node enable me to do the same?

I followed your instructions and connected the training data and test data to the two input ports. But what do I connect to the output ports?

thor · February 22, 2012, 3:37pm

Feature Elimination creates a list of feature sets with their corresponding prediction errors. In the filter node you can select one of these feature sets.

The first output of the meta node is just a table containing the different feature sets and their error rates. The second output is the same as the second input table, but with only the features/columns that you selected in the filter node. This is the dataset you want to use from then on.
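To make the two outputs concrete, here is a small Python sketch of the idea, not of the node itself. The error table, the column names, the `label` target column, and the threshold rule (smallest feature set whose error is at or below a limit) are all assumptions made for illustration. The filter node's own selection options may behave differently.

```python
def pick_features(error_table, threshold):
    """Pick the smallest feature set with error <= threshold.

    Falls back to the lowest-error set if none meets the threshold.
    """
    ok = [(feats, err) for feats, err in error_table if err <= threshold]
    if ok:
        return min(ok, key=lambda fe: len(fe[0]))[0]
    return min(error_table, key=lambda fe: fe[1])[0]

def filter_columns(rows, keep, always_keep=("label",)):
    """Return the rows with only the selected columns plus the target column."""
    wanted = set(keep) | set(always_keep)
    return [{k: v for k, v in row.items() if k in wanted} for row in rows]

# Hypothetical "first output": feature sets and their error rates.
error_table = [
    (["age", "income", "zip"], 0.12),
    (["age", "income"], 0.10),
    (["age"], 0.25),
]

# Hypothetical "second input": the table you want to reduce.
second_input = [
    {"age": 30, "income": 50, "zip": 123, "label": "yes"},
    {"age": 45, "income": 20, "zip": 456, "label": "no"},
]

selected = pick_features(error_table, threshold=0.15)
second_output = filter_columns(second_input, selected)
print(selected)
print(second_output)
```

With a rule like this, a looser threshold trades some accuracy for fewer columns. No single threshold value is correct in general; it depends on how much error you are willing to accept.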

Gabriel_Cornejo · March 7, 2015, 4:20am

Hello everyone,
I have carefully read what has been written here about the feature elimination method. I took the example and put it into a workflow, but there is something I do not understand: how do I make the selection of attributes automatic? I set a threshold, but I do not know which value is correct for this threshold. Can you help me?

Thank you very much.

Gabriel

ipazin · December 5, 2019, 12:41pm

A post was split to a new topic: Feature Selection Start/End workflow examples