Loop over input/column sets


I'm trying to build some prediction models using linear regression and neural networks, and I was wondering whether there is a way to loop over sets of input columns.

The idea is to test every combination of input parameters to find the one that gives the most accurate model.


To predict A with B, C and D available, I'd like to build the following prediction models:

A = f(B), A = f(C), A = f(D), A = f(B,C), A = f(B,D), A = f(C,D), A = f(B,C,D)


It sounds like a kind of "brute force" method, but I'm new to KNIME and data mining in general, and I'm trying things out to see what is possible. So I'm also open to any alternative (nicer) methods.

Thanks a lot
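For reference, the brute-force idea from the question could be sketched outside KNIME in plain Python roughly as follows. This is only an illustration under stated assumptions: the data values, the even/odd train/test split, and the helper names `ols_fit`, `mse` and `evaluate_subsets` are all made up for the example, and the tiny least-squares solver assumes the input columns are not collinear. The error is measured on held-out rows, because on the training rows the largest input set would always look at least as good as any smaller one.

```python
from itertools import combinations

def ols_fit(rows, y):
    """Least-squares fit with intercept, solved by Gauss-Jordan elimination
    on the normal equations. Toy solver: assumes non-collinear columns."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * t for x, t in zip(X, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

def mse(coef, rows, y):
    """Mean squared error of the fitted model on the given rows."""
    preds = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)

def evaluate_subsets(data, target, inputs):
    """Fit one model per non-empty subset of inputs; return {subset: test MSE}."""
    n = len(data[target])
    train, test = range(0, n, 2), range(1, n, 2)  # assumed split: even/odd rows
    results = {}
    for size in range(1, len(inputs) + 1):
        for subset in combinations(inputs, size):
            def rows(idx):
                return [[data[col][i] for col in subset] for i in idx]
            coef = ols_fit(rows(train), [data[target][i] for i in train])
            results[subset] = mse(coef, rows(test), [data[target][i] for i in test])
    return results

# Made-up data: A depends only on B; C and D are unrelated patterns.
B = list(range(12))
data = {"B": B,
        "C": [i * i % 7 for i in B],
        "D": [3 * i % 5 for i in B],
        "A": [3 * b + 1 for b in B]}

results = evaluate_subsets(data, "A", ["B", "C", "D"])
best = min(results, key=results.get)  # subset with the lowest held-out error
```

With three inputs this fits 7 models; with n inputs it fits 2^n - 1, which is why the exhaustive approach stops being practical beyond a modest number of columns.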

You may wish to try the Backward Feature Elimination metanode to find the best combination of variables.

Totally agree with Boraster. The Backward Feature Elimination metanode is what you want.

I used it for exactly the purpose you describe two days ago. It's a brilliant metanode!

Inside the metanode, replace the Learner and Predictor nodes with the Linear Regression versions that you are using.
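To illustrate the general idea behind backward feature elimination (not KNIME's actual implementation), here is a minimal Python sketch. It starts with all features and, at each level, drops the one whose removal hurts the error least. The `score` callback and the `toy_score` function with its numbers are hypothetical stand-ins for "train the learner on this subset and measure the prediction error".

```python
def backward_elimination(features, score):
    """Greedy backward elimination.

    score(subset) must return an error value (lower is better).
    Returns a list of (subset, error), one entry per level, largest subset first.
    """
    current = tuple(features)
    history = [(current, score(current))]
    while len(current) > 1:
        # Try dropping each remaining feature; keep the drop that hurts least.
        candidates = [tuple(f for f in current if f != drop) for drop in current]
        current = min(candidates, key=score)
        history.append((current, score(current)))
    return history

def toy_score(subset):
    """Made-up error: B is very informative, C a little, D only adds noise."""
    error = 10.0
    if "B" in subset:
        error -= 8.0
    if "C" in subset:
        error -= 1.0
    if "D" in subset:
        error += 0.5
    return error

history = backward_elimination(["B", "C", "D"], toy_score)
best_subset, best_error = min(history, key=lambda item: item[1])
```

Compared with testing every combination, this greedy search looks at roughly n²/2 subsets instead of 2^n - 1, at the cost of possibly missing the overall best combination.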



Many thanks for the advice. That worked pretty well.

I have one more question regarding this node.

I tried to create a loop to use the Backward Feature Elimination Filter on several data sets of different lengths (same columns but different numbers of observations). Because the "squared error" used by the Backward Feature Elimination is a sum of squared errors, its value depends heavily on the length of the data set, which makes it difficult to define a suitable automatic threshold for such a loop.

Is there a way either to replace the "squared error" with, for example, the mean squared error, or to define a threshold that depends on the length of the data set?
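The arithmetic behind this question can be shown with a small Python sketch (made-up numbers, and not tied to any particular KNIME node). With the same error per row, the sum of squared errors grows in proportion to the number of rows, while the mean squared error stays constant. The two options in the question are therefore equivalent: a fixed threshold on the mean corresponds to a threshold on the sum that is scaled by the row count.

```python
def sum_squared_error(actual, predicted):
    """Sum of squared differences; grows with the number of rows."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

def mean_squared_error(actual, predicted):
    """Sum of squared differences divided by the number of rows."""
    return sum_squared_error(actual, predicted) / len(actual)

# Two made-up data sets with the same per-row error (0.5) but different lengths.
small_actual, small_pred = [1.0] * 10, [1.5] * 10
large_actual, large_pred = [1.0] * 1000, [1.5] * 1000

sse_small = sum_squared_error(small_actual, small_pred)    # 10 rows
sse_large = sum_squared_error(large_actual, large_pred)    # 1000 rows
mse_small = mean_squared_error(small_actual, small_pred)
mse_large = mean_squared_error(large_actual, large_pred)

# A length-dependent threshold on the sum, derived from one fixed
# threshold on the mean (the value 0.3 is an arbitrary assumption).
mse_threshold = 0.3
sse_threshold_small = mse_threshold * len(small_actual)
sse_threshold_large = mse_threshold * len(large_actual)
```

Here the sums differ by a factor of 100 while the means are identical, so a single `mse_threshold` works for both data sets.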