In regression, select input columns using a list

Hi Folks,

Trying to understand how to dynamically select the input columns in the predictive tool. Every time the workflow runs, the target variable will be the same, so I can just select that. On each run, though, the columns I want to use will change. I have the list of column names to use being read in each time (in a column called ‘var’). I’m trying to convert this to a variable to pass to the learner so it strictly only uses columns in that list. So far no joy. Any ideas on how to achieve this?

Thanks

@shughes2026 you can use a regex pattern and feed it to the column selection in the Regression Learner, or anywhere else you want to select specific columns.

^(education-num|marital-status|occupation|relationship|race|sex|hours-per-week)$
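If typing the pattern by hand is the sticking point, it can be generated from the list of names. Below is a minimal Python sketch of that idea, written outside KNIME; `build_column_regex` is a hypothetical helper name, and it assumes the names from the ‘var’ column are already available as a plain list of strings. How the resulting string gets into a flow variable depends on your workflow.

```python
import re


def build_column_regex(column_names):
    """Build an anchored alternation that matches exactly the given names."""
    if not column_names:
        raise ValueError("column_names must contain at least one name")
    # Escape each name so characters such as '-' or '.' are matched literally.
    escaped = [re.escape(name) for name in column_names]
    # ^( ... )$ forces a whole-name match, so 'sex' does not match 'sex-code'.
    return "^(" + "|".join(escaped) + ")$"


# Names as they might arrive from the 'var' column (illustrative values).
names = ["education-num", "marital-status", "hours-per-week", "sex"]
pattern = build_column_regex(names)

# Apply the pattern to a candidate list of table columns.
all_columns = ["age", "education-num", "sex", "sex-code", "income"]
selected = [c for c in all_columns if re.fullmatch(pattern, c)]
print(selected)
```

Escaping matters once a column name contains a character that is special in regex, such as a dot or parentheses. The escaped form may look slightly different from the hand-written pattern above, but it is meant to match the same names.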


@mlauber71’s approach is good. Just to prove that in KNIME there are usually multiple ways to accomplish something, here’s an alternative. It’s slightly less labor intensive, since it doesn’t require constructing the regex. The workflow is just a stub to show how to pass the selected columns as flow variables.
Linear Regression Learner w Flow Variables.knwf (71.7 KB)



Thanks both - really appreciated. Decided to park the dark arts of regex for another day, a bit too much for me :wink: The node-based workflow baffled me at first, as it’s the same way I’d built mine. However, on digging into your example, I found what I’d missed: I didn’t realise I needed to group the values as a ‘set’. I was just passing the raw values through, and it was picking only the first one. Really appreciate the help - it’s dug me out of a hole. Thanks!

