Bug in Feature Selection Loop Start

Hi, I noticed a bug when you use Wildcard/Regex Selection in the Feature Selection Loop Start 1:1 node, as opposed to Manual Selection, with the Backward Feature Elimination strategy selected. When the loop runs with the Feature Selection Loop End, the number of columns (those added to the right panel by the RegEx in the Loop Start node) does not decrease on each loop iteration.
Thanks,
Simon.

Hi @richards99,

Thank you for sharing your feedback.

May I ask which version you are using?
I tried backward feature elimination in the feature selection loop with the latest AP version (5.2.3), and it does work. For me, the first iteration starts with all the included columns; the next iterations eliminate 1 column, then 2 columns, and so on.
Would you please explain the behaviour you are observing?
If you have more than 1 column included and no threshold defined, the second iteration should have 1 column fewer than the first iteration. If that’s not the case for you, please let me know.
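
For reference, the expected column counts can be sketched in Python. This is only an illustration of greedy backward elimination, not KNIME's implementation: the function `backward_elimination`, the scoring function `toy_error`, and the column names are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def backward_elimination(
    features: Sequence[str],
    score: Callable[[Sequence[str]], float],
) -> List[Tuple[str, ...]]:
    """Return every feature subset evaluated, in evaluation order."""
    current = list(features)
    evaluated = [tuple(current)]  # first iteration: all included columns
    while len(current) > 1:
        # Try dropping each remaining feature once.
        candidates = [[f for f in current if f != dropped] for dropped in current]
        evaluated.extend(tuple(c) for c in candidates)
        # Continue with the candidate that has the lowest error.
        current = min(candidates, key=score)
    return evaluated


def toy_error(subset: Sequence[str]) -> float:
    """Hypothetical stand-in for the error of a trained model."""
    return float(len(subset))


subsets = backward_elimination(["ABC1", "ABC2", "ABC3"], toy_error)
sizes = [len(s) for s in subsets]
print(sizes)
```

With three included columns, the subset sizes start at 3, then drop to 2, then to 1. In the behaviour reported in this thread, the size would instead stay at 3 on every iteration.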

Interesting, this is on 5.2.3, on macOS.
I have just tried this again and cannot repeat the behaviour I saw earlier.
I will have a play around again today, to see if I can reproduce this.

Aha, just managed to reproduce it.
There are a few ways this bug comes up, but here is one method that reproduces it every time.
Set up the node for Backward Feature Elimination, and set up the downstream model, e.g. a Linear Regression Learner and Predictor, with the target column and variable columns.
In the Loop Start, use RegEx to filter your desired columns into the right panel, e.g. ABC.+, and run it. It will run fine.
Now go back to the Loop Start node, choose Manual Selection, add columns to the right panel including your target column, and run it. It will obviously fail on the Learner node at some point, as the target column is removed.
Now go back to the Loop Start node again, switch back to RegEx column filtering, and run it. You will find that it no longer filters out the columns one by one.
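
To make it clear which columns a pattern like ABC.+ should pick up, here is a rough Python sketch. It assumes the pattern has to match the whole column name, which is my assumption about how the node applies it; `select_columns` and the column names are made up for illustration.

```python
import re
from typing import List, Sequence


def select_columns(columns: Sequence[str], pattern: str) -> List[str]:
    """Keep the columns whose full name matches the regular expression."""
    rx = re.compile(pattern)
    return [c for c in columns if rx.fullmatch(c)]


# Hypothetical table: three feature columns plus a target and an unrelated column.
columns = ["ABC1", "ABC2", "ABC_x", "target", "other"]
selected = select_columns(columns, r"ABC.+")
print(selected)
```

The target column does not match the pattern, so it stays out of the elimination set. That is why the RegEx run works in the first step, while the Manual Selection run that includes the target fails.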

Dear @richards99,

I can now reproduce the issue and have created a ticket for it: AP-22279.

Thank you for reporting this and providing the clear instructions to replicate the problem.
