I'm completely new to KNIME. Please, I need help on how to model this:
I have a huge number of observations, and groups of 1-20 observations share the same key in the first column. Each observation has 4 more numeric attributes and one class/target column, which marks the 'winner' for each key.
So for each key there are 1-20 candidates, and one of them is marked as the winner. I need to understand how the 4 numeric attributes influence 'being a winner'.
If I just feed this data into a Decision Tree Learner, it doesn't take into account that there is exactly one winner for each key.
I could take each set of 1-20 candidates and transform it into one long row of data, but this feels pretty ugly. Maybe I could also create a separate data table for each key. Is there any better way?
Just offhand, I would try something like:
Group Loop Start > Naive Bayes Learner > Pivot > Math Formula > Loop End.
The idea is that you calculate the difference between the means of your winners and losers (+ some number of SDs). Larger values would be more significant, but I would leave a rigorous way of choosing which features are significant up to you.
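To make the idea concrete outside of KNIME, here is a rough Python sketch for a single attribute. It is only one possible reading of the suggestion: for each key it takes the winner's value minus the mean of the losers, then divides the average of those differences by their standard deviation to get a rough score. The toy rows, the function name winner_minus_loser_mean and the final score are all made up for illustration and are not part of any KNIME node.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical toy rows for ONE attribute: (key, attribute value, is_winner)
rows = [
    ("a", 5.0, True), ("a", 3.0, False), ("a", 1.0, False),
    ("b", 4.0, True), ("b", 3.0, False), ("b", 3.0, False),
    ("c", 2.0, True), ("c", 1.0, False),
]

def winner_minus_loser_mean(rows):
    """Per key: winner's value minus the mean of the losers' values."""
    groups = defaultdict(lambda: {"win": [], "lose": []})
    for key, value, is_winner in rows:
        groups[key]["win" if is_winner else "lose"].append(value)

    diffs = {}
    for key, g in groups.items():
        # Skip keys without exactly one winner or without any loser.
        if len(g["win"]) == 1 and g["lose"]:
            diffs[key] = g["win"][0] - mean(g["lose"])
    return diffs

diffs = winner_minus_loser_mean(rows)
values = list(diffs.values())

# Assumed scoring rule: mean difference in units of its own SD.
# Larger absolute scores suggest a stronger link to winning.
score = mean(values) / stdev(values)
print(diffs, round(score, 3))
```

Repeating this for each of the 4 attributes gives scores that can be compared with each other, but it does not replace a proper significance test.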
An alternative would be to use a similar loop, but with a single-sample t-test where you override the test value with the value from the winner and use the losers as your input table.
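As a rough illustration of the t-test variant, the Python sketch below computes a one-sample t statistic per key for one attribute: the losers are the sample and the winner's value is the test value. The function name one_sample_t and the toy data are hypothetical. It only returns the t statistic, so a p-value would still need a t distribution with n - 1 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(losers, winner_value):
    """t statistic for H0: the losers' mean equals the winner's value.

    Returns None when the statistic is undefined (fewer than two
    losers, or losers with zero variance).
    """
    n = len(losers)
    if n < 2:
        return None
    sd = stdev(losers)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        return None
    return (mean(losers) - winner_value) / (sd / sqrt(n))

# Hypothetical toy data for ONE attribute, grouped by key.
candidates = {
    "a": {"winner": 5.0, "losers": [3.0, 1.0, 2.0]},
    "b": {"winner": 4.0, "losers": [3.0]},  # too few losers to test
}

t_by_key = {
    key: one_sample_t(group["losers"], group["winner"])
    for key, group in candidates.items()
}
print(t_by_key)
```

Keys with only one or two candidates cannot be tested this way, which matters here because a key may have as few as 1 candidate.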
That being said, take all of this with a grain of salt, as I don't know enough about your application to know whether either of these approaches would suit your needs. Hopefully this will at least point you in a useful direction.
Thank you!!! I will try this. Actually, this sounds harder than I thought it would be.
So far I have been trying a model using a decision tree or an MLP.