Hi folks, I am struggling to get a regression to work with KNIME.
I have a data table that contains 10 columns of input data (some are text, some are numbers), and the expected output is another 6 columns.
The catch is that these 6 columns have to sum up to 100% for each row.
I am basically allocating a certain spend across 6 different buckets based on 10 input variables. What I want to achieve is a model that can predict this percentage split of new data items based on training on the dataset that I have with historical allocations.
Can you point me in the right direction? All the examples I could find only predict one column, and they don't mix text and numbers in the input columns.
Any help is appreciated!
Here you can find an example workflow where I implemented a multi-target prediction.
Kindest regards, Iris
Any idea how I could ensure that the predicted values in each row sum to 100%? This is a hard requirement of the model.
Not really a good one...
You could predict five of the columns and set the last one to 100 minus the sum of the others...
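To make that idea concrete, here is a minimal sketch in plain Python (it could, for instance, be adapted to a scripting node). The function names `fill_last_bucket` and `renormalize` and the sample numbers are made up for illustration and are not part of the example workflow. The second function is a different technique from the one suggested above: it clips negative predictions to zero and rescales all six values, which avoids the case where the "remainder" bucket comes out negative.

```python
def fill_last_bucket(first_five):
    """Given predictions for five buckets, set the sixth to 100 minus their sum."""
    return list(first_five) + [100.0 - sum(first_five)]


def renormalize(all_six):
    """Alternative: clip negatives to zero, then rescale so the row sums to 100."""
    clipped = [max(0.0, v) for v in all_six]
    total = sum(clipped)
    if total == 0.0:
        # Every prediction was non-positive: fall back to an even split.
        return [100.0 / len(clipped)] * len(clipped)
    return [100.0 * v / total for v in clipped]


# Hypothetical predictions for one row (percent of spend per bucket).
row = fill_last_bucket([10.0, 20.0, 30.0, 5.0, 15.0])
print(row, sum(row))

# Six raw predictions that overshoot 100; rescaling brings the row back to 100.
print(renormalize([12.0, 25.0, 33.0, 4.0, 18.0, 28.0]))
```

Note that the first approach puts all of the prediction error into the last bucket, so it can go below 0 or above 100 if the other five are off. The rescaling variant spreads the correction across all buckets instead.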
Ok, that's not good news :-(
One other thing... one of the columns in my regressions (and classifications, for that matter) is a free-text string. I would like to use the information in the text to help my regression performance, alongside the other numerical values.
For instance: my brain knows that there's a high chance that the class for a given row will be "A" if the person wrote "ipsum loren" somewhere in the text.
Is there any way that my trees and neural networks could try to figure that out too? (while still using the other numeric inputs)
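One common way to approach this is to turn the free text into numeric columns (a "bag of words": one 0/1 column per term) and append those columns to the other numeric inputs, so that a tree or a neural network sees everything as numbers. Below is a rough sketch of that encoding in plain Python. The helper names (`tokenize`, `build_vocabulary`, `encode_row`) and the two sample rows are invented for illustration; a real setup would likely want a proper text-processing step and some pruning of rare terms.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocabulary(texts, min_count=1):
    """Collect every term that appears in at least `min_count` texts."""
    counts = {}
    for text in texts:
        for token in set(tokenize(text)):
            counts[token] = counts.get(token, 0) + 1
    return sorted(term for term, count in counts.items() if count >= min_count)


def encode_row(numeric_values, text, vocabulary):
    """Append one 0/1 flag per vocabulary term to the numeric inputs."""
    tokens = set(tokenize(text))
    flags = [1.0 if term in tokens else 0.0 for term in vocabulary]
    return list(numeric_values) + flags


# Hypothetical rows: (numeric inputs, free-text column).
rows = [
    ([120.0, 3.0], "please note ipsum loren applies here"),
    ([80.0, 1.0], "standard request"),
]

vocab = build_vocabulary([text for _, text in rows])
features = [encode_row(nums, text, vocab) for nums, text in rows]

print(vocab)
for feature_row in features:
    print(feature_row)
```

With this encoding, the presence of "ipsum" or "loren" becomes an ordinary input column that the learner can split on or weight, together with the numeric columns.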