I reproduced the sentiment classification example with a slight change in the model. I used the Bagging metanode with the decision tree algorithm, and everything seems perfect. The accuracy is close to 100%, but I would like to know how to classify real data that has no sentiment column.
Whenever I try to add just a user comment without a sentiment column, the model makes a wrong prediction. On the other hand, with the sentiment column there is always 100% prediction accuracy.
Maybe I am doing something wrong, because it has only been about a month since I started using KNIME and getting into the data analysis world.
Any ideas and suggestions will be highly appreciated.
Of course the prediction is perfect. The learner adapts to the data, and the data contains one column that predicts the sentiment perfectly, because it is the sentiment itself! So if you strip it off, you will start getting real results. Which, as you have noted, may be depressingly bad in real-world problems, especially at the beginning.
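To see why the leaked column makes the score look perfect, here is a minimal sketch in plain Python (not KNIME). It trains a one-level decision stump on a toy table in which one feature, here called "leak" (a made-up name for illustration), is an exact copy of the label. With that column available the stump scores 100%; without it, accuracy falls back to the real signal in the data.

```python
import random

random.seed(0)

# Toy data set: one weak feature, one useless feature, and "leak",
# which is a copy of the sentiment label - this mimics accidentally
# leaving the target column among the input columns.
rows = []
for _ in range(200):
    label = random.choice([0, 1])
    noisy = label if random.random() < 0.7 else 1 - label  # weak signal
    junk = random.choice([0, 1])                           # no signal
    rows.append({"noisy": noisy, "junk": junk, "leak": label, "label": label})

def best_stump(rows, features):
    """One-level 'decision tree': pick the single feature whose value
    best predicts the label on the training rows."""
    best = None
    for f in features:
        correct = sum(r[f] == r["label"] for r in rows)
        acc = max(correct, len(rows) - correct) / len(rows)
        if best is None or acc > best[1]:
            best = (f, acc)
    return best

# With the leaked column available, the stump picks it and is perfect.
print(best_stump(rows, ["noisy", "junk", "leak"]))  # ('leak', 1.0)
# Without it, accuracy drops to whatever real signal is left.
print(best_stump(rows, ["noisy", "junk"]))
```

The same thing happens with any learner, bagged or not: if the target sneaks in as a feature, the model simply reads the answer off that column.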
Then, teaching and then testing the model on the same data set with partitioning is actually not healthy at all?
It is healthy, as long as your learner knows what is input data and what is the expected result.
You never learn and test on the same data set; that would be learning by heart. And you cannot build a predictor without a target column (which is the sentiment here); without one it would be an unsupervised learner. However, including the sentiment in the unseen data table would not change the prediction result, as it would not be used by the predictor.
It seems you configured the Decision Tree Learner incorrectly. The target must be the sentiment column; then you can predict unseen data.
PS: I moved your topic to the textprocessing forum