I'm using KNIME for the first time and I have a question about it.
I have an Excel data file with 3 columns (Included, sentences, classlabel). Each row of the sentences column contains one sentence.
The Included column contains the value "abstract" or "introduction" (where the sentence appeared in an article), and the classlabel column contains one of 5 category values.
I have to process this file and build a decision tree model to predict classlabel in a supervised way.
So I have used these nodes one after another: read file, string to document (to transform sentences to document cells), bag of words creator, punctuation erasure, stop word filter, porter stemmer, case converter, tf, idf, icf, category to class, X-partitioner, decision tree learner, decision tree predictor, X-Aggregator and scorer nodes.
Now, the accuracy is 0.58 and error rates are about 0.40-0.43. How can I increase the accuracy and decrease the error rates?
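For readers unfamiliar with the tf and idf steps in the workflow above, here is a minimal Python sketch of the general weighting idea. It assumes relative term frequency and idf = log(N / df); the KNIME nodes offer their own variants, so this is not their exact formula. The function name `tf_idf` and the example tokens are made up for illustration.

```python
import math
from collections import Counter

def tf_idf(documents):
    """documents: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs at least once.
    df = Counter()
    for tokens in documents:
        df.update(set(tokens))
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        # Relative term frequency times log inverse document frequency (assumed variant).
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

# Illustrative, already stemmed and lower-cased tokens (not taken from the real data).
docs = [
    ["method", "improv", "accuraci"],
    ["result", "show", "method"],
    ["background", "prior", "work"],
]
weights = tf_idf(docs)
```

A term that occurs in few sentences gets a higher weight than one that occurs in many, and a term that occurs in every sentence gets weight 0. With very short documents such as single sentences, most weights are based on a count of 1, which is one reason the features may carry little signal.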
You could try another learner, e.g. Tree Ensembles. How long are the sentences that you want to classify? Do they usually contain words that are discriminative regarding the class? Classifying a single sentence can be difficult or even infeasible, depending on the sentences of course.
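As a rough illustration of why an ensemble of trees can help, the Python sketch below shows the two generic building blocks of bagging: each tree is trained on a bootstrap sample of the rows, and the ensemble predicts by majority vote. This is only the general idea under that assumption, not KNIME's implementation; the helper names `bootstrap_sample` and `majority_vote` are hypothetical.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one such sample per tree)."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(predictions):
    """Return the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical per-tree predictions for a single sentence.
votes = ["A", "B", "A", "C", "A"]
label = majority_vote(votes)
```

Because every tree sees a slightly different sample, their individual mistakes tend to differ, and the vote can average some of them out.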
I have another column (besides the classlabel column, which classifies sentences into 5 categories) that classifies sentences into 2 classes. Is there a way to use it as well?