I'm using KNIME for the first time and I have a question about it.
I have an Excel data file with 3 columns (Included, sentences, classlabel). The sentences column contains one sentence per row.
The Included column contains the value abstract or introduction (where the sentence appeared in an article), and the classlabel column contains one of 5 category values.
I have to process this file and build a decision tree model to predict classlabel in a supervised way.
So I have used these nodes one after another: read file, string to document (to transform sentences to document cells), bag of words creator, punctuation erasure, stop word filter, porter stemmer, case converter, tf, idf, icf, category to class, X-partitioner, decision tree learner, decision tree predictor, X-Aggregator and scorer nodes.
Now, the accuracy is 0.58 and error rates are about 0.40-0.43. How can I increase the accuracy and decrease the error rates?
Can anyone help me please?
You could try another learner, e.g. Tree Ensembles. How long are the sentences that you want to classify? Do they usually contain words that are discriminative for the class? Classifying a single sentence can be difficult or even infeasible, depending on the sentences of course.
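One rough way to check whether the sentences contain class-discriminative words is to count, for each term, how concentrated its occurrences are in a single class. The sketch below is plain Python, not a KNIME node; the function name `class_term_share`, the `min_docs` threshold, the toy sentences, and the class names `AIM` and `RESULT` are all made up for illustration and are not taken from the data set in question.

```python
from collections import Counter, defaultdict

def class_term_share(sentences, labels, min_docs=2):
    """Map each term to (dominant class, share of its sentences in that class, sentence count).

    Terms that appear in fewer than `min_docs` sentences are skipped,
    because a share computed from one or two sentences says very little.
    """
    per_class = defaultdict(Counter)  # class -> term -> number of sentences containing it
    total = Counter()                 # term -> number of sentences containing it
    for text, label in zip(sentences, labels):
        for term in set(text.lower().split()):  # count each term once per sentence
            per_class[label][term] += 1
            total[term] += 1

    result = {}
    for term, n in total.items():
        if n < min_docs:
            continue
        top_class, top_count = max(
            ((c, counts[term]) for c, counts in per_class.items()),
            key=lambda pair: pair[1],
        )
        result[term] = (top_class, top_count / n, n)
    return result

# Hypothetical toy data, only to show the shape of the input.
sentences = [
    "we propose a new method",
    "we propose a fast algorithm",
    "results show a clear improvement",
    "results show a small gain",
]
labels = ["AIM", "AIM", "RESULT", "RESULT"]

shares = class_term_share(sentences, labels)
for term, (cls, share, n) in sorted(shares.items(), key=lambda kv: -kv[1][1]):
    print(f"{term:10s} {cls:8s} share={share:.2f} sentences={n}")
```

If most frequent terms have a share close to 1 divided by the number of classes, the words alone carry little signal, and switching learners is unlikely to help much. Terms with a share near 1.0 are the ones a tree can actually split on.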
Thanks for your guidance. I have 1028 sentences of different lengths; some of them are short and some of them are long.
I'm not sure about discriminative words.
I tried Tree Ensemble, but when I connected the learner input to the X-Partitioner, I received an error: "No configuration available"!
Can you guide me again?
I have another column (besides the classlabel column, which classifies sentences into 5 categories) that classifies sentences into 2 classes. Is there a way to use it as well?
Can you provide more details about the Tree Ensembles and X-Partitioner nodes you are using? Please attach a workflow.
About the additional column: in which way do you want to use it, as a target column or as a feature column?
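To make the difference between the two roles concrete, here is a minimal plain-Python sketch (not KNIME code). The function `build_rows`, the column values `"yes"`/`"no"`, and the class names `"A"`/`"B"` are hypothetical placeholders. As a feature, the 2-class column becomes one extra 0/1 input next to the term weights; as a target, it replaces classlabel as the thing to predict.

```python
def build_rows(term_vectors, extra_column, class_labels,
               role="feature", positive_value="yes"):
    """Return (X, y) for a learner, using the extra 2-class column in the given role.

    role="feature": append a 0/1 indicator to every term vector, keep classlabel as target.
    role="target":  keep the term vectors unchanged, predict the extra column instead.
    """
    if role == "feature":
        X = [
            list(vec) + [1.0 if value == positive_value else 0.0]
            for vec, value in zip(term_vectors, extra_column)
        ]
        y = list(class_labels)
    elif role == "target":
        X = [list(vec) for vec in term_vectors]
        y = list(extra_column)
    else:
        raise ValueError("role must be 'feature' or 'target'")
    return X, y

# Hypothetical example: two sentences, two term weights each.
vectors = [[0.1, 0.0], [0.0, 0.3]]
extra = ["yes", "no"]
classes = ["A", "B"]

X_feat, y_feat = build_rows(vectors, extra, classes, role="feature")
X_tgt, y_tgt = build_rows(vectors, extra, classes, role="target")
print(X_feat, y_feat)
print(X_tgt, y_tgt)
```

Using the column as a feature only makes sense if its value is also known for new, unlabeled sentences at prediction time.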