I want to apply classification to the results of association rule mining, for example:
machine learning, data mining, information retrieval >>>> class1
social network, network security >>> class2
I want to treat "machine learning" as one term, not as "machine" and "learning".
I tried the previous solution, but it calculates a vector for each word separately, which is not what I need.
I want to calculate the vector for "machine learning" as a whole.
How can I do this using Bag of Words or a substitute node? I tried not using it, but that gives wrong results.
You could try using the Wildcard Tagger node as described here: https://tech.knime.org/forum/knime-textprocessing/conjoined-words
Thank you Ronald
Bag of words using terms instead of individual words works well,
but when I apply classification, the classifier doesn't give accurate results (it predicts the same class for all documents).
What should I do to get accurate results?
Could you please provide some more information on what you are trying to do?
Attached are the workspace and the dataset.
I am trying to calculate the vector for each document.
The document is treated as a collection of terms or expressions, not words.
E.g. I treat "machine learning" as one term, not "machine" and "learning",
and after that I can apply classification on the terms.
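To make the goal above concrete, here is a minimal sketch in plain Python (not a KNIME workflow). It assumes each document is a comma-separated list of terms, as in the example at the top of the thread; the function names are made up for illustration.

```python
from collections import Counter

def split_terms(line):
    """Split a comma-separated line into whole terms, keeping multi-word terms intact."""
    return [t.strip().lower() for t in line.split(",") if t.strip()]

def build_vocabulary(docs):
    """Sorted list of every distinct term across all documents."""
    return sorted({term for doc in docs for term in split_terms(doc)})

def vectorize(doc, vocabulary):
    """Term-count vector for one document, in vocabulary order."""
    counts = Counter(split_terms(doc))
    return [counts[term] for term in vocabulary]

# Sample documents taken from the example in the first post.
docs = [
    "machine learning, data mining, information retrieval",
    "social network, network security",
]
vocabulary = build_vocabulary(docs)
vectors = [vectorize(d, vocabulary) for d in docs]
```

Here "machine learning" gets a single column in the vector, because the split happens on commas and not on spaces.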
At least two possibilities:
- bag of n-grams;
- replace the space character between tagged words with "_" or a similar character (which must not be subsequently removed by special-character preprocessing), then continue with bag of words;
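Both possibilities can be sketched in plain Python (this is an illustration only, not what the KNIME nodes do internally). The helper names and the phrase list are assumptions made for the example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into tokens; the underscore is kept as a token character."""
    return re.findall(r"[a-z0-9_]+", text.lower())

def bag_of_ngrams(text, n=2):
    """Count every run of n consecutive tokens (bigrams by default)."""
    tokens = tokenize(text)
    grams = [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return Counter(grams)

def join_phrases(text, phrases, joiner="_"):
    """Replace the spaces inside each known phrase with the joiner character."""
    result = text.lower()
    # Longer phrases first, so a short phrase cannot break up a longer one.
    for phrase in sorted(phrases, key=len, reverse=True):
        joined = phrase.lower().replace(" ", joiner)
        pattern = r"\b" + re.escape(phrase.lower()) + r"\b"
        result = re.sub(pattern, lambda m, j=joined: j, result)
    return result

def bag_of_words(text):
    """Plain bag of words over the (possibly joined) text."""
    return Counter(tokenize(text))

text = "Machine learning and data mining"
bigrams = bag_of_ngrams(text)
joined = join_phrases(text, ["machine learning", "data mining"])
words = bag_of_words(joined)
```

With the second option, the same replacement would have to be applied to every document before it is vectorized, including the documents to be classified, so that both sides use the same joined terms.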
The problem with this approach is that the data I want to classify does not contain the "_" character, so it will give no result or a wrong one.
What about tagging the key words during the preprocessing stage ?
I did the tagging, but wrong results appear.
Please can you give me the steps or a workflow, or tell me what to do?
I have searched a lot, but I cannot find what I need.
Please tell me what to do.
If you have a list of multi-word terms, e.g. "online learning", you can use the Wildcard Tagger node to find those terms and tag them in the documents. Use the sentence-based matching setting in the node dialog. After terms have been tagged, each one is treated as a single term.
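The idea of dictionary-based tagging can be sketched in plain Python as follows. This is only an approximation of the concept, not the Wildcard Tagger's actual implementation; the function name, the sentence splitting rule and the sample phrases are assumptions.

```python
import re
from collections import Counter

def tag_terms(document, multi_words):
    """Return the document's terms, keeping listed multi-word phrases as single terms.

    Matching is done sentence by sentence, so a phrase never spans a sentence boundary.
    """
    phrases = {tuple(p.lower().split()) for p in multi_words}
    max_len = max((len(p) for p in phrases), default=1)
    terms = []
    for sentence in re.split(r"[.!?]+", document.lower()):
        tokens = re.findall(r"[a-z0-9]+", sentence)
        i = 0
        while i < len(tokens):
            # Try the longest possible phrase first, down to two tokens.
            for size in range(min(max_len, len(tokens) - i), 1, -1):
                candidate = tuple(tokens[i:i + size])
                if candidate in phrases:
                    terms.append(" ".join(candidate))
                    i += size
                    break
            else:
                # No phrase starts here: keep the single word as its own term.
                terms.append(tokens[i])
                i += 1
    return terms

multi_words = ["online learning", "machine learning"]
terms = tag_terms("Online learning is useful. Machine learning too.", multi_words)
term_counts = Counter(terms)
```

The resulting term counts can then be turned into a document vector. The same phrase list would need to be applied to both the training documents and the documents to be classified, otherwise the two sets end up with different terms.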