Document Vector node Problem

SGK · April 23, 2017, 11:37am

Hello,

I'm running sentiment analysis on a faculty evaluation dataset. After creating the document vectors, I'm getting different category counts than in the original data.

The attached file shows the occurrences of each category in the original data (left) and after the Document Vector node (right).

I am not sure where my categories are going missing.

I'm using exactly the same workflow as the sentiment classification example in KNIME.

Regards

doc_vector_problem.png

SGK · April 25, 2017, 6:27am

Hello,

Isn't there anyone who can help with this bug?

kilian.thiel · April 27, 2017, 11:16am

Hi SGK,

I don't fully understand the problem, but I guess that after preprocessing some categories contain fewer documents than before. Is that your problem?

This can happen if all terms of some documents are filtered out by the filtering steps in preprocessing. For documents with 0 terms there will be no row in the bag of words, which means there will be no document vector row for these documents either.
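As a rough illustration of this effect outside KNIME, here is a minimal Python sketch. It does not use the KNIME API; the sample documents, the stop word list, the minimum term length, and the function names are all hypothetical and only mimic the idea of preprocessing, bag of words, and document vector creation.

```python
from collections import Counter

# Hypothetical stop word list and sample data, for illustration only.
STOP_WORDS = {"ok", "good", "the", "is", "a"}

docs = [
    ("The lectures are clear and engaging", "positive"),
    ("ok", "neutral"),
    ("Grading is slow and unfair", "negative"),
    ("good", "positive"),
]

def preprocess(text):
    """Lowercase, split on whitespace, drop stop words and terms shorter than 3 characters."""
    return [t for t in text.lower().split()
            if t not in STOP_WORDS and len(t) >= 3]

def bag_of_words(documents):
    """One row per (document, term). A document with no remaining terms yields no rows."""
    rows = []
    for doc_id, (text, category) in enumerate(documents):
        for term in preprocess(text):
            rows.append((doc_id, term, category))
    return rows

def document_vectors(bow_rows):
    """One vector per document that appears in the bag of words."""
    vectors = {}
    for doc_id, term, category in bow_rows:
        entry = vectors.setdefault(doc_id, {"category": category, "terms": Counter()})
        entry["terms"][term] += 1
    return vectors

before = Counter(category for _, category in docs)
vectors = document_vectors(bag_of_words(docs))
after = Counter(v["category"] for v in vectors.values())
dropped = [i for i in range(len(docs)) if i not in vectors]

print("categories before:", dict(before))
print("categories after: ", dict(after))
print("dropped document indices:", dropped)
```

Comparing the category counts before preprocessing with those after the document vector step, and listing the documents that have no terms left, should show which filtering step removes them.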

Cheers, Kilian