Prevent BoW from using document title for terms.

When I run my documents through BoW, the document titles are turned into terms as well. As I am analyzing emails, this is rather inappropriate.

Is there a way to prevent this other than just assigning "title" to a meaningless column and filtering it with Regex later?

Words in the title are processed as well, and this cannot be avoided. As a workaround, you could use a string column containing, e.g., numbers as the title column in the Strings To Document node, and then filter these numbers out later with the Number Filter node.
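To illustrate the idea outside of KNIME, here is a minimal Python sketch of the same workaround. It does not use any KNIME API; the helper names `bag_of_words` and `drop_numeric_terms` and the sample emails are made up for illustration. It assumes the title is always tokenized together with the body, so a numeric placeholder title is assigned and purely numeric terms are dropped afterwards.

```python
import re

def bag_of_words(title, body):
    """Tokenize title and body together, like a BoW step that cannot skip titles."""
    return re.findall(r"\w+", f"{title} {body}".lower())

def drop_numeric_terms(terms):
    """Remove purely numeric terms, standing in for a number filter."""
    return [t for t in terms if not t.isdigit()]

# Hypothetical email bodies.
emails = ["Meeting moved to Friday", "Invoice attached for review"]

# Use the row index as a meaningless numeric placeholder title.
docs = [(str(i), body) for i, body in enumerate(emails)]

# The placeholder title enters the bag of words and is then filtered out.
terms = [drop_numeric_terms(bag_of_words(title, body)) for title, body in docs]
```

Note that this kind of filter also removes numbers that occur in the email text itself, so it only fits if numeric terms are not needed for the analysis.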
