BoW Creator / 3rd column?

Hi,

I would like to know whether it is possible to add a 3rd column to the output of the BoW Creator, e.g. the "Title" of a document, obtained by placing a "Document Data Extractor" node before it.

The description of the BoW creator includes "A BoW consists of AT LEAST two columns, ..."

But when I use a Column Filter node to include a string column, a term column, or both (a 2nd document column is not allowed), the BoW Creator still outputs only 2 columns (the bag of words and the document).

Thanks,
Werner

Hi Werner,

the output data table of the BoW Creator node contains exactly two columns: the document column and the term column. If you want to extract the original title after certain preprocessing steps, use the option "Append unchanged documents", which is available in the dialogs of all preprocessing nodes. This option is enabled by default anyway. After creating a bag of words and preprocessing it, the original documents are still available, so you can extract the title afterwards with the Document Data Extractor. If the title is extracted before the BoW Creator node, that node (and all preprocessing nodes) will cut off the extra column.
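
To illustrate the idea outside of KNIME, here is a minimal Python sketch. It is not KNIME's API: `Document`, `create_bow`, and `extract_titles` are hypothetical names made up for this example, and the tokenization is a plain whitespace split. It only shows why the order matters: the bag of words keeps exactly two columns (term and document), but since each row still carries the whole document, the title can be pulled out as a third column afterwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical stand-in for a document cell: text plus metadata."""
    title: str
    text: str


def create_bow(documents):
    """Return (term, document) rows: exactly two columns per row.

    Any extra columns attached before this step would be dropped,
    because only the term and the document are emitted.
    """
    rows = []
    for doc in documents:
        seen = set()
        for term in doc.text.lower().split():
            if term not in seen:  # one row per distinct term per document
                seen.add(term)
                rows.append((term, doc))
    return rows


def extract_titles(bow_rows):
    """Append the title as a third column after the bag of words exists."""
    return [(term, doc, doc.title) for term, doc in bow_rows]


docs = [
    Document("Report A", "knime builds workflows"),
    Document("Report B", "workflows process text"),
]
bow = create_bow(docs)
with_titles = extract_titles(bow)
```

Here `bow` plays the role of the BoW Creator output and `extract_titles` the role of a Document Data Extractor placed after it.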

Cheers,

Kilian
