Term features for Bayesian Learner

I have a training dataset for Search Data that looks something like:

| Phrase  | Value 1 | Value 2 | Value 3 | Classification |
|---------|---------|---------|---------|----------------|
| abc     | 100     | 50      | 17      | Brand Term     |
| abc def | 99      | 99      | 10      | Category Term  |

Each row holds the Search phrase itself plus a mix of numerical values associated with its delivery (cost, position, click rate, etc.). I have manually classified 2% of the terms as either Brand Term or Category Term and want to build a Learner to classify the other 98%.

I would like to use a combination of the terms in Phrase and the Values to build a classifier that predicts the classification as accurately as possible. For instance, when the Brand name is in Phrase, the row is always classified as Brand Term, so it seems this would be a valuable feature to carry over.

My problem is that I am not sure how to convert the "Phrase" column into a feature set for the Bayesian Learner. I assume I need to create either a Bit or Term Vector, or an individual column per unique term within Phrase (example below):

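| Phrase  | Value 1 | Value 2 | Value 3 | abc | def | Classification |
|---------|---------|---------|---------|-----|-----|----------------|
| abc     | 100     | 50      | 17      | 1   | 0   | Brand Term     |
| abc def | 99      | 99      | 10      | 1   | 1   | Category Term  |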
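
To make the target layout concrete, here is a minimal Python sketch of the per-term indicator columns shown above. It is only an illustration outside KNIME: the function name `term_indicator_columns` is hypothetical, and it assumes terms are split on whitespace and lowercased, which may not match how a given tokenizer behaves.

```python
def term_indicator_columns(phrases):
    """Return (vocabulary, rows): one 0/1 list per phrase, one slot per unique term.

    Assumption: terms are whitespace-separated and compared case-insensitively.
    """
    # Collect every unique term across all phrases, in a stable (sorted) order.
    vocabulary = sorted({term for phrase in phrases for term in phrase.lower().split()})
    rows = []
    for phrase in phrases:
        present = set(phrase.lower().split())
        # 1 if the term occurs in this phrase, otherwise 0.
        rows.append([1 if term in present else 0 for term in vocabulary])
    return vocabulary, rows


phrases = ["abc", "abc def"]
vocabulary, rows = term_indicator_columns(phrases)
print(vocabulary)  # ['abc', 'def']
print(rows)        # [[1, 0], [1, 1]]
```

Each inner list corresponds to the `abc` and `def` columns of one row in the example table.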
             

Are there any good examples of doing something like this, either via Text Processing nodes or other KNIME nodes?

This approach seems to work, but I am happy to hear if someone knows of a better methodology:

  1. Strings to Document (on phrase), set Title to ID, Content to phrase
  2. Bag of Words Creator (on document)
  3. Document Data Extractor (convert Title/ID back to String column)
  4. Term to String (convert individual Terms to String column)
  5. Rule-based Row Filter (remove "extra" rows where Title = Term)
  6. Column Filter (to just String Title and Term columns)
  7. Constant Value Column ( =1 for next Pivot step)
  8. Pivot ( group = Title, pivot = Term, agg = first of constant column )
  9. Missing Value ( = 0)
  10. Create Bit Vector (from Int columns)
  11. Column Filter (to just Title and Bit Vector)
  12. Joiner (back on initial data set by Title/ID)

The result is the initial data set with an appended Bit Vector column representing the term space. The Naive Bayes Learner seems happy with this input (assuming PMML compatibility is turned off).
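
For anyone who wants to check the logic outside KNIME, the steps above can be roughly mirrored in plain Python. This is only a sketch of the same idea, not what the nodes do internally: the function name `append_bit_vectors`, the `id` and `phrase` keys, and the whitespace/lowercase tokenization are all assumptions made for the illustration.

```python
from collections import defaultdict


def append_bit_vectors(records, id_key="id", phrase_key="phrase"):
    """Append a term bit vector to each record, following the pivot approach.

    Assumption: terms are whitespace-separated and lowercased.
    """
    # Steps 1-6: one (id, term) pair per term, i.e. a long-format table.
    pairs = [
        (record[id_key], term)
        for record in records
        for term in record[phrase_key].lower().split()
    ]

    # Steps 7-8: pivot with group = id, pivot = term, value = constant 1.
    vocabulary = sorted({term for _, term in pairs})
    pivot = defaultdict(dict)
    for row_id, term in pairs:
        pivot[row_id][term] = 1

    # Step 9: missing cells become 0. Step 10: pack the columns into one vector.
    # Steps 11-12: join the vector back onto the original record by id.
    result = []
    for record in records:
        cells = pivot.get(record[id_key], {})
        bits = tuple(cells.get(term, 0) for term in vocabulary)
        result.append({**record, "bit_vector": bits})
    return vocabulary, result


records = [
    {"id": "r1", "phrase": "abc", "classification": "Brand Term"},
    {"id": "r2", "phrase": "abc def", "classification": "Category Term"},
]
vocabulary, result = append_bit_vectors(records)
print(vocabulary)                          # ['abc', 'def']
print([r["bit_vector"] for r in result])   # [(1, 0), (1, 1)]
```

The numeric value columns are left untouched on each record, so they can still be passed to the learner alongside the bit vector.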