Web-user search term categorization in KNIME


I am planning to classify web-user search terms from a shopping platform into product categories, preferably using NLP in KNIME. I have a dataset (Excel/CSV) that contains all the search terms users enter in the internal search, and I want the system to learn to categorize the keywords as follows:

Search term      Product category
Laptop           Electronics
Butter           Food & Beverages
Exercise book    Office Supplies

The search terms are in German and range from a single word to a full line.

So my question is: which approach is best for this task in KNIME, and which nodes specifically can I use?



Usually you would classify documents rather than terms. For classification, you first need to build a numerical feature vector from the documents or terms. For documents, you can use the Document Vector node; for terms, you can use the Term Vector node. Be aware that the term vectors are simply the transposed document-term matrix, which means the features of a term are the documents that contain it. Based on these vectors and a labeled training set, you can build a classifier.
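
To make the "transposed document-term matrix" point concrete, here is a minimal sketch in plain Python. It is not how the KNIME nodes are implemented; it only illustrates the shape of the data that document vectors and term vectors represent. The example queries, the use of raw term counts as values, and the function names `document_vectors` and `term_vectors` are assumptions made for illustration, with each search query treated as one document.

```python
from collections import Counter

# Hypothetical toy data: each search query is treated as one "document".
documents = [
    "laptop tasche",
    "butter gesalzen",
    "laptop ladekabel",
]

# Vocabulary: every distinct term, sorted so the column order is stable.
vocabulary = sorted({term for doc in documents for term in doc.split()})

def document_vectors(docs, vocab):
    """One row per document, one column per term (raw term counts)."""
    rows = []
    for doc in docs:
        counts = Counter(doc.split())
        rows.append([counts[term] for term in vocab])
    return rows

def term_vectors(doc_vectors):
    """Transpose the document-term matrix: one row per term, one column per document."""
    return [list(column) for column in zip(*doc_vectors)]

doc_matrix = document_vectors(documents, vocabulary)
term_matrix = term_vectors(doc_matrix)

# Each term is now described by the documents that contain it.
for term, vector in zip(vocabulary, term_matrix):
    print(term, vector)
```

Here "laptop" ends up with a non-zero entry for the first and third query, so its features are exactly the documents it occurs in. With a labeled training set, either matrix (rows plus their category labels) can be fed to a classifier.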

I hope this helps.

Cheers, Kilian