Associating words with part numbers

Disclaimer: I started learning Knime less than a week ago, so forgive me if this is a noob question.

Problem: I have a table with unique identifiers (part numbers), a part description, and other information (e.g. date, quantity bought, etc.). The "part description" column contains free text that tries to describe what the item behind the "part number" really is. I'm trying to use keyword(s) in the part description column to match part numbers at a high confidence level. E.g. if items with part number 100001 have "wheel" in the part description 95% of the time, then I can confidently say that 100001 is a wheel. If items with part number 100003 have "pump" in the part description column only 20% of the time, it wouldn't be wise for me to categorize those items as pumps. Got the idea? I think I'm trying to do something like this:

My input is an Oracle table which I've been able to read into Knime. The YouTube video I'm trying to follow seems to continue from a previous video that I cannot find. But I think his "positive" and "negative" categories will be my unique identifiers (part numbers). I don't know how he configured the connection of nodes at the top (first row).

Can you guys help me solve this problem? A step-by-step node connection and configuration would be very helpful. The entries in the "part description" column are normally between 2 and 5 words long. I'd like to consider each line as one term and not break it up. E.g. I'd like to keep "steering wheel" as one term and not split it into "steering" and "wheel". Please help. Thanks.
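
To make the goal concrete, here is a minimal Python sketch of the calculation described above. It is not a Knime workflow, only an illustration of the logic: each description is kept as one term (just trimmed and lowercased), and a part number gets a label only if its most common term reaches a chosen share. The function name `dominant_description`, the 0.9 threshold, and the sample rows are assumptions made up for this example.

```python
from collections import Counter, defaultdict

def dominant_description(rows, min_share=0.9):
    """rows: iterable of (part_number, description) pairs.

    Returns {part_number: (term, share)} for every part number whose
    most common description accounts for at least min_share of its rows.
    """
    counts = defaultdict(Counter)
    for part_number, description in rows:
        # Keep the whole description as ONE term: only lowercase it and
        # collapse repeated whitespace, never split it into words.
        term = " ".join(description.lower().split())
        counts[part_number][term] += 1

    result = {}
    for part_number, counter in counts.items():
        term, hits = counter.most_common(1)[0]
        share = hits / sum(counter.values())
        if share >= min_share:
            result[part_number] = (term, share)
    return result

# Hypothetical sample data, invented for illustration only.
rows = [
    (100001, "Steering wheel"),
    (100001, "steering  wheel"),
    (100001, "steering wheel"),
    (100001, "STEERING WHEEL"),
    (100003, "pump"),
    (100003, "hose"),
    (100003, "gasket"),
    (100003, "valve"),
    (100003, "filter"),
]
labels = dominant_description(rows, min_share=0.9)
print(labels)
```

With this sample, 100001 is labelled "steering wheel", while 100003 gets no label because no single description dominates its rows.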

Hi okyere,

how many unique numbers like 100003  do you have?

The first row of the workflow in the linked video is a basic chain of textprocessing nodes. Documents are read/parsed or created and then preprocessed by various preprocessing nodes. For information about the textprocessing nodes please see:

Here you can find the online documentation:

and here example workflows:

Cheers, Kilian

Hey Kilian,

Thanks for your comment. I have about 50,000 unique part numbers in the data set. As I stated, we don't have a formal item description, which is why I'm trying to use the information in the "part description" column to group them. I will be using the links in your post as a study guide. Thanks


Going through the documentation and examples of text processing, I have not seen anything that explains how I would tackle my problem. Most of the examples get data from two sources and read it into Knime as one column. I have two columns, and I'm trying to analyze how the words in one column describe the uniquely identified part numbers in the other column. Some direction would help. Thanks
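
The two-column relationship can be shown as a rough Python sketch rather than a Knime workflow: for a given keyword, count per part number how often the keyword appears as a word in the description, then divide by the number of rows for that part number. The function name `keyword_share` and the sample rows are hypothetical, chosen to reproduce the 95% and 20% figures from the original question.

```python
from collections import defaultdict

def keyword_share(rows, keyword):
    """rows: iterable of (part_number, description) pairs.

    Returns {part_number: share}, the fraction of that part number's
    rows whose description contains keyword as a whole word.
    """
    keyword = keyword.lower()
    totals = defaultdict(int)
    hits = defaultdict(int)
    for part_number, description in rows:
        totals[part_number] += 1
        # Whole-word, case-insensitive match on whitespace-separated words.
        if keyword in description.lower().split():
            hits[part_number] += 1
    return {part: hits[part] / totals[part] for part in totals}

# Hypothetical sample data, invented for illustration only.
sample = (
    [(100001, "front wheel")] * 19
    + [(100001, "misc part")]
    + [(100003, "water pump")]
    + [(100003, "hose")] * 4
)
wheel_share = keyword_share(sample, "wheel")
pump_share = keyword_share(sample, "pump")
print(wheel_share[100001], pump_share[100003])
```

A part number would then be categorized by a keyword only when its share clears whatever confidence threshold is chosen, so 100001 qualifies as a wheel here and 100003 does not qualify as a pump.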