Check Keywords in Text

Hi everyone, I'm Nice.

My case: I want to check for keywords in my messages. Each message needs to contain several keywords, and the same keyword can be written in different ways.

Example: The message is: “I like the doll have pink color hair.”
Keywords are: “like”, “doll”, “pink”, “hair”
***But one keyword can also appear like this: “like”, “ilke”, “Likeeee”, “Likkeee”, and all of them mean “like”
***And one message needs to contain all of these keywords (“like”, “doll”, “pink”, “hair”)

What solution can help me check all of this?

Thanks a lot for your guidance.

Hi @Nice and welcome to the KNIME forum,

1- You can try some normalization first, e.g. replacing 3 or more identical consecutive letters with 2 in the String Manipulation node:

regexReplace($column1$, "([a-zA-Z])\\1{2,}", "$1$1")

Then apply a spell check using the Spell Checker (Simple) node. For more accurate results, you can download the N-gram data from the link provided in the node’s description.
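
If it helps to see what the regex in step 1 does outside of KNIME, here is a minimal Python sketch of the same replacement. The function name `collapse_repeats` is only an illustrative choice, and the spell check step is not reproduced here.

```python
import re

def collapse_repeats(text: str) -> str:
    # Replace any run of 3 or more identical ASCII letters with exactly 2.
    # Same pattern as in the String Manipulation expression above.
    return re.sub(r"([a-zA-Z])\1{2,}", r"\1\1", text)

print(collapse_repeats("Likeeee"))  # Likee
print(collapse_repeats("Likkeee"))  # Likkee
```

Note that the results ("Likee", "Likkee") are still not "Like", which is why a spell check is applied afterwards.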

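2- Convert the messages to documents and do some preprocessing, such as removing stop words (or a list of your own words) and words with fewer than 3 characters. Then create a bag of words, convert the terms to strings, collect them into a set per message, and finally use the Subset Matcher node to check whether that set contains all the required keywords.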
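
To illustrate the idea behind step 2, here is a rough Python sketch of the subset check. It is not what the KNIME nodes do internally. The `VARIANTS` dictionary is a hypothetical stand-in for the spell checker, and the names `REQUIRED`, `normalize` and `has_all_keywords` are assumptions made for this example.

```python
import re

# Keywords that every message must contain (taken from the question).
REQUIRED = {"like", "doll", "pink", "hair"}

# Hypothetical lookup standing in for a real spell checker.
VARIANTS = {"ilke": "like", "likee": "like", "likkee": "like"}

def normalize(token: str) -> str:
    # Lowercase, collapse runs of 3+ identical letters to 2,
    # then map known misspellings to the canonical keyword.
    token = re.sub(r"([a-z])\1{2,}", r"\1\1", token.lower())
    return VARIANTS.get(token, token)

def has_all_keywords(message: str, required=REQUIRED) -> bool:
    # Tokenize on letters only and drop words shorter than 3 characters.
    tokens = re.findall(r"[a-zA-Z]+", message)
    words = {normalize(t) for t in tokens if len(t) >= 3}
    # Subset test: every required keyword must be in the message's word set.
    return required <= words

print(has_all_keywords("I like the doll have pink color hair."))  # True
print(has_all_keywords("I Likeeee the doll with pink hair"))      # True
print(has_all_keywords("I like the doll"))                        # False
```

The last line of `has_all_keywords` is the part that corresponds to the Subset Matcher: the message passes only if the keyword set is a subset of its words.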

Here is a simple example workflow:

message_match.knwf (104.9 KB)

:blush:
