I have one dataset with a fixed number of rows (56 rows) and a second dataset with 36,000 rows.
I want to take each record from dataset 1, count how often it occurs in dataset 2, and put the result in a new column in dataset 2. For example:
dataset 1
keyword
account hijacking
accumulo
acoustic cryptanalysis
active cyber defense
active defense
advanced encryption standard (aes)
advanced evasion technique
advanced persistent threat (apt)
dataset 2 should look like this
keyword
sum
azaezaeaazerazraz account hijacking
azerazerazeazer accumulo
azeazerazer az r eaze acoustic cryptanalysis accumulo
The workflow loops over every keyword in the table and checks it against each sentence. I haven't calculated a sum yet, because a sentence can match multiple keywords (add a Column Aggregator node if you want to sum them up).
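For readers who prefer to see the idea outside of KNIME, here is a minimal Python sketch of the same logic: one 0/1 flag per keyword per sentence, then a row sum like the one a Column Aggregator would produce. The sample data is taken from the question; the function and variable names are hypothetical and not part of the workflow.

```python
# Sample data copied from the question (only three keywords, for brevity).
keywords = ["account hijacking", "accumulo", "acoustic cryptanalysis"]
sentences = [
    "azaezaeaazerazraz account hijacking",
    "azerazerazeazer accumulo",
    "azeazerazer az r eaze acoustic cryptanalysis accumulo",
]


def match_flags(sentence, keywords):
    """Return one flag per keyword: 1 if the keyword occurs in the sentence, else 0."""
    return [1 if kw in sentence else 0 for kw in keywords]


# One row of flags per sentence (the "loop over every keyword" step).
flags = [match_flags(s, keywords) for s in sentences]

# Row-wise sum of the flags, i.e. how many distinct keywords each sentence matches.
sums = [sum(row) for row in flags]

for sentence, row, total in zip(sentences, flags, sums):
    print(sentence, row, total)
```

Note that this only records whether a keyword is present, not how many times it appears.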
In addition to the solution by @HansS, I have prepared another one that I think covers more cases. But first, if you are going to use @HansS’s solution, you have to change the expression in the Rule Engine to this:
$new column$ >= 0 =>1
TRUE => 0
The difference is the equal sign (=) in the first line: $new column$ > 0 =>1 becomes $new column$ >= 0 =>1.
Otherwise, with the current expression, a keyword placed at the very beginning of the string would not be counted (presumably because its match position there is 0).
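To illustrate why the equal sign matters, here is a small Python sketch. It assumes that $new column$ holds a zero-based match position, with -1 meaning "not found"; that is my reading of the workflow, not something confirmed in the thread. Python's str.find behaves the same way, so it serves as a stand-in. The function names are hypothetical.

```python
def flag_strict(sentence, keyword):
    """Mimics `position > 0 => 1`: misses a keyword at the start of the string."""
    position = sentence.find(keyword)  # -1 when absent, 0 when at the very start
    return 1 if position > 0 else 0


def flag_inclusive(sentence, keyword):
    """Mimics `position >= 0 => 1`: counts a keyword at any position."""
    position = sentence.find(keyword)
    return 1 if position >= 0 else 0


text = "accumulo cluster compromised"
print(flag_strict(text, "accumulo"))     # keyword starts at index 0
print(flag_inclusive(text, "accumulo"))
```

With the strict comparison the keyword at index 0 is treated like a missing keyword, which is exactly the case the corrected expression fixes.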
I assumed a keyword may appear several times in a string, but the solution by @HansS only tells you whether a keyword exists in a string or not. Here is the workflow to count the number of occurrences of each keyword in each string: (I have modified your 2nd example table to cover more cases)
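As a rough illustration of the counting idea (not a description of the nodes in the workflow), here is a minimal Python sketch that counts how many times each keyword occurs in a string. It uses str.count, which counts non-overlapping, case-sensitive matches; the function name and the sample sentence are made up for the example.

```python
def count_keywords(sentence, keywords):
    """Map each keyword to its number of non-overlapping occurrences in the sentence."""
    return {kw: sentence.count(kw) for kw in keywords}


keywords = ["account hijacking", "accumulo", "acoustic cryptanalysis"]
sentence = "accumulo and accumulo again, then account hijacking"

counts = count_keywords(sentence, keywords)
total = sum(counts.values())  # total keyword occurrences in this sentence

print(counts)
print(total)
```

If matching should ignore case, lowercasing both the sentence and the keywords before counting would be one option.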