How to create multiple labels

Safak · November 16, 2021, 2:48pm

Hi There,
I am trying to categorize some root causes based on search keys.

This is the dictionary. The first column is the key and the second column is the category:

This is the workflow structure:

I used the Rule Engine node for this logic.
The problem is that if multiple keys match, the Rule Engine only returns one of them.
I would be better off if I could get all the triggering keys and their categories when multiple root causes are present (as a list or in any other form).

Is there another way to do this?

SamirAbida · November 16, 2021, 3:31pm

Hello @Safak,

In that case, you should use the “Joiner” node. It will replicate as many values as there are (which can cause duplicates for others) but will work for you.

BR,

Samir

Safak · November 16, 2021, 4:20pm

Hi Samir,
Thanks for the fast reply.
I do not understand how the Joiner node can help in this situation.

This is the sentence I am trying to categorize:
"gasket does not fit in and it always pops out"
There are two keys for me here:

fit in & pop
I should be able to locate each of them in the sentence (like a Row Filter with wildcards)
and get the keys and corresponding categories:
fit in => gasket dislocated
pop => gasket dislocated

The Rule Engine works fine, but it only returns the last one.
I need all the keys and categories that are present in the same sentence.

takbb · November 16, 2021, 5:48pm

Hi @Safak, this isn’t that easy to answer for a couple of reasons. Firstly, I’ve had to make up some data, so it may be nothing like the form your data takes. Also, from the example you’ve given, you would expect “pops out” to match “pop”, which means “pop” would also match “population”, and that probably isn’t what you’d want.

There may be some nodes that are better suited to this task for stripping phrases out but I’ve gone for “simple” at least as a starting point…

Here is my sample data:

and here is a list of “phrases” and “categories”

This is my workflow

What it does is place every phrase/category row alongside every description row using a Cross Joiner. It then uses a Rule Engine to pattern-match the phrase inside the description and, if the phrase is present, outputs the category:

After row filtering to remove rows where no SelectedCategory was returned, removing duplicates, grouping back into single rows, and finally left-joining back onto the original table to retain any descriptions for which no categories matched, we get this:

I don’t know if any of that can be worked into your own solution, but maybe it can give you some ideas. There are plenty of holes in my approach, not least that it doesn’t necessarily match entire phrases respecting word boundaries or word stems, but perhaps it’s a start?
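
For anyone who finds it easier to read the logic as code, the steps described above (cross join, wildcard match, de-duplication, grouping, and keeping unmatched rows) can be sketched in plain Python. This is only an illustration of the idea, not the attached workflow: the sample data and the names `descriptions`, `phrase_categories` and `categorize` are made up, and `fnmatchcase` with a `*phrase*` pattern merely stands in for the wildcard match done in the Rule Engine.

```python
from fnmatch import fnmatchcase
from itertools import product

# Hypothetical sample data (not the data from the screenshots).
descriptions = [
    "gasket does not fit in and it always pops out",
    "paint is scratched",
]
phrase_categories = [
    ("fit in", "gasket dislocated"),
    ("pop", "gasket dislocated"),
    ("scratch", "surface damage"),
]

def categorize(descriptions, phrase_categories):
    # Start with every description so that unmatched rows are retained
    # (the role of the final left join onto the original table).
    result = {d: {"phrases": [], "categories": []} for d in descriptions}
    # Cross join: pair every description with every phrase/category row.
    for description, (phrase, category) in product(descriptions, phrase_categories):
        # Wildcard match: is the phrase contained anywhere in the description?
        if fnmatchcase(description.lower(), f"*{phrase.lower()}*"):
            entry = result[description]
            # Skip duplicates so each key and category is listed once.
            if phrase not in entry["phrases"]:
                entry["phrases"].append(phrase)
            if category not in entry["categories"]:
                entry["categories"].append(category)
    return result

matches = categorize(descriptions, phrase_categories)
for description, entry in matches.items():
    print(description, "->", entry["phrases"], entry["categories"])
```

Note that this sketch keys the result on the description text, so identical descriptions would be merged into one entry; a row ID would be needed to keep them apart.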

Keyword Categories.knwf (25.8 KB)

This could be adjusted to use regex instead of wildcards. Change the expression in the String Manipulation node to:
join(".*\\b",$phrase$,"\\b.*")

Then change the rule in the Rule Engine to:
$description$ MATCHES $MatchPhrase$ => $category$

This would ensure that phrases are matched on word boundaries, which may slightly improve how it behaves.
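
To see what word-boundary matching changes, here is a small Python sketch of the same idea. It is an assumption-laden illustration rather than a translation of the KNIME nodes: `phrase_pattern` is a hypothetical helper, and Python's `re` module is used in place of the Rule Engine's MATCHES operator.

```python
import re

def phrase_pattern(phrase):
    # re.escape keeps any special characters in the phrase literal;
    # \b anchors both ends of the phrase to word boundaries.
    return re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)

sentence = "gasket does not fit in and it always pops out"

fit_in_found = bool(phrase_pattern("fit in").search(sentence))
pop_found = bool(phrase_pattern("pop").search(sentence))
pop_in_population = bool(phrase_pattern("pop").search("population of parts"))

print(fit_in_found, pop_found, pop_in_population)
```

With a boundary on both sides, "pop" no longer matches "population", but it also no longer matches "pops", so keys would need to be listed in the forms that actually occur (or handled with some kind of stemming).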

Safak · November 17, 2021, 5:15am

Hi @takbb,
That’s an interesting angle, and I will try to implement it. I haven’t used the Cross Joiner before, but it looks handy.
Thanks for the time and effort you put in. My actual list of phrases is quite large and the list of keywords is growing every day, so I guess the workflow will take a while to run, but hey, there’s no free lunch.

Thanks for the great start. I’ll improve on it.

takbb · November 17, 2021, 7:14am

Hi @Safak, thank you for the compliments and I hope it works for you, or maybe somebody else will have a better alternative.

Yes, the Cross Joiner can be useful, but with long lists it can quickly produce a very large table, because the output has the number of rows in one table multiplied by the number of rows in the other. It will be interesting to see how well it works. With no looping nodes, though, it should at least be able to process the list reasonably efficiently. If you get something running, I would be interested to know how well it performs.
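
As a rough feel for that growth, here is a tiny Python sketch that counts the pairs a cross join would produce. The table sizes (1000 descriptions, 50 phrases) are purely hypothetical and are not taken from anyone's real data.

```python
from itertools import product

# Hypothetical table sizes, chosen only for illustration.
descriptions = [f"description {i}" for i in range(1000)]
phrases = [f"phrase {j}" for j in range(50)]

# A cross join emits one row per (description, phrase) pair.
pair_count = sum(1 for _ in product(descriptions, phrases))

print(pair_count)  # 1000 * 50 rows
```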

Feel free to post back with a demo as maybe we can think of some optimisations or other improvements. Good luck!
