Multi-step rule application

Hi all,

I’m hoping to get some guidance on complex solution that I am working on.

I have two data sources - one table (lets call it TableA) with >700 columns that contains my data and a second table (lets call it MappingTable) that contains a list of rules/results that need to be applied to the data.

TableA will remain the same structure, same columns but the data in it will change.

MappingTable will contain a series of rules that need to be applied to the data in TableA wth the sole purpose of assigning each row in TableA with the correct ‘Expected Value’

Unfortunately, the rules in the mapping table will change on a consistent basis so I am trying to ensure that my workflow is dynamic and doesn’t include any hardcoded values (i.e. so it will still run correctly no matter how many rules or expected results there are to be applied).

I’ve created a standard format for the mapping table (as can be seen below and in attachment) - the reason it’s in this format is for UX purposes when colleagues are filling in the mapping file.

KNIME table example.xlsx (9.3 KB)

A couple of examples:

Row 2

Check if COL1 = the value in the brackets and append a column with the relevant Expected Value.

i.e. if COL1 = “As” then the Expected value is “Result1_Result2”.
if COL1 = “If” then the Expected value is “AnotherResult”

Row 3
more complex as there are two steps that need to be applied to ascertain the correct Expected Value:

Check if COL2 = “T” AND COL3 = the value in the brackets for STEP 2, then append a column with the relevant Expected Value.

i.e. if COL2 = “T” and COL3 = “4S” then the Expected value is “S”.
if COL2 = “T” and COL3 = “P7” then the Expected value is “O”

I’ve processed the data from the mapping file into columnular data as per below for ease processing but I’m struggling to find the correct loop handling to apply STEP1, then STEP 2 (only if needed) and finally apply the Expected Value:
image

Any guidance that you can provide would be greatly appreciated.

Thanks.
tp

if its “simple” checks (equal to, smaller, larger), you can use the rule engine dictionary version
if its one step only, you check against that only, if its two or more steps, you chain them with AND

you just need to convert them into the format supported. but you can add hundreds of rules. I would recommend to make them as singular as possible (below isnt KNIME syntax but pseudo code):

if col1=A AND col2=B => 1
if col1=A AND col2=C => 2
if col1=A AND col2=D => 3
if col1=A AND col2=E => 3

above works and is easier to construct than
if col1=A AND (col2=D OR col2=E) => 3
or a construct with IN
if col1=A AND col2 IN (D, E) => 3

Hi there,

interesting task! Could you provide an example with more results? Especially for Row3 I do not understand the logic.

Andreas

2 Likes

I couldn’t resist and started to play around. But in fact it would be really helpful if you can provide a wider example.

But lets give it a try:

5 Likes

Hi @fe145f9fb2a1f6b,
Thanks very much for this. I can’t believe i’ve not heard of Rule Engine (Dictionary) node before - it’s unbelievably useful :grinning:

I had already built my rules into the correct format for Rule Engine Node - I was creating a Collection column, setting it to a variable and then using the variable as the ‘Rules’ in a standard Rule Engine node.

Unfortunately, this only allowed a one row rule and when i added within a Loop, it was correctly applying each rule but duplicating the line items (expected behaviour).

I’ve tested Rule Engine (Dictionary) for a basic rule and it’s worked successfully. I’ll do some more testing and update.

Thanks again.
tp

2 Likes

Hi @ActionAndi,
Thanks for so much for your reply and taking your time to pull a workflow together. I noticed that you have also used Rule Engine (Dictionary) so I think we are on to something :grinning:

I am very interested to review your workflow and compare it to what I already have - unfortunately, I am only able to use version 5.2.3. and in your workflow the Row Filter and Row Splitter nodes are Missing (due to your newer version).

So that I can rebuild it with my version of the nodes, please can you send me a screenshot of your configuration for these 3 nodes?

image

I’ve also attached an Excel file with some example data.
Example Data.xlsx (12.7 KB)

I’ve added two extra tabs detailing the expected behaviour of each rule.
For Row3 (as per my previous message), the rule steps require us to check for certain values in COL2 and COL3. I’ve highlighted in yellow where each of the values has been matched by a rule.

However, for the Expected value to be applied, both COL2 and COL3 have to be yellow (i.e. matched). I hope that clarifies.

Thanks
tp

1 Like

Just First Row - Filter: keeps the first row as I didn’t understand your other rows:

The unpivot-Nodes creates some rows with missing values which we do not need:

Then I splitted the 2-row table into two one-ros tables :slight_smile:

1 Like

Thank you for replying. I’ll update the workflow accordingly and review it.

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.