Rule engine

Hello,

can someone help me with the following issue?

I think I have to use the node “rule engine” but I am not sure about the syntax.

I would like to follow the following rules to fill column “Projekt”

  1. Take the first 6 digits from column “PSP-Element”
  2. If column “PSP-Element” starts with “L” then take the first 12 digits
  3. If there are no values in column “PSP-Element” take the value from column “Kostenstelle”

image

Thank you for your answers :slight_smile:

Hi @MrKatze & welcome to the KNIME community

The problem with the -Rule Engine- node is that you cannot have as result an operation on your data and unfortunately this is what you need in your 1st and 2nd conditions.

For this reason, you would better go for a -Column Expressions- node where you can “append an arbitrary number of columns or modify existing columns using expressions” as explained in the node help:

Otherwise yo could do it using a -String Manipulation- node but it is trickier:

Could you please post your example as text data so that we can have a go at it and provide you with workflow solutions ?

Thanks & regards,

Ael

2 Likes

Hi Ael,

thank you for your quick answer.

Please find some dummy data in the txt attached.

test_data.txt (121 Bytes)

BR
Hans

Hallo @MrKatze

My pleasure. Thanks for the data.

Please find below a solution based on the -String Manipulation- node:

join(
substr($PSP-Element$, 0, 6),
substr($PSP-Element$, 6, 6 * (indexOfChars($PSP-Element$, "L") + 1)),
substr($Kostenstelle$, length($PSP-Element$), length($Kostenstelle$))
)

20220308 Pikairos Rule Engine.knwf (36.4 KB)

The first “substr()” function extracts the 1st 6 characters (in red).
The 2nd “substr()” function extracts the 7th to 12th characters (in green) only if the 1st letter is a “L”.
The 3nd “substr()” function extracts all the characters from column “Kostenstelle” (in purple) if column “PSP-Element” is empty.

All this sentences are concatenated using a “join()” function, which eventually provides the expected result.

I recognize this is not the most intuitive solution but it could be split into several -string manipulation- nodes and then use a -Rule Engine- node to make it simpler to understand.

Hope it helps.

Best

Ael

3 Likes

Here is my own solution making use of a little Java Script in the Column Expression node - it’s my first time writing java script and it wasn’t too bad haha. My output is strings of numbers without the A, L, etc.

Series_of_Rules.knar.knwf (14.3 KB)
Screen Shot 2022-03-08 at 12.21.00 PM


Small edit: just realized you can delete the second rule engine as I don’t use it.

4 Likes

Hi @aworker , I think there might be a flaw in your logic :slight_smile:

This is about this:
substr($PSP-Element$, 6, 6 * (indexOfChars($PSP-Element$, "L") + 1))

You will get the expected results only if there is no L, or it’s at the first position. Not seeing all the data, it cannot be guaranteed that it will work for all the data.

For example, if I have these values, you won’t get the expected results:

PSP-Element
A.L677-01
A.77L36-01
F5454L12345

Basically anything with L but not at the first position will not work.

Of course, I don’t know if these data can even occur, but if they do, I don’t think the logic will work.

To enforce substr($PSP-Element$, 6, 6 * (indexOfChars($PSP-Element$, "L") + 1)) to apply only if L is at the beginning, you could modify it to:
substr($PSP-Element$, 6, 6 * (indexOfChars(substr($PSP-Element$, 0, 1), "L") + 1))

I haven’t tested any of this, so I could be wrong all the way :rofl: . But executing this in my head, that’s what I think will happen.

But brilliant way to apply some conditions indirectly

2 Likes

Hi @bruno29a

You are right. Thanks for spotting this. I have modified the string function according to that. I’m posting it here to provide the final right solution:

20220309 Pikairos Rule Engine.knwf (37.6 KB)

Best

Ael

3 Likes

Thank you again!

Could you maybe have a look at the following:

I would like to add now one more condition:

If PSP-Element is empty & Kostenstelle is empty, then there should be the first 6 digits of “Text” in the column “Projekt”, but only if there is a “.” at the second digit of column “Text”

image

Is that somehow possible? Thank you!!

Hans

Hi @MrKatze

My pleasure. Could you please post this new version of your data in csv format so that it is easier to test and provide you with a solution ?

Thanks
Ael

test_data.txt (582 Bytes)

Thank you!

The data you uploaded is not the same as the one in your snapshot. Besides this, the result in your snapshot does not fit the last rule you wanted to add.

Could you please clarify ?

image

yes i modified the data a bit, in order to have for every party of the formula an example.

But the data should be right like this. I also added the desired result in column “Projekt”

So basically:

  1. Take the first 6 digits of column “PSP-Element”,

  2. If “PSP-Element” starts with “L”, take the first 12 digits

  3. If “PSP-Element” is empty, take the value out of “Kostenstelle”

  4. If “PSP-Element” and “Kostenstelle” ist empty, take the first six digits from column “Text”, but only if the second digit of “Text” is a “.”, if there is no “.” at the second digit, then leave it empty. (Like in the Screenehot)

So Point 1-3 is covered with

join(
substr($PSP-Element$, 0, 6),
substr($PSP-Element$, 6, 6 * (indexOfChars(substr($PSP-Element$, 0, 1), “L”) + 1)),
substr($Kostenstelle$, length($PSP-Element$), length($Kostenstelle$))
)

But Point 4, I would like to add.
Hope that clarified a bit!
Thanks again!

Hi @MrKatze , you might need to use the Column Expression.

The String Manipulation is not really made for that, but with the original data and rule, @aworker was able to make it work with some creativity. And may be it can still be done, but from what I see, the difference for Text is that it has data no matter what.

This part worked substr($Kostenstelle$, length($PSP-Element$), length($Kostenstelle$)) because $Kostenstelle$ seems to not have any value when $PSP-Element$ has value. On the other hand, this is not the case with the column Text.

I can’t see how this can be done here. It’s essentially a do-over in a Column Expression.

Basically you wanted to go to the next city, so @aworker built you a car. And now you are asking if you can go to the other country using the car. Not if it’s on the other side of the ocean.

2 Likes

Hi @MrKatze

Thanks for the clarifications in your last post.

I fully agree with @bruno29a. The -String Manipulation- node is good for small direct “string manipulations” but alas it lacks of an “ifThenElse()” function which would have made of it a much more powerful node. In some cases this latter function can be simulated as I did in my example but one should not force to much the exercise since the global final function can easily become very complex to understand and not worthwhile the effort as in this case.

I would hence suggest to go for the -Column Expressions- alternative which has a proper “ifThenElse()” function plus all the capabilities of a true scripting language.

Hope it helps.

Best

Ael

Ps: @bruno29a you are much more skilled than me with the -Column Expressions- node so I leave you to show the way lol :wink:

@aworker I’m not sure I’m more skilled than you with the Column Expressions. I’m sure you do well with it :slight_smile: But if you insist.

@MrKatze just to clarify your rules: I think by “digit” you mean character, and that’s based on your screenshot and also “.” is not a digit. So, I’ve created something that will take the first 6 or 12 characters not digits.

Workflow looks like this:
image

Input data:
image
Note: I’m not sure why your input file is encoded in UTF-16. It’s easily fixed by enforcing the encoding.

Results:
image

This looks the same as your expected results.

These are the rules that were implemented in the Column Expression:

if(isMissing(column("PSP-Element")) && isMissing(column("Kostenstelle")) && substr(column("Text"), 1, 1) == ".") {
    substr(column("Text"), 0, 6)
} else if(isMissing(column("PSP-Element"))) {
    column("Kostenstelle")
} else if(substr(column("PSP-Element"), 0, 1) == "L") {
    substr(column("PSP-Element"), 0, 12)
} else {
    substr(column("PSP-Element"), 0, 6)
}

They have to be implemented in that order so that they do not interfere with each other. Coincidently, they’re exactly in the reverse order of how the rules were given :rofl:

Here’s the workflow: Set values based on existing values.knwf (8.7 KB)

3 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.