Check content and append column

Kazimierz · July 6, 2023, 12:16pm

Hello KNIMErs,

My input data is simple: an ID column and an Input column.
The Input column contains strings separated by ‘/’. The number of separators varies between rows and between input datasets. The strings ‘abc’ and ‘xyz’ should be treated as a predefined list of important strings, and that list should be easy to find and update in the workflow.

The outcome should be a new column with one of four entries:

ABC for input containing one or more ‘abc’
XYZ for input containing one or more ‘xyz’
ABC, XYZ for input containing one or more ‘abc’ and one or more ‘xyz’
empty field for other cases.

This is a sample dataset containing the input and the expected output:

ID	Input	Expected output
1a	dhj / xyz / fri	XYZ
2b	akr / sdj
3c	fre
4d	jft / kyo / xyz / ccc	XYZ
5e	lws / abc / djf / mew / yqa	ABC
6f	ugh / xyz / abc	ABC, XYZ

I thought this would be quite easy to do, but I’m still searching for a solution. So far I’ve tried a loop, String Manipulation (Multi Column), and Cell Splitter, with no success.
Would you be able to help?

Thank you in advance,
Kaz
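
For reference, the expected output in the sample table can be written down as a few lines of Python. This is only a sketch of the logic, not a KNIME workflow: the names `IMPORTANT` and `label` are hypothetical, and it assumes that a match means a whole ‘/’-separated token equal to one of the important strings.

```python
# Hypothetical names; only the sample rows come from the post above.
IMPORTANT = ["abc", "xyz"]  # the predefined list, kept in one place


def label(cell, important=IMPORTANT):
    """Return the upper-cased important strings found in a '/'-separated cell."""
    # Split on the separator and trim spaces so 'dhj / xyz' yields {'dhj', 'xyz'}.
    tokens = {part.strip() for part in cell.split("/")}
    # Keep the order of the predefined list, not the order in the cell.
    hits = [s.upper() for s in important if s in tokens]
    return ", ".join(hits)  # empty string when nothing matches


rows = [
    ("1a", "dhj / xyz / fri"),
    ("2b", "akr / sdj"),
    ("3c", "fre"),
    ("4d", "jft / kyo / xyz / ccc"),
    ("5e", "lws / abc / djf / mew / yqa"),
    ("6f", "ugh / xyz / abc"),
]

for row_id, cell in rows:
    print(row_id, repr(label(cell)))
```

Because the output follows the order of `IMPORTANT`, row 6f gives ‘ABC, XYZ’ even though ‘xyz’ comes first in the input.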

denisfi · July 6, 2023, 12:37pm

Hi @Kazimierz ,

You can use the Rule Engine node to test the condition and output the result as a variable.

I saw from the example that “abc” and “xyz” can appear in the same row. Is it important that they come out in that order? Are these the only conditions, or are there more options? Or is it just a matter of putting them in order?

Kazimierz · July 6, 2023, 12:48pm

Hi @denisfi

You are right, the Rule Engine does the job. The only disadvantage is the number of expressions. With just two strings, ‘abc’ and ‘xyz’, the number of expressions is easy to manage. However, the full list of search strings might run to dozens. Therefore, I would prefer to use a ‘Table Creator’ to store the list of predefined important strings.

Thank you very much for your suggestion,
Kaz
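
To make the concern about the number of expressions concrete: if each possible output needs its own rule (an assumption about how a first-match rule list would be written here), the rule count is one per non-empty combination of strings, i.e. 2^n - 1 for n strings. A small Python sketch, with the hypothetical helper `combined_labels` and made-up extra strings:

```python
from itertools import combinations


def combined_labels(important):
    """List one output label per non-empty subset of the important strings."""
    labels = []
    # Largest subsets first, as a first-match rule list would need them.
    for size in range(len(important), 0, -1):
        for subset in combinations(important, size):
            labels.append(", ".join(s.upper() for s in subset))
    return labels


print(combined_labels(["abc", "xyz"]))  # three labels for two strings
# 'foo', 'bar', 'baz' are invented placeholders to show the growth.
print(len(combined_labels(["abc", "xyz", "foo", "bar", "baz"])))  # 31 for five strings
```

With dozens of strings this grows far beyond what can be maintained by hand, which is why a list-driven approach scales better.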

ipazin · July 6, 2023, 1:32pm

Hello @Kazimierz,

You can make use of the Subset Matcher node. Take a look here:

Br,
Ivan

Kazimierz · July 6, 2023, 1:53pm

Hello @ipazin

Thank you for your suggestion. Let me give it a try.
Best regards,
Kaz

Kazimierz · July 10, 2023, 2:16pm

Hi @denisfi and @ipazin

Your suggestions haven’t worked for me. Nevertheless, thank you both for your time.
A special ‘thank you’ goes to @denisfi for inspiring me to change my approach. I have finally found a solution to my problem: it analyzes the entire content of the cell without caring about which separator is used there.

Happy KNIMEing,
Kaz
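
The final workflow is not shown in the thread, so the following is only one possible reading of “analyzing the entire content of the cell”: a plain substring test per important string, driven by a lookup list. The names `LOOKUP` and `label_by_substring` are hypothetical, and the list stands in for a table of search strings and labels.

```python
# Hypothetical lookup list: (search string, label). Add rows to extend it.
LOOKUP = [("abc", "ABC"), ("xyz", "XYZ")]


def label_by_substring(cell, lookup=LOOKUP):
    """Join the labels of all search strings that occur anywhere in the cell."""
    return ", ".join(lab for needle, lab in lookup if needle in cell)


print(label_by_substring("ugh / xyz / abc"))  # the separator plays no role
print(label_by_substring("ugh;xyz|abc"))      # works with other separators too
print(label_by_substring("abcd / qqq"))       # caveat: 'abc' also matches inside 'abcd'
```

The trade-off is that a substring test also fires when an important string is part of a longer token, so it suits data where that cannot happen.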
