Check content and apend column

Hello KNIMErs,

My input data is simple: ID and input columns.
Input column contains strings separated with ‘/’. Number of separators varies between rows and input datasets. Strings ‘abc’ and ‘xyz’ should be treated as predefined list of important strings and the list should be easy to identify and update in the workflow.

Outcome should be the new column with one of four entries:

  • ABC for input containing one or more ‘abc’
  • XYZ for input containing one or more ‘xyz’
  • ABC, XYZ for input containing one or more ‘abc’ and one or more ‘xyz’
  • empty field for other cases.

This is sample dataset containing input and output data:

ID Input Expected output
1a dhj / xyz / fri XYZ
2b akr / sdj
3c fre
4d jft / kyo / xyz / ccc XYZ
5e lws / abc / djf / mew / yqa ABC
6f ugh / xyz / abc ABC, XYZ

I’ve thought this should be quite easy to do, however I’m still searching for a solution. So far, I’ve tried loop, string manipulation (multiple column), cell splitter with no success.
Would you be able to support?

Thank you in advance,
Kaz

Hi @Kazimierz ,

You can use rule engine node to test the task and put the result as variable.

I saw from the example that the “abc” and “xyz” can be at the same row and it important that to be in order right? are they only this condition or more options to? or just to put it into order?

1 Like

Hi @denisfi

You are right, rule engine makes the job. The only disadventage is number of expressions. In case of 2 strings ‘abc’ and ‘xyz’ it is easy to manage limited number of expressions. However, the full list of strings searched might be dozens. Therefore, I would prefer to use ‘Table Creator’ to store the list of predefined important strings.

Thank you very much for your suggestion,
Kaz

1 Like

Hello @Kazimierz,

you can make use of Subset Matcher node. Take a look here:

Br,
Ivan

1 Like

Hello @ipazin

Thank you for your suggestion. Let me give it a try.
Best regards,
Kaz

Hi @denisfi and @ipazin

Your suggestions haven’t worked for me. Nevertheless, thank you both for your time.
Special ‘thank you’ goes to @denisfi for inspiring me to change my approach. Finally, I have found a solution to my problem and the solution refers to analyzing the entire content of the column/cell without caring about potential separator used there.

Happy KNIMEing,
Kaz

3 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.