Remove strings from one column that are contained in another column

Hi,
I wanna remove substrings from the first column that are in second column.
Example:

I have a table that looks like this:

| I have a cat | a cat |
| There is a dog | Wanna
| Wanna dog and a cat | have

As a result my table would be like:


| I have | a cat |
| There is a dog | Wanna
| dog and a cat | have

It should search for all matches of a values that are in second column and replace them.

Hi @mkubita , and welcome to the forum!
If I understand you correctly, you wish to search the string value in the first column for the string value in the second column and wherever found, remove the matching string from the first column?
If so, the 3rd row of your example result table should be:
| Wanna dog and a cat | have
Is that correct?

In any case, this could be accomplished several ways. Perhaps the most straightforward would be to use the String Manipulation node. You could select the option to replace the value in the first column, and enter the expression as in the screenshot below.

If instead you want to use the value from the second column that is in the previous row, you could use precede the String Manipulation node with the Lag column node. I hope this helps!
-Don

Hi,
thanks for the answer. I did not explain it correctly, so I may clarify my question.

I have two columns in my dataset and the second column contains the strings that should be replaced in the first column - but the thing is that that the order of rows does not matter.

It means that if I have in me second column strings like that: dog / cat / horse and in my first column I have phrases like: dog likes cat / I like you. Firsly it should search for ‘dog’ in first column and remove this strings, then it should search for ‘cat’ and remove it, then for ‘horse’. As a result my dataset should look like: likes / I like you

Hi @mkubita

I don’t understand the logic.

row1 → I have a cat => expected result: a cat
but in row 3, column2 there is: have => I would expect for row 1 just: I

result: =>
row 1 → I
row 2 → there is dog
row 3 → dog and

gr. Hans

1 Like

Yes, you are correct @HansS. I made mistake

1 Like

So, do you know how to do this?

I will clarify, because I made mistakes.

I have a table that looks like this:

| I have a cat | a cat |
| There is a dog | Wanna
| Wanna dog and a cat | have

As a result my table would be like:


| I | a cat |
| There is a dog | Wanna
| dog and | have

Hi @mkubita

Yes I have a solution. I used the KNIME Text Processing nodes for this, see
KNIME_project3.knwf (37.1 KB)


I created a custom StopWord list from column2 to filter the words in column1.
Schermafdruk van 2023-05-31 09-58-43
gr. Hans

3 Likes

Hi @mkubita

So, does my suggestion helps you out?

1 Like

Yes, thanks! That works great

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.