How to replace partial strings using a dictionary?

final edit : added image of final workflow

Hi,

I’m almost giving up what looks like a simple task. Maybe you can help a newbie ?

I have a file like this

original table

CategoryA

NameA

CategoryB

Name B

Cat1

name1

Cat1

name1

Cat1

name2

cat2

oldname2

Cat2

oldname1

cat1

name2

Cat2

oldname2

cat2

oldname1

And a dictionary like this

Oldname (lookup reference)

newname

newcategory

Oldname1

newname1

newCat2

Oldname2

newname2

newCat3

 

I need to change « oldname1 » to a specific new name, described in a dictionary (~500 entries) AND change its category to a new category name, to get a file like this:

destination table

CategoryA

NameA

CategoryB

Name B

Cat1

name1

Cat1

name1

Cat1

name2

newcat3

newname2

newCat2

newname1

cat1

name2

newCat3

newname2

newcat2

newname1

Also, I need to do this in ~40 pairs of columns (category A, category B and so on).

 

I can’t do it.

I tried to combine all columns and then do a “string replacer –dictionary”. Then I realised it does not work on string that are part of longer strings. For the same reason I can’t use cell replacer.

Also

Rule engine would need a terribly long rule to write, and then apply in a loop 40 times…seems awful. Worst part being the dictionary will be frequently updated.

 

Any ideas ? Any help greatly  appreciated.

Thanks a lot,

alix

Hi Alix,

unfortunately there is nothing like a multicolumn cell replacer or string manipulation node (yet).

You could concatenate Cell Replacer nodes instead to update all name-columns, but as your dictionary has no reference to the old category name, you will not be able to update the categories the same way. To update the categories you can also use a cell replacer, make a lookup on every name-column for the now updated names and append the new category as a new column. Next you can update the original category column with the values from the just appended column with a Rule Engine. The solution is not too fancy but does the job though. For clarification I added a workflow for your example data.

Hope that helps.

Cheers,
Marten

Marteen,

I owe a great big thanks for taking time for my issue. We both come to the conclusion Knime doesn't facilitates using a dictionnary to change strings inside of longer strings.

I went through your idea, which i would never had thought of. Though, i don't know how to scale it with the 75 pairs of column I have to modify.

In the meanwhile, I explored another way, by combining columns by pairs using java snippet and then use cell replacer in a loop. It's still an ugly workflow to debug, but seems to do the trick.

Draft worklow attached, any comment or questions welcomed.

Regards,

 

good morning!

Maybe this approach can give you a hint how to solve it:
https://www.knime.com/forum/knime-general/solved-select-replace-for-the-whole-table

Good evening ;-)

using java snippet to modify string is indeed the usual straightforward way, though I cannot use it here, because the replacement is dictionary based . With more than 500 replacement possbilities, the java code would be quite long !

thanks a lot for your help,

Please see this example usefully provided by Knime to do exactly this: