How to more elegantly replace strings in bulk using wildcards?

josephelsbernd · June 8, 2020, 10:18pm

Hi everyone,

I need some help finding a better way to search a cell using a string with wildcards and, when it matches, replace the entire entry with what I want.

I have a table with a list of disease names that I need to clean up. Below is an example of the sort of thing I am doing.

Current method:
String Replacer node with “huntington disease”

Input:
huntington disease
huntington disease HD
huntington disease and associated diseases

Output:
huntington disease
huntington disease
huntington disease

Right now I have 60 “String Replacer” nodes, each coded to a different disease. There's gotta be a better way to do this, right? It's really cumbersome to manage all of these separate nodes.

izaychik63 · June 8, 2020, 10:48pm

If your current rules are pretty simple, say LIKE or MATCHES, you can put the rules in a file and use

node

scapuzzi · June 9, 2020, 2:04am

You could also find the minimal string needed to identify your disease and then use a wildcard pattern in the “String Replacer” node. Hard to give a better solution without knowing more about the dataset.
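
For anyone who wants to see the idea outside KNIME, here is a minimal Python sketch of a table of wildcard patterns, each mapped to the name it should be replaced with. It is only an illustration, not a KNIME feature: the `PATTERNS` table and the `clean` helper are made-up names, and the second pattern is a hypothetical entry added to show that the table can grow.

```python
from fnmatch import fnmatchcase

# Hypothetical lookup table: wildcard pattern -> canonical disease name.
# Only the first entry comes from the example in this thread.
PATTERNS = [
    ("huntington*", "huntington disease"),
    ("*parkinson*", "parkinson disease"),  # made-up second entry
]

def clean(value, patterns=PATTERNS):
    """Return the canonical name of the first matching pattern, else the value unchanged."""
    lowered = value.strip().lower()
    for pattern, canonical in patterns:
        if fnmatchcase(lowered, pattern):
            return canonical
    return value  # nothing matched: keep the original entry

inputs = [
    "huntington disease",
    "huntington disease HD",
    "huntington disease and associated diseases",
]
cleaned = [clean(v) for v in inputs]
print(cleaned)
```

Keeping the patterns in one table means a new disease is a new row rather than a new node.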

ScottF · June 9, 2020, 3:22pm

Have you thought about a similarity based approach? For example:
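
As a rough picture of what a similarity-based approach can look like, here is a small Python sketch using the standard library's `difflib`. It is not necessarily the approach meant above and it is not a KNIME workflow; the `CANONICAL` list, the `snap` helper, and the 0.8 cutoff are all assumptions chosen for illustration.

```python
from difflib import get_close_matches

# Hypothetical list of clean names that entries should be snapped to.
CANONICAL = ["huntington disease"]

def snap(value, canonical=CANONICAL, cutoff=0.8):
    """Return the closest canonical name if it is similar enough, else the value unchanged."""
    matches = get_close_matches(value.strip().lower(), canonical, n=1, cutoff=cutoff)
    return matches[0] if matches else value

print(snap("huntington disease HD"))
print(snap("huntingtons disease"))
print(snap("asthma"))  # unrelated entry, expected to stay as it is
```

Entries with a long tail, such as "huntington disease and associated diseases", score lower against the short canonical name and may fall below the cutoff, so the threshold would need tuning against the real data.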

josephelsbernd · June 9, 2020, 8:46pm

izaychik63 - I tried the rule engine but it deletes entries that don’t need cleaning. So my column I’m trying to clean ends up depopulated.

scapuzzi - That’s the method I have been using. The problem is I have ~60 individual nodes (one for each search term). Is there a way to use a table of search terms with wildcards?

ScottF - Thanks for the suggestion. I must admit I am not sure how to implement this to do what I need, but I can see how something like this would be powerful.

izaychik63 · June 9, 2020, 8:51pm

@josephelsbernd, to keep records you need to add one more rule as a last one
TRUE => $Input$
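
To see why the last rule matters, here is a small Python sketch of first-match-wins rule evaluation. It only mimics the behaviour described in this thread and is not how KNIME is implemented; `apply_rules`, the rule list, and the "asthma" entry are hypothetical.

```python
from fnmatch import fnmatchcase

def apply_rules(value, rules):
    """Evaluate (condition, outcome) pairs top to bottom; the first match wins."""
    for condition, outcome in rules:
        if condition(value):
            return outcome(value)
    return None  # no rule fired, so the result is a missing value

# Hypothetical rule list with a single disease pattern.
rules = [
    (lambda v: fnmatchcase(v, "huntington*"), lambda v: "huntington disease"),
]

# Same rules plus a final catch-all, mirroring TRUE => $Input$.
rules_with_default = rules + [(lambda v: True, lambda v: v)]

column = ["huntington disease HD", "asthma"]  # "asthma" is a made-up entry
print([apply_rules(v, rules) for v in column])
print([apply_rules(v, rules_with_default) for v in column])
```

Without the catch-all, any entry that matches no rule comes back empty; with it, unmatched entries pass through unchanged.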

josephelsbernd · June 9, 2020, 10:38pm

@izaychik63 That is great! It's working now. Thanks a ton!
