Hi,
I have a dataset with a recurring issue. Whenever a string starts with two capital letters in a row, the first capital letter should be removed.
Example:
George Washing
LAbraham Lincoln
Ronald Reagan
The leading “L” in “LAbraham Lincoln” should be removed from the string.
I'm wondering whether there is a way to implement this with removeChars in the String Manipulation node, or in the Column Expressions node?
Thanks,
grossa8
Hey @grossa8, I was able to accomplish this in the following way:
Use a String Manipulation node with regexMatcher: regexMatcher($column1$, "^[A-Z]{2}.*")
Use a Row Filter node and set value: equals ... True
Use another String Manipulation node with substr($column1$, 1) to drop the first character
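For anyone who wants to check the logic outside KNIME, here is a minimal Python sketch of the same match-then-strip idea. The function name strip_leading_capital and the sample list are only illustrative assumptions, and unlike the Row Filter step it leaves non-matching rows in place unchanged instead of filtering them out.

```python
import re

# Same pattern as in the regexMatcher call: the string starts with two capitals.
TWO_CAPITALS = re.compile(r"^[A-Z]{2}.*")


def strip_leading_capital(value: str) -> str:
    """Drop the first character when the string starts with two capitals."""
    if TWO_CAPITALS.fullmatch(value):
        # Rough equivalent of substr($column1$, 1): keep everything from index 1.
        return value[1:]
    return value


names = ["George Washing", "LAbraham Lincoln", "Ronald Reagan"]
cleaned = [strip_leading_capital(name) for name in names]
print(cleaned)  # ['George Washing', 'Abraham Lincoln', 'Ronald Reagan']
```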
Let me know if that helped you
Kind regards, Ricci
takbb
October 27, 2024, 12:36pm
Hi @grossa8, this can also be achieved using the String Replacer node with the following regex:
pattern:
[A-Z]([A-Z].*)
replacement text:
$1
And configured as follows:
The regex pattern detects any string that begins with two capitals, and then “captures” the string from the second capital onwards.
This should work fine, because if the regex pattern doesn’t match (e.g. if the first two characters are not both capital letters), no replacement will be made.
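To try the pattern outside KNIME, here is a small Python sketch of the same capture-and-replace idea. It assumes the whole string has to match the pattern, so the sketch adds explicit ^ and $ anchors, and Python writes the replacement as \1 where the node uses $1. The function name drop_first_of_two_capitals is just an illustrative choice.

```python
import re


def drop_first_of_two_capitals(value: str) -> str:
    """Replace the string with capture group 1 when it starts with two capitals."""
    # Group 1 holds everything from the second capital onwards.
    # If the pattern does not match, re.sub returns the input unchanged.
    return re.sub(r"^[A-Z]([A-Z].*)$", r"\1", value)


for name in ["George Washing", "LAbraham Lincoln", "Ronald Reagan"]:
    print(drop_first_of_two_capitals(name))
# George Washing
# Abraham Lincoln
# Ronald Reagan
```

Because a non-matching string passes through untouched, no separate filtering step is needed with this approach.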
Thank you Ricci, that worked!