It looks like your task is fairly complex for string cleaning. However, you can also use the Rule Engine node to replicate the code snippet in KNIME.
As a preliminary step, you would also need to provide a dictionary mapping to your component, or download it from somewhere, to keep things moving.
As you have alluded to, this component makes use of a Java Snippet to do some of the string filtering, and it may give some inspiration. I don’t think you’ll find a “standard nodes” solution to what you are trying to do, or if you do, it’ll probably be so convoluted you’ll wish you’d just written it in Java.
By all means open it up and take a look inside!
Using regex with Unicode character classes rather than a specific ASCII character subset is probably the way to go for things like “UKEULetters” and so forth.
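To show what I mean, here is a minimal Java sketch comparing an ASCII-only character class with the Unicode letter class `\p{L}`. The class and method names are made up for illustration and are not taken from the component, so treat it as a starting point rather than the actual implementation.

```java
import java.util.regex.Pattern;

public class UnicodeLetterFilter {
    // ASCII-only: everything outside A-Z / a-z is removed, including accented letters.
    private static final Pattern NOT_ASCII_LETTER = Pattern.compile("[^A-Za-z]");

    // Unicode-aware: \p{L} matches any character in the Unicode "Letter" category.
    private static final Pattern NOT_UNICODE_LETTER = Pattern.compile("[^\\p{L}]");

    static String keepAsciiLetters(String s) {
        return NOT_ASCII_LETTER.matcher(s).replaceAll("");
    }

    static String keepUnicodeLetters(String s) {
        return NOT_UNICODE_LETTER.matcher(s).replaceAll("");
    }

    public static void main(String[] args) {
        // "Zoë Müller-42", written with escapes so the source file encoding does not matter.
        String input = "Zo\u00EB M\u00FCller-42";
        System.out.println(keepAsciiLetters(input));   // accented letters are lost
        System.out.println(keepUnicodeLetters(input)); // accented letters are kept
    }
}
```

One caveat: `\p{L}` only matches precomposed letters. If the input contains decomposed characters (a base letter followed by a combining mark), you would probably want to normalise it first or also allow `\p{M}`.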
Yours is also great. I did find a small bug that crept in, though. The String Manipulation nodes for lower case and Pascal case refer to the column name $output$ with a lower-case “o” instead of $Ouput$. How ironic that it should be the case that causes trouble.
But nice one, and feel free to tag me if you ever want to collaborate on building components.