Regular expressions are a powerful way to clean and prepare data, especially text and web data. It would be excellent to have an advanced regular expression node in KNIME.
I use a commercial RegEx tool (PowerGREP) for my task, but it is based on GREP (http://en.wikipedia.org/wiki/Grep), which is fast, powerful and publicly available.
Regular expressions can be used for searching, filtering, replacing, merging and splitting data with complex patterns.
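To make those operations concrete, here is a minimal sketch in plain Java using `java.util.regex`. The class name `RegexBasics`, the helper names and the sample strings are made up for illustration; this is not how any KNIME node is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class RegexBasics {

    // Searching: collect every substring that matches the pattern.
    static List<String> findAll(Pattern p, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = p.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    // Filtering: keep only the rows that contain at least one match.
    static List<String> filter(Pattern p, List<String> rows) {
        List<String> kept = new ArrayList<>();
        for (String row : rows) {
            if (p.matcher(row).find()) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("\\d+");

        System.out.println(findAll(digits, "order 12, item 345"));

        System.out.println(filter(digits, Arrays.asList("abc", "a1c")));

        // Replacing: substitute every run of digits with '#'.
        System.out.println(digits.matcher("a1b22").replaceAll("#"));

        // Splitting on ',' or ';' with optional surrounding whitespace,
        // then merging the pieces back together with '|'.
        String[] parts = Pattern.compile("\\s*[,;]\\s*").split("a , b;c");
        System.out.println(String.join("|", parts));
    }
}
```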
You can already use regular expressions in the Row Filter node as well as in the String Manipulation node. If this is not enough, the Java Snippet node is the Swiss Army knife.
Hi, thanks for the reply.
Those are very basic solutions, but I need, for example, to find 40,000 RegEx patterns and replace them in my data (about 300k rows).
Row Filter can only find one RegEx pattern at a time, and String Replace (Dictionary) matches literal strings, not RegEx patterns.
40k regular expressions sounds like a lot... You could use two loops, one over the regexes and the other over the data, but I suspect that this will be very slow. I'll check whether a dictionary replacer with regular expressions is of general use.
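For reference, the two-loop idea could look roughly like the sketch below in plain Java (for example as a starting point for a Java Snippet). The class `RegexDictionaryReplacer`, its method names and the sample rules are hypothetical, not an existing KNIME node or API. The one assumption that matters for speed is that each pattern is compiled once up front and then reused for every row, instead of being recompiled inside the inner loop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical regex "dictionary replacer", for illustration only.
class RegexDictionaryReplacer {
    private final List<Pattern> patterns = new ArrayList<>();
    private final List<String> replacements = new ArrayList<>();

    // Compile every regex exactly once; the map's iteration order
    // decides the order in which the rules are applied.
    RegexDictionaryReplacer(Map<String, String> rules) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            patterns.add(Pattern.compile(rule.getKey()));
            replacements.add(rule.getValue());
        }
    }

    // Inner loop: run every rule over a single cell value.
    // Replacements use regex syntax, so "$1" refers to group 1.
    String apply(String input) {
        String result = input;
        for (int i = 0; i < patterns.size(); i++) {
            result = patterns.get(i).matcher(result).replaceAll(replacements.get(i));
        }
        return result;
    }

    // Outer loop: run the whole dictionary over every row.
    List<String> applyAll(List<String> rows) {
        List<String> out = new ArrayList<>(rows.size());
        for (String row : rows) {
            out.add(apply(row));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("\\bcolou?r\\b", "COLOR");
        rules.put("(\\d{4})-(\\d{2})-(\\d{2})", "$3.$2.$1");
        rules.put("\\s+", " ");

        RegexDictionaryReplacer replacer = new RegexDictionaryReplacer(rules);
        System.out.println(replacer.apply("colour  of 2014-03-07"));
    }
}
```

Even with precompiled patterns this is still 40,000 x 300,000 = 12 billion pattern applications for the numbers mentioned above, so it would probably remain slow. If a replacement should be taken literally rather than as a regex replacement string, it can be wrapped with `Matcher.quoteReplacement`.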