Let's say I have a column with a list of numbers attached to units, and I would like to separate the numbers from the units so that I end up with two new columns. Is there a simple way to achieve that?
The numbers have different lengths, of course, and the units differ too (and may themselves contain numbers). So basically what I'm trying to do is split the column at the first letter that shows up.
Bug-free alternative: go and ask on StackOverflow with #Java and #Regex tags, this usually solves such challenges in minutes! :-)
Better yet: search StackOverflow for existing solutions: http://stackoverflow.com/questions/29434666/how-to-parse-and-capture-any-measurement-unit/29434667#29434667 (this one is for JavaScript, but I guess it should work with Java too).
(I think KNIME does not change the regular expressions in any way, but I could be wrong.)
I sometimes get confused by regex dialects, though: Perl regex is mildly different from Java regex, etc., mostly due to language-specific reserved characters. Hence the recommendation to ask specifically for Java regex.
You're right, it does work when I untick the "ignore case" option. For now this solution is sufficient for me, since I don't understand much about regex. Thank you very much!
Edit: I actually toyed a bit with the following website to try to understand regex a bit better:
https://regex101.com/r/M8LXnI/1
There I tried the regex both case-sensitive and case-insensitive, and I get the same issue when I choose case-insensitive. I used >180ppm/6H as an example. If I understand this correctly, the reason might be that (.*[0-9]) allows letters before numbers and ([A-Za-z].*) also allows a single capital letter, and therefore >180ppm/6H gets split into >180ppm/6 and H. So I don't think it's a KNIME bug. However, thank you very much again!
You are absolutely right in your interpretation - .* allows any characters before the numeral (including other numerals!), whereas [A-Za-z] matches a single alphabetic character, either upper or lower case, but not '>'.
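For anyone who wants to reproduce this outside regex101, here is a minimal plain-Java sketch of the behaviour discussed above. The first pattern is the one from this thread. The second one, along with the class name UnitSplitter and the helper split, is only a hypothetical illustration of "split at the first letter" and is not something KNIME provides. It assumes the whole cell has to match the pattern and has not been tested inside a KNIME node.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UnitSplitter {
    // Pattern from the thread: ".*" is greedy, so the split lands on the
    // LAST digit that is directly followed by a letter.
    static final Pattern GREEDY = Pattern.compile("(.*[0-9])([A-Za-z].*)");

    // Hypothetical alternative: the first group may not contain any letter,
    // so the split lands in front of the FIRST letter.
    static final Pattern FIRST_LETTER = Pattern.compile("([^A-Za-z]*)([A-Za-z].*)");

    // Returns {number part, unit part}, or null if the whole input does not match.
    static String[] split(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String sample = ">180ppm/6H";
        // prints ">180ppm/6 | H"
        System.out.println(String.join(" | ", split(GREEDY, sample)));
        // prints ">180 | ppm/6H"
        System.out.println(String.join(" | ", split(FIRST_LETTER, sample)));
    }
}
```

Note that the second pattern also accepts a value with no number at all (the first group is then empty) and returns no match for a value with no letters, so such rows would need separate handling.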