Hi,
I am a newbie in KNIME and I have an issue extracting a number from a string. The numbers that I need are in different positions within each string. Samples are below:
81727904HG ASDHASD 275 EE
Resource - 70331103
Ma 922803694FF400g*24
Do you have more examples of strings with numbers you’re trying to extract? From what you’ve provided, it looks like a regex that matches numbers with 4 or more digits would suffice.
I used the Regex Extractor node (part of the Palladian collection) with the expression \d{4,20}, which finds runs of 4 to 20 consecutive digits. If you have more complicated data, then this may need to be modified.
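For anyone who wants to try the expression outside KNIME, here is a minimal Python sketch of the same idea. It only uses the \d{4,20} pattern and the three sample strings from this thread; the function name extract_codes and the "no code here" row are made up for illustration, and this is not the Regex Extractor node itself.

```python
import re

# Pattern from the post above: a run of 4 to 20 consecutive digits.
CODE_PATTERN = re.compile(r"\d{4,20}")


def extract_codes(text):
    """Return every run of 4 to 20 digits found in text (hypothetical helper)."""
    return CODE_PATTERN.findall(text)


samples = [
    "81727904HG ASDHASD 275 EE",
    "Resource - 70331103",
    "Ma 922803694FF400g*24",
    "no code here",  # made-up row without a code
]

for s in samples:
    # Short numbers such as 275, 400 and 24 are skipped (fewer than 4 digits).
    print(s, "->", extract_codes(s))
```

Rows without a code simply give an empty result, so they can be filtered out or left as missing afterwards.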
I have attached a sample of the strings. Actually, there are rows that do not contain the “code” that I need.
The file has two columns, a short text and a code. In the short text column, there are items that include the codes.
Hi @AlyKnime , you should first confirm whether the logic proposed by @elsamuel makes sense, that is, “finds numbers with 4 to 20 digits”. You can see why he requires more than 3 digits: if you extract the numbers from “81727904HG ASDHASD 275 EE” without any restriction, you get 81727904 and 275. Similarly, for “Ma 92280369FF400g*24” you get 92280369, 400 and 24.
So, do the numbers you are targeting always have a specific range of digits (minimum and maximum)? Are they always between 4 and 20 digits?
Be advised, though, that if your data may contain 2 sets of 8 digits, then both sets would be returned. For example, “81727904HG ASDHASD 27527527 EE” would return 81727904 and 27527527.
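To make that caveat concrete, here is a small Python sketch using the same \d{4,20} expression. The helper names all_codes and first_code are hypothetical, and keeping only the first match is just one possible rule, assuming the code you want always appears first in the string.

```python
import re

CODE_PATTERN = re.compile(r"\d{4,20}")


def all_codes(text):
    """Every run of 4 to 20 digits, in order of appearance."""
    return CODE_PATTERN.findall(text)


def first_code(text):
    """Only the first run of 4 to 20 digits, or None if there is none.

    Assumption: the wanted code is the first long number in the string.
    """
    match = CODE_PATTERN.search(text)
    return match.group(0) if match else None


text = "81727904HG ASDHASD 27527527 EE"
print(all_codes(text))   # ['81727904', '27527527']
print(first_code(text))  # 81727904
```

If the wanted code is not always the first one, you would need a different rule (for example an exact digit count), which depends on your data.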
It’s done via Regex Extractor because it’s straightforward for what you need: extracting the numbers. You cannot do this in 1 operation with String Manipulation.
Is Regex Extractor an issue because you don’t have Palladian extensions?
EDIT: Correction: After viewing what @mehrdad_bgh has done, it looks like it can be done in 1 operation with String Manipulation, using regexReplace. @AlyKnime , the regex probably needs to be adjusted. I’m not too strong with regex, so I can’t really help with the expression.
Hi @mehrdad_bgh , just wanted to ask why it captures 17157381 instead of 81715738, when the “8” comes before the “1”? I’m just confused about the logic. Sorry, and thanks for your help!
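A possible explanation, offered as a guess because the exact expression is not quoted in this thread: if the regexReplace pattern has the form .*(\d{8}).* and the string contains a 9-digit number, the leading .* is greedy. It first swallows as much as it can and only gives characters back until 8 digits fit, so the group ends up holding the last 8 digits rather than the first 8. The Python sketch below shows that behaviour. The input string and both patterns are assumptions for illustration, and Python's re.sub is only a stand-in for regexReplace.

```python
import re

# Hypothetical input containing a 9-digit number.
text = "AB 817157381 CD"

# Greedy prefix: ".*" takes as much as possible, so the 8-digit group
# is matched as far to the right as it can be.
greedy = re.sub(r".*(\d{8}).*", r"\1", text)

# Lazy prefix: ".*?" takes as little as possible, so the 8-digit group
# is matched as far to the left as it can be.
lazy = re.sub(r".*?(\d{8}).*", r"\1", text)

print(greedy)  # 17157381
print(lazy)    # 81715738
```

If that is what is happening, switching the leading .* to .*? (or matching the full run with something like \d{4,20}) would be worth trying.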