forumnFeed.knwf (65.5 KB)
Hi Guys, I have been struggling to extract 5-digit numbers from filenames. It works in some cases and breaks in others.
Hi @shubhamss,
I think the Pattern.compile entry is wrong: \b matches a word boundary, but the underscore (_) is a word character (word character: [a-zA-Z_0-9]), so there is no boundary between an underscore and a digit. I’m using \D (non-digit character) outside the group brackets to separate the 5-digit matches. I’m also using a String list to build the string array.
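To illustrate the point about \b and the underscore, here is a minimal standalone Java sketch. It is not the code from the attached workflow, which is not shown in this thread. The class name, the sample filename, and the lookaround pattern `(?<!\d)(\d{5})(?!\d)` are assumptions chosen for illustration; the lookarounds play the same role as \D outside the group, but also work at the start or end of the string and when two numbers share a single separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; not taken from the workflow in this thread.
class FiveDigitExtractor {
    // \b needs a transition between a word and a non-word character.
    // '_' is itself a word character, so "_12345_" contains no boundary.
    static final Pattern WORD_BOUNDARY = Pattern.compile("\\b(\\d{5})\\b");

    // Assumed alternative: require "not a digit" on both sides of the group.
    static final Pattern NON_DIGIT = Pattern.compile("(?<!\\d)(\\d{5})(?!\\d)");

    // Collect every group-1 match of the pattern in the input string.
    static List<String> extract(Pattern pattern, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            matches.add(m.group(1));
        }
        return matches;
    }

    public static void main(String[] args) {
        String fileName = "report_12345_67890_v2.txt"; // made-up example filename
        System.out.println(extract(WORD_BOUNDARY, fileName)); // prints []
        System.out.println(extract(NON_DIGIT, fileName));     // prints [12345, 67890]
    }
}
```

Runs of six or more digits are skipped on purpose, since the digit on either side makes the lookarounds fail.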
The following workflow is a little bit shorter, and the String to Number node only matches the Split Value columns.
I hope it helps
Andrew
Thanks, Andrew, it now seems to pull the 5-digit numbers from the string, but some unwanted symbols are still left in the data. Also, I am not using KNIME 4.0.
Where ArrInt is the size of each potentially 5 digit char. The code used is as follows:
Hi @shubhamss,
you are using a similar Pattern.compile entry as in your first workflow.
Best regards
Andrew
Thanks Andrew, that works. One last question: an email column is being shown like this:
but when I copy the cell value and paste it into a Rule Engine node, it gives me something like this:
Would you happen to know why?
It looks like different character sets are involved: US-ASCII uses 1 byte per character, whereas UTF-8 (1 to 4 bytes) and UTF-16 (2 or 4 bytes) can use more.
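One way to check this guess is to list the non-ASCII code points in a cell value. The sketch below is only an illustration: the class name, the sample email value, and the two invisible characters in it (a zero-width space and a no-break space) are made up, since the actual symbols are not shown in this thread.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for inspecting a suspicious cell value.
class CharsetInspector {
    // List every code point above the US-ASCII range as "U+XXXX".
    static String describeNonAscii(String value) {
        StringBuilder sb = new StringBuilder();
        value.codePoints()
             .filter(cp -> cp > 127)
             .forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    // Drop everything outside US-ASCII (this also drops legitimate accented letters).
    static String stripNonAscii(String value) {
        return value.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        // Made-up value: zero-width space before '@', no-break space at the end.
        String cell = "john\u200B@example.com\u00A0";
        System.out.println(describeNonAscii(cell));                       // U+200B U+00A0
        System.out.println(cell.length());                                // 18 chars
        System.out.println(cell.getBytes(StandardCharsets.UTF_8).length); // 21 bytes
        System.out.println(stripNonAscii(cell));                          // john@example.com
    }
}
```

Stripping all non-ASCII characters is a blunt fix. If the data can contain legitimate non-ASCII text, it is safer to remove only the specific code points that the first method reports.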
Hi there @shubhamss,
to avoid copy paste maybe you can use Rule Engine (Dictionary) instead
Br,
Ivan
I was just copy-pasting as a way to check why cells containing a “@” and a “.com” were not being extracted. The reason is that there are unwanted symbols, possibly due to different character sets being in place.
Hi,
ok. I get it now. Did you solve it or have a workaround? How did you obtain that column btw?
Br,
Ivan