I need to extract the 8-digit number sequences, such as 12345678 and 12345677 in this example, into a new column, but sometimes there are also 4 number sequences like this.
If you use it in, e.g., the Expressions node, you need to escape the backslashes:
\\b\\d{8}\\b
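For anyone who wants to check the pattern outside KNIME first, here is a minimal Python sketch. It is not the node's implementation, and the function name `extract_eight_digit_runs` and the sample text are made up for illustration. In a Python raw string the backslashes are single, so the doubled form above is only needed where the string itself has to be escaped.

```python
import re

# \b is a word boundary, so only runs of exactly 8 digits match;
# shorter runs (e.g. 4 digits) and longer runs (e.g. 9 digits) are skipped.
PATTERN = re.compile(r"\b\d{8}\b")


def extract_eight_digit_runs(text: str) -> list[str]:
    """Return every standalone 8-digit sequence found in text (hypothetical helper)."""
    return PATTERN.findall(text)


# Made-up sample text, not taken from the original data.
sample = "ids 12345678 and 12345677, short 1234, long 123456789"
print(extract_eight_digit_runs(sample))  # ['12345678', '12345677']
```

Note that `\b` treats letters and underscores as word characters too, so a number glued directly to a letter (e.g. `abc12345678`) would not match.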
The Regex Extractor node is handy because it outputs all matches easily. I could not quite get this to work with the Expressions node using the regex_extract function; maybe someone else can have a go at that to avoid using Palladian.
I group by the sample column and then aggregate using the concatenate function. You can also change that function to set, list, etc., or just keep each “Full Result” in its own row.
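The extract-then-concatenate idea can be sketched in plain Python as well. This only illustrates the logic, under the assumption that each row has a key and a text column; the row data, the function name `concat_matches_by_group`, and the separator are all hypothetical.

```python
import re
from collections import defaultdict

PATTERN = re.compile(r"\b\d{8}\b")

# Made-up rows of (group key, text); the real table layout may differ.
rows = [
    ("A", "order 12345678 shipped"),
    ("A", "ref 1234, follow-up 12345677"),
    ("B", "no long numbers here, only 4321"),
]


def concat_matches_by_group(rows, sep=", "):
    """Collect all 8-digit matches per key and join them into one string."""
    grouped = defaultdict(list)
    for key, text in rows:
        # Touching grouped[key] keeps keys without any match in the result.
        grouped[key].extend(PATTERN.findall(text))
    return {key: sep.join(matches) for key, matches in grouped.items()}


result = concat_matches_by_group(rows)
print(result)  # {'A': '12345678, 12345677', 'B': ''}
```

Swapping `sep.join(matches)` for `set(matches)` or `matches` would correspond roughly to the set and list aggregations.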
Thank you for your reply. I am working with KNIME 5.2.5 and I can't find the “Regex Extractor” node. My workflow looks like this, if that helps to understand it better. The data was in a JSON, and I extracted it with the JSON Path node into a new column as a string.
Here’s the link to the installation guide as well:
I dodged it for a while, but never regretted installing it :-).
As long as your column is of type string, it should work. You’d only have to point the Regex Extractor node to the correct column (in my example it is the column “sample”).
It seems to me that the structure of the JSON file is not standard: the square brackets don’t match up, and there’s a comma missing between the groups. The structure should be like this (mind the type of quotation marks):