Regex Split: splitting a string without breaking words

I need to split a string if it exceeds a certain length, but the split should not break a word inside the string.
For example:
Sting_output_column= “knime analytical platform Split Regex is used for splitting the string into different rows”

I need to break “Sting_output_column” into different rows of a certain length (20). How can I achieve this without breaking a word?

Hi @mathi,

I think the easiest and cleanest solution in this case would be a Java Snippet instead of regex.
While I love regex, this problem is easier to solve with a loop that checks for newlines and the maximum number of characters per line.

Something like this:
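
A minimal sketch of such a loop in plain Java, offered as an illustration under assumptions rather than a tested KNIME snippet. The names `WordWrap` and `wrap` are hypothetical; inside a Java Snippet node you would read the input from your column field and write the lines to an output field instead. It packs words greedily and starts a new line whenever the next word would push the line past the limit:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of KNIME: greedy word wrapping.
final class WordWrap {

    // Splits text into lines of at most maxLen characters without breaking words.
    // A single word longer than maxLen is kept whole on its own line.
    static List<String> wrap(String text, int maxLen) {
        List<String> lines = new ArrayList<>();
        StringBuilder current = new StringBuilder();

        // Any run of whitespace (spaces, tabs, newlines) separates words.
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // happens only for empty or all-whitespace input
            }
            // Adding the word (plus one space) would exceed the limit: flush the line.
            if (current.length() > 0 && current.length() + 1 + word.length() > maxLen) {
                lines.add(current.toString());
                current.setLength(0);
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(word);
        }
        if (current.length() > 0) {
            lines.add(current.toString());
        }
        return lines;
    }
}
```

Calling `WordWrap.wrap(...)` on the example string from the question with a limit of 20 yields five lines: "knime analytical", "platform Split Regex", "is used for", "splitting the string", "into different rows".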

I guess you could do the same in a KNIME loop after splitting the string into words.

I had already implemented my solution with a Java Snippet, but wanted to do it with KNIME nodes instead of Java. I will try to implement the workflow using a loop. Thanks!

Hi @mathi,

You can use this regex in the Regex Split node:
(.{1,19}[^\s]*)?\s?

The only catch is that the pattern must be repeated enough times for even the longest string in your table to match.
To overcome this issue, I have built a workflow which creates the pattern for the Regex Split node automatically:
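
To see what the repeated pattern does outside KNIME, here is a minimal Java sketch. It is not the attached workflow: the class `RepeatedRegexSplit`, its methods, and the repetition count `length / 19 + 1` are assumptions made for illustration, and `matches()` is used on the assumption that the whole string has to match, as described above. Note that `[^\s]*` finishes the word in progress, so a chunk can run past 20 characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: repeats the unit pattern and collects the capture groups.
final class RepeatedRegexSplit {

    // The unit from the post: up to 19 characters, then the rest of the
    // current word, then an optional whitespace separator.
    static final String UNIT = "(.{1,19}[^\\s]*)?\\s?";

    // Concatenates the unit the given number of times (String.repeat needs Java 11+).
    static String buildPattern(int repetitions) {
        return UNIT.repeat(repetitions);
    }

    static List<String> split(String text) {
        // Assumption: for single-line input each unit consumes at least 19
        // characters while that many remain, so this count is enough.
        int repetitions = text.length() / 19 + 1;
        Matcher m = Pattern.compile(buildPattern(repetitions)).matcher(text);

        List<String> parts = new ArrayList<>();
        if (m.matches()) { // the whole string must match
            for (int g = 1; g <= m.groupCount(); g++) {
                if (m.group(g) != null) { // unused trailing groups are null
                    parts.add(m.group(g));
                }
            }
        }
        return parts;
    }
}
```

For the example string from the question, `RepeatedRegexSplit.split(...)` returns four chunks: "knime analytical platform", "Split Regex is used", "for splitting the string", "into different rows". The first and third are longer than 20 characters because the pattern completes the word it lands in.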

23987-1-1.knwf (55.4 KB)
