N-Gram Extractor

@qqilihq, could you please help me configure the N-Gram Extractor to produce the same result as my second NGram Creator? I built this example because the first NGram Creator produces an unexpected result, so I am trying to find a reliable solution using the N-Gram Extractor.
Thank you
Salud Example.knwf (37.3 KB)

Your example string contains non-breaking whitespace at the end (replaced with _ below for clarity), which is why two n-grams are created for this string:

Diagnostic Accuracy  70.0% NPI  1111111111___

Solution: Before creating the n-grams, replace the non-breaking whitespace (\u00A0) with a regular space and trim the string, e.g. using a String Manipulation node:

strip(regexReplace($column2$, "\\u00A0", " "))
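
For anyone who wants to check the same idea outside KNIME, here is a minimal Python sketch of that clean-up. The helper name `clean` and the sample value are assumptions made for illustration: the value is rebuilt from the string shown above with three trailing non-breaking spaces, and the inner spaces are assumed to be ordinary ones. It is not what the node runs internally.

```python
import re

NBSP = "\u00A0"  # non-breaking space


def clean(text: str) -> str:
    """Replace non-breaking spaces with regular spaces, then trim both ends."""
    return re.sub(NBSP, " ", text).strip()


# Hypothetical sample rebuilt from the thread: three trailing non-breaking spaces.
raw = "Diagnostic Accuracy  70.0% NPI  1111111111" + NBSP * 3
cleaned = clean(raw)

# repr() makes the otherwise invisible characters visible as \xa0.
print(repr(raw))
print(repr(cleaned))
```

Note that a trim which only removes ordinary spaces, such as `raw.strip(" ")` in this sketch, leaves the value unchanged, which is why the replacement step comes first.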

Updated workflow on my NodePit Space:

Does this help?

– Philipp

Thank you, Philipp. It works.
If you don’t mind, could you please explain why the first workflow does not return a result for “Diagnostic Accuracy 70.0% NPI 1111111111”?

You mean the one using the green “NGram Creator” node? Sorry, I can’t; this node was not developed by me. Maybe someone from KNIME can help.

That’s OK. Thank you for your help.