N-Gram Extractor

@qqilihq, could you please help me configure the N-Gram Extractor so that it gives the same result as my second NGram Creator? I made this example because the first NGram Creator produces an unexpected result, so I am trying to find a reliable solution using the N-Gram Extractor.
Thank you
Salud Example.knwf (37.3 KB)

Your example string contains non-breaking white space at the end (replaced with _ below for clarity). That is why two n-grams are created for this string:

Diagnostic Accuracy  70.0% NPI  1111111111___

Solution: Before creating the n-grams, explicitly replace the non-breaking white space (\u00A0) with a regular space and trim the string, e.g. using a String Manipulation node:

strip(regexReplace($column2$, "\\u00A0", " "))
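To see the effect of that expression outside KNIME, here is a minimal Python sketch. It is only an illustration, not the node's implementation: the names `NBSP`, `normalize`, `raw` and `cleaned` are made up for this example, and the three trailing characters follow the `___` shown above. Note that Python's own `strip()` already treats U+00A0 as white space, so the sketch passes an explicit ASCII character set to `strip()` to imitate a trim that only knows ASCII white space.

```python
import re

NBSP = "\u00A0"  # non-breaking space, invisible in most table views


def normalize(text: str) -> str:
    """Mirror strip(regexReplace($column2$, "\\u00A0", " ")):
    replace every non-breaking space with a plain space, then trim."""
    return re.sub(NBSP, " ", text).strip()


# Example value with three trailing non-breaking spaces (assumed count).
raw = "Diagnostic Accuracy  70.0% NPI  1111111111" + NBSP * 3

# An ASCII-only trim leaves the trailing non-breaking spaces untouched.
ascii_trimmed = raw.strip(" \t\r\n")

# After the replacement, the trailing characters are ordinary spaces
# and are removed by the trim.
cleaned = normalize(raw)

print(repr(raw))      # repr() makes the hidden characters visible
print(repr(cleaned))
```

Printing with `repr()` is a quick way to check whether a cell value carries such hidden characters before blaming the n-gram node.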

Updated workflow on my NodePit Space:

Does this help?

– Philipp

Thank you, Philipp. It works.
If you don’t mind, could you please explain why the first workflow does not return a result for “Diagnostic Accuracy 70.0% NPI 1111111111”?

You mean the one using the green “NGram Creator” node? Sorry, I can’t; that node was not developed by me. Maybe someone from KNIME can help.

That’s OK. Thank you for your help.
