Text processing nodes in KNIME

Hello. I want to do text processing in KNIME on Korean-language text.

Are the Text Processing nodes in KNIME built specifically for English?

For example, can the nodes in the picture only be used with English text?

I have confirmed that Korean is included among the Redfield NLP models.

Any help is appreciated.

Hello @mychoi

As you mentioned, to work with Korean you need to use the Redfield NLP nodes. Still, many other nodes from the Text Processing extension might be helpful for you:

  • Tag Filter can still be used if you run Spacy POS Tagger first, so you can filter by part of speech;
  • Punctuation Erasure should be language-agnostic, and you can run it just before tag filtering;
  • Stop Word Filter should be replaced by Spacy Stop Word Filter, since it should have a built-in dictionary for Korean. I assume the Stop Word Filter from the Text Processing extension has no Korean stop-word dictionary by default, but you can supply your own dictionary through the second input port;
  • I do not know much about Korean, but Case Converter is probably not useful for it.
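
To make the order of these steps concrete, here is a rough Python sketch of what such a chain does conceptually. It is not KNIME or Redfield code: the tokens, POS tags, stop-word list, and function names are all made up for illustration, and the tags are simply assumed to come from an earlier tagging step. It only shows punctuation removal, then filtering by tag, then filtering with a user-supplied stop-word list.

```python
import unicodedata

# Hypothetical (token, POS tag) pairs standing in for POS tagger output.
TAGGED = [
    ("저는", "PRON"),
    ("한국어", "NOUN"),
    ("텍스트를", "NOUN"),
    ("분석합니다", "VERB"),
    (".", "PUNCT"),
]

# Hypothetical user-supplied stop-word list (the "own dictionary" case).
STOP_WORDS = {"저는"}


def erase_punctuation(tagged):
    """Strip characters in any Unicode punctuation category (P*).

    This relies on Unicode categories only, so it does not depend on the
    language of the text. Tokens that end up empty are dropped.
    """
    result = []
    for word, tag in tagged:
        cleaned = "".join(
            ch for ch in word if not unicodedata.category(ch).startswith("P")
        )
        if cleaned:
            result.append((cleaned, tag))
    return result


def filter_tags(tagged, keep_tags):
    """Keep only tokens whose tag is in keep_tags."""
    return [(word, tag) for word, tag in tagged if tag in keep_tags]


def filter_stop_words(tagged, stop_words):
    """Drop tokens that appear in the supplied stop-word list."""
    return [(word, tag) for word, tag in tagged if word not in stop_words]


def preprocess(tagged, keep_tags, stop_words):
    """Punctuation removal, then tag filtering, then stop-word filtering."""
    step1 = erase_punctuation(tagged)
    step2 = filter_tags(step1, keep_tags)
    step3 = filter_stop_words(step2, stop_words)
    return [word for word, _ in step3]


if __name__ == "__main__":
    print(preprocess(TAGGED, {"NOUN", "VERB", "PRON"}, STOP_WORDS))
    # Hangul has no upper/lower case, so case conversion leaves it unchanged.
    print("한국어".lower() == "한국어")
```

The last line also illustrates the Case Converter point: lowercasing a Hangul string returns the same string.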

It would be easier for me to give more relevant advice if you could share a workflow with some input data and describe in more detail what your task is and what you would like to do with your texts.


