Source code of BoW Creator and Dictionary Tagger

KNIME is a very useful, powerful, and convenient tool for data processing.

But I ran into some problems when processing Chinese text with the Text Processing extension. If I develop a new node that handles Chinese text properly, how can I get the source code of the existing nodes (e.g. BoW Creator and Dictionary Tagger) so that I can make sure my new node works well with the subsequent nodes in Text Processing?

The sources of all KNIME nodes can be installed as separate features. Have a look at the "Sources" category on the update site.

Thank you. But I don't completely understand: where can I find the "update site"? Can you give me a link to it?

Thanks, thor. I found them. They are beautiful~~

Hi nuaaer, I ran into the same problem: KNIME can't handle Chinese well. I found that KNIME uses OpenNLP to process text, which doesn't seem to support Chinese, and it's hard to design a new node for processing Chinese text within the Text Processing plugin. Have you found a better way to deal with Chinese?

OpenNLP is used to tokenize the text and for part-of-speech tagging, as well as named entity recognition. I have never processed Chinese texts, but I guess that as long as you don't want to part-of-speech tag your text, recognize named entities, or stem the text, it should work. Is the Chinese text encoded in UTF-8? What problems do you have when processing Chinese text?
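
One quick way to answer the encoding question outside of KNIME is to check whether the raw bytes of the input decode cleanly as UTF-8. The following is only a minimal Python sketch, not part of KNIME or its Text Processing extension; the helper name `is_valid_utf8` and the sample string are made up for illustration, and GBK is used only as an assumed example of a legacy Chinese encoding that a file might have been saved in.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical sample text, encoded in two different ways.
sample = "中文文本处理"
utf8_bytes = sample.encode("utf-8")
gbk_bytes = sample.encode("gbk")  # assumed legacy encoding, for comparison

print(is_valid_utf8(utf8_bytes))  # True
print(is_valid_utf8(gbk_bytes))   # False
```

For a real document, the bytes would come from reading the file in binary mode. If the check fails, converting the file to UTF-8 before loading it may be worth trying before looking at the tokenizer.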
