Source code of BoW Creator and Dictionary Tagger

nuaaer · March 22, 2012, 3:45pm

KNIME is a very useful, powerful, and convenient tool for data processing.

But I have run into some problems processing Chinese text with the Text Processing nodes. If I develop a new node that handles Chinese text properly, how can I get the source code of the existing nodes (e.g. BoW Creator and Dictionary Tagger), so that I can make sure my new node works well with the subsequent nodes in Text Processing?

thor · March 22, 2012, 4:02pm

The sources of all KNIME nodes can be installed as separate features. Have a look at the "Sources" category on the update site.

nuaaer · March 23, 2012, 10:03am

Thank you. But I don't completely understand: where can I find the "update site"? Can you give me a link to it?

nuaaer · March 24, 2012, 5:10am

Thanks, thor. I found them. They are beautiful~~

susanna8930 · April 8, 2012, 3:44am

Hi nuaaer, I ran into the same problem: KNIME can't handle Chinese well. I found that KNIME uses OpenNLP to process text, which doesn't support Chinese, and it's hard to design a new node to process Chinese text in the Text Processing plugin. So have you found a better way to deal with Chinese?

kilian.thiel · April 10, 2012, 7:45pm

OpenNLP is used to tokenize the text and for part-of-speech tagging, as well as named entity recognition. I have never processed Chinese texts, but I guess that as long as you don't want to part-of-speech tag your text, recognize named entities, or stem the text, it should work. Is the Chinese text encoded in UTF-8? What problems do you have when processing Chinese text?
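
To check the encoding question before looking at the nodes themselves, something like the following rough sketch might help. It is plain Java and not part of KNIME or OpenNLP; the class name `Utf8Check` and the method `isValidUtf8` are made up for illustration. It only tests whether a byte sequence decodes as strict UTF-8, which is a hint and not proof of the original encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a KNIME or OpenNLP class.
public class Utf8Check {

    // Returns true if the bytes form a well-formed UTF-8 sequence.
    public static boolean isValidUtf8(byte[] bytes) {
        // REPORT makes the decoder throw instead of silently
        // substituting replacement characters for bad input.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Two Chinese characters (U+4E2D U+6587) encoded as UTF-8.
        byte[] utf8 = "\u4e2d\u6587".getBytes(StandardCharsets.UTF_8);
        // The same two characters as GBK bytes, written out by hand
        // so the example does not depend on the GBK charset being installed.
        byte[] gbk = {(byte) 0xD6, (byte) 0xD0, (byte) 0xCE, (byte) 0xC4};

        System.out.println(isValidUtf8(utf8));
        System.out.println(isValidUtf8(gbk));
    }
}
```

If the check fails on your input file, the text is probably stored in another encoding (GBK and GB18030 are common for Chinese) and would need converting to UTF-8 before it is read in.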

system · June 2, 2023, 9:50pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.