Topic Extractor in non-English languages

How well does the Topic Extractor node perform in languages other than English?

Thanks

Closely related to my original post: topic extraction relies on lemmatization. As far as I understand, KNIME text processing only offers lemmatization for certain languages. Lemmatization is provided by the Stanford lemmatizer, which in turn depends on the output of the Stanford POS tagger (English, French and German), so I see no way of doing this for Spanish. Are there any alternatives?

Regards

 

Hi Peleitor,

The Topic Extractor node should be language independent, and lemmatization is not necessarily required for it. Simply apply it to a preprocessed set of documents and specify the number of topics you want to extract.
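To make "preprocessed" a bit more concrete, here is a rough Python sketch of lemmatization-free preprocessing for Spanish text: lowercasing, dropping punctuation and digits, removing stop words, and filtering very short tokens. It is only an illustration of the idea, not what the KNIME nodes do internally. The `preprocess` function, the `min_chars` parameter and the tiny stop word list are all made up for this example; a real workflow would use a full Spanish stop word list.

```python
import re

# Hypothetical, deliberately tiny Spanish stop word list (illustration only).
SPANISH_STOP_WORDS = {
    "el", "la", "los", "las", "de", "del", "en", "y",
    "que", "un", "una", "es", "por", "con", "para", "se",
}

# Runs of Unicode letters: word characters minus digits and underscore.
TOKEN_PATTERN = re.compile(r"[^\W\d_]+")


def preprocess(text, min_chars=3):
    """Lowercase, tokenize, and filter a document without lemmatizing it."""
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [
        token
        for token in tokens
        if token not in SPANISH_STOP_WORDS and len(token) >= min_chars
    ]


if __name__ == "__main__":
    documents = [
        "El precio de la energía sube en España.",
        "La selección ganó el partido con un gol en el minuto 90.",
    ]
    for document in documents:
        print(preprocess(document))
```

Inflected forms such as "sube" and "suben" stay separate terms here, which is the trade-off of skipping lemmatization. Topics can still be extracted from such term lists, they may just be somewhat noisier.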

Cheers, Kilian

 
