Text Mining for Dutch Language

Kees_Schippers · July 6, 2016, 2:53pm

Hi Knimers,

I was just wondering if there are possibilities (yet) for mining Dutch texts?

Thanks,

Kees

marco_ghislanzoni · July 7, 2016, 8:59am

Hi Kees,

which use case are you looking at specifically? A number of nodes in the Text Processing group can already work with Dutch (e.g. the Snowball Stemmer node or the Hyphenator node) or are language-agnostic. Other nodes are indeed English-specific. Is there anything you are missing in particular?

Cheers,
Marco.

Kees_Schippers · July 7, 2016, 12:35pm

Hi Marco,

Most important, I think, would be the enrichment possibilities: part-of-speech tagging and named entity recognition. With most other things you can somehow find your way, but these two I think are essential.

Thanks,

Kees

marco_ghislanzoni · July 7, 2016, 1:17pm

Hi Kees,

Unless you can get it done with a generic node, like the Dictionary Tagger, I am afraid that implementing Dutch-specific support for that functionality would require some custom coding (i.e. creating some sort of custom tagger node). Is that a possibility?

Cheers,
Marco.

Kees_Schippers · July 7, 2016, 5:20pm

Hi Marco,

You mean having someone build it for me? That is not an option, I am afraid.

Thanks,

Kees

marco_ghislanzoni · July 8, 2016, 8:29am

Hi Kees,

just an idea, but maybe one of the many good Dutch IT students could work on this as part of his/her Bachelor's or Master's thesis.

Cheers,
Marco.

Taita · July 8, 2016, 3:07pm

The Snowball Stemmer supports the Dutch language. Dutch stopword lists can be found on the internet and added to the node. I think these are the most important elements. I haven't found a Dutch POS tagger or free sentiment dictionaries so far.
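
For anyone who wants to see the idea outside of KNIME, here is a rough Python sketch of the two generic steps mentioned in this thread: filtering with a Dutch stopword list and tagging terms by dictionary lookup. It is not how the KNIME nodes are implemented. The stopword set is only a tiny illustrative sample, and the `ENTITY_DICTIONARY` contents and all function names are made up for the example.

```python
import re

# Tiny illustrative sample; a real Dutch stopword list is much longer.
DUTCH_STOPWORDS = {"de", "het", "een", "en", "van", "in", "is", "op"}

# Hypothetical term -> tag dictionary, in the spirit of a dictionary-based tagger.
ENTITY_DICTIONARY = {
    "amsterdam": "LOCATION",
    "rotterdam": "LOCATION",
    "philips": "ORGANIZATION",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def remove_stopwords(tokens, stopwords=DUTCH_STOPWORDS):
    """Drop every token that appears in the stopword set."""
    return [t for t in tokens if t not in stopwords]


def tag_tokens(tokens, dictionary=ENTITY_DICTIONARY):
    """Attach a tag to each token; terms not in the dictionary get 'UNKNOWN'."""
    return [(t, dictionary.get(t, "UNKNOWN")) for t in tokens]


tokens = remove_stopwords(tokenize("Philips is een bedrijf in Amsterdam"))
tagged = tag_tokens(tokens)
print(tagged)
```

A plain lookup like this only recognises terms that are already in the dictionary, which is why it is no substitute for a trained Dutch POS or named entity tagger.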

Kees_Schippers · July 14, 2016, 7:38am

Thanks for your suggestions.

Kees
