Text Processing finding context

Jowawa · March 9, 2023, 9:00am

Hello everyone,

I have the following problem.
I have an Excel table with sentences, and some of the sentences contain last names. I'm trying to filter out all rows that contain last names, which is already somewhat working.
The problem I currently have is that in some cases it filters out the wrong rows, since some German last names are also ordinary German words. An example is “Stein”, which is a last name but also means “stone” and is used that way in some sentences. Is there a way for the workflow to figure out the context and realize it's a stone and not a last name?

ScottF · March 9, 2023, 3:11pm

Hi @Jowawa -

If you have a list of common German names, you could provide them as a reference to a Named Entity Recognition model, and train that model to identify previously unseen words that might be names. Here’s an example workflow that uses several nodes from the KNIME Textprocessing extension to do that:
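
Separately from the KNIME workflow, the underlying idea of using context can be illustrated in a few lines of Python. This is only a rough sketch under stated assumptions: it is a hand-written rule, not a trained NER model, and the name list, the article list, and the function name `contains_last_name` are all hypothetical. It treats a known last name as a common noun when the word directly before it is a German article or determiner (“der Stein”, “den Stein”), and as a name otherwise.

```python
import re

# Hypothetical reference list; a real list of German last names would be far longer.
LAST_NAMES = {"Stein", "Müller", "Schmidt"}

# Assumed context cue: an article or determiner right before the word
# suggests a common noun ("der Stein" = "the stone"), not a person.
NOUN_CUES = {
    "der", "die", "das", "den", "dem", "des",
    "ein", "eine", "einen", "einem", "einer", "eines",
}


def contains_last_name(sentence: str) -> bool:
    """Return True if the sentence contains a listed last name that is
    not directly preceded by an article or determiner."""
    tokens = re.findall(r"\w+", sentence)
    for i, token in enumerate(tokens):
        if token not in LAST_NAMES:
            continue
        previous = tokens[i - 1].lower() if i > 0 else ""
        if previous in NOUN_CUES:
            # Looks like the ordinary word, so keep scanning.
            continue
        return True
    return False


sentences = [
    "Herr Stein kommt morgen.",
    "Der Stein ist schwer.",
    "Ich habe den Stein geworfen.",
]
# Keep only the rows that the rule does not flag as containing a last name.
kept = [s for s in sentences if not contains_last_name(s)]
```

A single-word rule like this will still misjudge many sentences (for example, a name at the start of a sentence with no title, or plural forms), which is why a model trained on tagged examples is the more robust route.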
