Is there a KNIME node to work with WordNet?

Is there a node in KNIME for extracting a Wikipedia page in order to build a corpus?

Also, is there a node in KNIME that connects to WordNet in order to find synonyms?


There is no dedicated node to do so. However, the combination of the KNIME extensions XML Processing and Palladian (see community contributions) allows for the extraction of data from web pages in general. Palladian provides the HTTP Retriever node, which downloads the content of a web page. The HTML Parser node parses the content and creates an XML cell. With the XML nodes, e.g. XPath, data can then be extracted from that XML in a structured way.

You could download pages from Wikipedia, extract the text fields you want to use, and then use the text processing nodes on these texts.

Cheers, Kilian

Dear Kilian

Thanks very much for the information. I understand that I can download pages from Wikipedia, extract the fields and run text processing on them. After that I want to use WordNet to look up hypernyms and hyponyms of words. I also want to calculate the distance between words. I know that WordNet has an API for C#. Can I use WordNet through KNIME?

Thanks in advance, 

Hi Negaresma,

Do you know if there is a Java WordNet API?
If there is a REST API, you can use the KREST node to access WordNet and send queries. If there is no Java or REST API, you can still extract the content from the web pages. To do that, use the Palladian nodes to download the content and the XML nodes to extract the information you need. You can do this with any website that does not require session information. Attached is an example workflow that sends two words as queries and extracts the part of speech (POS) of each word.

Cheers, Kilian

Hi Kilian,

I have a list of words and I am interested in extracting related terms for those words from Open Dutch WordNet/Cornetto using KNIME.

I had a look at the example workflow that you posted. However, I do not have a URL for the Dutch WordNet database to use in the Java Edit Variable node. I also don’t have a REST API to access WordNet and send queries.

I noticed that WordNet can be accessed using Python or Java. Is there a node that works like the Python/Java libraries and can access WordNet or Cornetto to find related words or synonyms?

Thank you in advance!

Hi @jsrl and welcome to the forum.

Right now we don’t have a node for accessing Wordnet or Cornetto directly. If this is something that would be useful to you, I can create a feature request in our system. (There are a few other folks that have been asking about Wordnet and Sentiwordnet in particular in recent months.)

But for the time being, the best workaround would be the Python or Java libraries you mentioned.
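As a rough sketch of the Python route (for example from a Python scripting node), the lookup could look like the code below. It assumes the NLTK library is installed and that its `wordnet` and `omw-1.4` data packages have been fetched with `nltk.download(...)`. The function names are made up for this example. Note that this goes through the Open Multilingual Wordnet data shipped with NLTK (language code `nld` for Dutch), not through Cornetto, so coverage for your word list is something you would need to check.

```python
def lemma_names(synsets, lang="eng", exclude=None):
    """Return the distinct lemma names of the given synsets, in first-seen order."""
    seen = []
    for synset in synsets:
        for name in synset.lemma_names(lang=lang):
            name = name.replace("_", " ")  # WordNet joins multi-word lemmas with "_"
            if name != exclude and name not in seen:
                seen.append(name)
    return seen


def related_terms(word, lang="eng"):
    """Look up synonyms, hypernyms and hyponyms of `word` via NLTK's WordNet reader."""
    # Imported lazily so that lemma_names() stays usable without NLTK installed.
    from nltk.corpus import wordnet as wn

    synsets = wn.synsets(word, lang=lang)
    hypernyms = [h for s in synsets for h in s.hypernyms()]
    hyponyms = [h for s in synsets for h in s.hyponyms()]
    return {
        "synonyms": lemma_names(synsets, lang, exclude=word),
        "hypernyms": lemma_names(hypernyms, lang),
        "hyponyms": lemma_names(hyponyms, lang),
    }


def best_path_similarity(word_a, word_b, lang="eng"):
    """Highest path similarity over all sense pairs, or None if no pair is connected.

    This is one possible notion of "distance between words"; WordNet offers others.
    """
    from nltk.corpus import wordnet as wn

    scores = [
        a.path_similarity(b)
        for a in wn.synsets(word_a, lang=lang)
        for b in wn.synsets(word_b, lang=lang)
    ]
    scores = [s for s in scores if s is not None]
    return max(scores, default=None)
```

With the data installed, `related_terms("hond", lang="nld")` would be the call for a Dutch word. The result is a plain dictionary of lists, which is easy to turn into a table for the rest of the workflow.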

