Matching Text (Text meaning Similarity)

Hi Knimers!

I trust that all of you are well.

I have a text matching task to do. It is really a similarity task, but the matching needs to be based on the meaning of the entries in the two datasets rather than on their exact wording.

See below:

Dataset 1 - Items - Main Dataset

Dataset 2 - Commodities - Reference Dataset

I need each item in Dataset 1 to be matched to the closest commodity in Dataset 2. For example, if “Cable or wire lug” from Dataset 2 is the closest in meaning to “Wire Earth 70MM stranded” from Dataset 1, then the two must match.

If this is possible in KNIME Analytics Platform, could you kindly help me with a solution?

Any suggestions will be highly appreciated.

Kind regards,
Denzil.

Hi
Have you already tried the Similarity Search node?
https://hub.knime.com/knime/extensions/org.knime.features.distmatrix/latest/org.knime.distmatrix.similaritysearch.SimilaritySearchNodeFactory
br

Hi Daniel.

Thank you for your response.

Yes, I tried Similarity Search. However, it measures the distance between the actual words in the two terms, not the similarity of their meaning.
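
As a rough illustration of that limitation, here is a minimal Python sketch of a word-overlap score (token Jaccard). It is only a stand-in for a word-level distance, not the exact measure the Similarity Search node uses, and "Grounding conductor" is a hypothetical commodity added for contrast.

```python
def token_jaccard(a: str, b: str) -> float:
    """Fraction of distinct lowercase tokens that the two strings share."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty strings are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


item = "Wire Earth 70MM stranded"
commodities = [
    "Cable or wire lug",
    "Grounding conductor",  # hypothetical entry, not from the real dataset
]

# Score every commodity against the item purely on shared words.
scores = {name: token_jaccard(item, name) for name in commodities}
```

A commodity that shares no words with the item scores zero here, however related it may be in meaning, which is why a word-level distance is not enough for this task.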

Thank you.

Hi @denzilsdn
If you want semantic similarity, you need to embed the sentences/words into vectors. The new KNIME AI nodes include embedding connectors (some of them require API keys).
br

Hi,
@Daniel_Weikert has already pointed in exactly the right direction. Apart from the embedding nodes in the new AI extension, we also have the BERT Embedder node from our partner Redfield. It also converts sentences to vectors where semantically similar sentences are closer to each other in vector space.
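
Once every item and commodity has a vector, the matching step could look roughly like the Python sketch below, which picks the commodity with the highest cosine similarity for each item. The three-dimensional vectors are hand-made placeholders rather than real embedding output, "Office chair" is a hypothetical commodity, and the function names are made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_u * norm_v)


def closest_commodity(item_vec, commodity_vecs):
    """Name of the commodity whose vector is most similar to item_vec."""
    return max(
        commodity_vecs,
        key=lambda name: cosine_similarity(item_vec, commodity_vecs[name]),
    )


# Placeholder vectors: in practice these come from an embedding model.
item_vectors = {"Wire Earth 70MM stranded": [0.9, 0.1, 0.2]}
commodity_vectors = {
    "Cable or wire lug": [0.8, 0.2, 0.1],
    "Office chair": [0.1, 0.9, 0.3],  # hypothetical entry for contrast
}

# Match each item to its nearest commodity in vector space.
matches = {
    item: closest_commodity(vec, commodity_vectors)
    for item, vec in item_vectors.items()
}
```

With real embeddings the vectors have hundreds of dimensions, but the nearest-neighbour logic stays the same.
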
Kind regards,
Alexander
