Document Similarity Prediction

Hub · October 12, 2020, 9:45am

This application is a simple example of how to use the Text Processing Components. The first step is to search for PubMed documents. Afterwards, the documents are preprocessed, and the Document Similarity Learner and Predictor are used to find similar documents. The view shows the documents and their associated similarity scores.
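For readers who want a feel for what "finding similar documents" means outside of the workflow, here is a rough Python sketch. It is not how the Document Similarity Learner and Predictor work internally; it assumes a plain bag-of-words model with cosine similarity, and the function names (`tokenize`, `cosine_similarity`, `rank_by_similarity`) are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (score, document) pairs, most similar first."""
    q = Counter(tokenize(query))
    scored = [(cosine_similarity(q, Counter(tokenize(d))), d) for d in documents]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = ["gene expression in rats", "stock market prices"]
    for score, doc in rank_by_similarity("gene expression in mice", docs):
        print(f"{score:.2f}  {doc}")
```

A real pipeline would normally add the preprocessing steps mentioned above (stop word removal, stemming, and so on) before computing the scores.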

This is a companion discussion topic for the original entry at https://kni.me/w/ZMktF-DBCAijwB7l

jhonguzman88 · October 19, 2020, 4:37pm

Hello, I have a question.

What if the text contains numbers?

ScottF · October 19, 2020, 8:42pm

There is a separate Number Filter node to help you deal with numbers in your text, although it looks like it’s not being used in the preprocessing component of this workflow.
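To illustrate the general idea of filtering numbers before computing similarity, here is a small Python sketch. It is only an approximation: the regular expression and the `filter_numbers` helper are assumptions for illustration, and the actual Number Filter node may decide differently which terms count as numbers.

```python
import re

# Assumed definition of a "number" token: an optional sign, digits,
# and optional groups separated by "." or "," (e.g. 3, 2.5, 1,000.50).
NUMBER_TOKEN = re.compile(r"[+-]?\d+(?:[.,]\d+)*")


def filter_numbers(tokens):
    """Drop tokens that consist entirely of a number; keep everything else."""
    return [t for t in tokens if not NUMBER_TOKEN.fullmatch(t)]


if __name__ == "__main__":
    tokens = "The 3 trials used 2.5 mg of drug-X7".split()
    print(filter_numbers(tokens))
```

Note that mixed tokens such as `drug-X7` are kept in this sketch, since only tokens that are purely numeric are removed.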