Document Similarity Prediction

This application is a simple example of how to use the Text Processing Components. The first step is to search for PubMed documents. The documents are then preprocessed, and the Document Similarity Learner and Predictor are used to find similar documents. The view shows the documents and their associated similarity scores.
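To make the idea of a similarity score concrete outside of KNIME, here is a minimal Python sketch. It is not what the Document Similarity Learner and Predictor do internally; it assumes a plain bag-of-words representation with cosine similarity, and the function names and sample documents are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Simplified preprocessing (an assumption): lowercase and keep alphanumeric runs.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(doc_a, doc_b):
    # Represent each document as a term-frequency vector.
    vec_a, vec_b = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))
    # Dot product over the terms the two documents share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical documents, standing in for the PubMed search results.
query = "Gene expression in breast cancer cells"
candidates = [
    "Expression of genes in breast cancer tissue",
    "Weather patterns over the Atlantic ocean",
]
for doc in candidates:
    print(f"{cosine_similarity(query, doc):.3f}  {doc}")
```

A score near 1 means the two documents use largely the same terms, and a score of 0 means they share none.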

Hello, I have a question.

What if the text contains numbers?

It looks like the Number Filter node is not used in the preprocessing component of this workflow, but it is available as a separate node to help you deal with numbers in your text.
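If it helps to see what this kind of filtering amounts to, here is a rough Python sketch. It assumes the filter drops terms made up only of digits, with an optional leading sign and "." or "," separators; the exact rules of the KNIME node may differ, and `filter_numbers` is a made-up name.

```python
import re

# Assumed definition of a "number" term: optional sign, digits,
# and optional "." or "," separated digit groups.
NUMBER = re.compile(r"[+-]?\d+(?:[.,]\d+)*")


def filter_numbers(tokens):
    # Keep only tokens that are not purely numeric.
    return [t for t in tokens if not NUMBER.fullmatch(t)]


tokens = "In 2019 the trial enrolled 1,250 patients with COVID-19".split()
print(filter_numbers(tokens))
# ['In', 'the', 'trial', 'enrolled', 'patients', 'with', 'COVID-19']
```

Mixed terms such as "COVID-19" are kept here because they are not purely numeric.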