I'm looking for solutions or suggestions. I have columns that contain the subject of every incoming and outgoing mail. I would like to check whether an incoming and an outgoing mail match based on their subjects, with the outcome shown as “Matched” or “Not matched”. The difficulty is that the subject text is not always exactly the same.
Here’s my Text Preprocessing component, which lets you perform quick cleanup on text without converting it to the Document type.
The reason I’ve shared a bunch of text cleanup resources is that they can help standardize/normalize the text before you attempt to join it via the Joiner or Rule Engine nodes. This may not catch everything, but simple cleanup steps like converting all text to lowercase, normalizing the number of spaces between words, and removing instances of “RE” or “RE:” will help you correctly identify matches more frequently.
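If it helps to see the idea outside of the nodes, here is a minimal Python sketch of those three cleanup steps followed by an exact comparison. The function names normalize_subject and match_label are hypothetical and are not part of the component above; the sketch also assumes that “RE” only counts as a reply prefix when it sits at the start of the subject and is followed by a colon or a space.

```python
import re

# Assumption: a reply prefix is "re" at the very start of the subject,
# followed by either a colon or whitespace. It may be repeated ("Re: Re: ...").
_REPLY_PREFIX = re.compile(r"^(?:\s*re(?::\s*|\s+))+")


def normalize_subject(subject: str) -> str:
    """Lowercase, strip leading reply prefixes, and collapse whitespace."""
    text = subject.lower()
    text = _REPLY_PREFIX.sub("", text)
    # split() with no arguments drops leading/trailing whitespace and
    # treats any run of whitespace as a single separator.
    return " ".join(text.split())


def match_label(incoming: str, outgoing: str) -> str:
    """Return "Matched" when both subjects are equal after cleanup."""
    if normalize_subject(incoming) == normalize_subject(outgoing):
        return "Matched"
    return "Not matched"


if __name__ == "__main__":
    print(match_label("RE:  Invoice   12345 ", "invoice 12345"))
    print(match_label("Regarding lunch", "lunch"))
```

This only catches subjects that become identical after cleanup. Subjects that differ in wording would still come out as “Not matched” and would need some kind of fuzzy comparison on top.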
I hope one or more of these resources is helpful to you!