I have a situation.
I want to read all the sentences in a PDF and keep only those sentences that contain certain words from a set of words in a different table. Kindly help.
I have been able to make some progress so far.
Attaching my workflow for reference.
Top table - Parsed the PDF as a single row, created a bag of words, and am trying to extract selective rows from column “MOA”.
Bottom table - Parse the PDF, extract sentences, filter the rows of my choice, and then extract only the relevant rows.
Stuck at both places.
Kindly help, and also suggest a better way to search a single PDF for certain words, if there is one.
I didn’t spend a lot of time thinking about your approach, since I don’t have your original data files to play with, but I think I can at least help with the syntax errors in your rules that are causing the Rule-based Row Filter (Dictionary) nodes in both branches to fail.
Here you need to include some additional escaped quotes, like \", in a few places. Here’s how I modified the expression in the String Manipulation node in your bottom branch, for example:
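As a rough illustration of why the escaped quotes matter (a sketch in Python, not the exact expression from the workflow): each generated rule needs literal double quotes around the wildcard pattern, so those quotes have to be escaped inside the string that builds the rule. The rule layout `$MOA$ LIKE "*term*" => TRUE`, the helper name `build_rule`, and the sample terms are assumptions made for illustration only.

```python
def build_rule(term: str, column: str = "MOA") -> str:
    """Build one rule line that matches rows whose column contains `term`.

    Hypothetical helper: the rule layout is an assumption, not taken
    from the original workflow.
    """
    # The inner double quotes are escaped (\") so that they survive as
    # literal characters in the generated rule text.
    return "$" + column + "$ LIKE \"*" + term + "*\" => TRUE"


# Placeholder search terms; replace with the words from your own table.
rules = [build_rule(term) for term in ["inhibitor", "agonist"]]
for rule in rules:
    print(rule)
```

Without the escaping, the quote would end the string literal early and the rule text would come out malformed, which is the kind of syntax error the filter node complains about.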
Thanks for your reply.
This solved the error in the Rule-based Row Filter node. However, after executing, it creates an empty table.
Attaching the source PDF file. I need to find the paragraphs containing the words listed in the Table Creator node in the workflow.
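In case it helps as a cross-check outside the workflow, here is a minimal Python sketch of the same idea: split the text into sentences and keep only those containing any word from a keyword list. It assumes the PDF text has already been extracted into a plain string (PDF parsing itself is not shown), and the function names, sample text, and keywords are all made up for illustration. The sentence splitting is deliberately naive and will mis-split on abbreviations.

```python
import re


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def filter_sentences(sentences, keywords):
    """Keep sentences containing any keyword as a whole word, ignoring case."""
    if not keywords:
        return []
    # re.escape guards against keywords that contain regex metacharacters.
    alternatives = "|".join(re.escape(keyword) for keyword in keywords)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return [sentence for sentence in sentences if pattern.search(sentence)]


# Placeholder text and keywords; substitute the extracted PDF text
# and the words from your own table.
sample_text = (
    "Drug A is a kinase inhibitor. It was approved in 2001. "
    "The Agonist binds the receptor."
)
matches = filter_sentences(split_sentences(sample_text), ["inhibitor", "agonist"])
for sentence in matches:
    print(sentence)
```

Matching here is case-insensitive and on whole words. If a filter unexpectedly returns nothing, differences in letter case or missing wildcards around the search term are among the things worth checking.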