Extracting only sentences containing tagged words (after dictionary-based tagging)

Hi,

I am wondering whether it is possible to extract ONLY the sentences in which the tagged words are found. I tried various settings, but the Sentence Extractor kept extracting all the sentences of the documents.

Best,
Mateen

Hi @mateenraj , would it be possible to share what you have done?

Hi @bruno29a,

Please find attached.

DictionaryBasedTagging_old.knwf (100.0 KB)

Hi @mateenraj , I can’t see the document itself to test this, but can you try with the “Exact match” option in the Dictionary Tagger?

Hi @bruno29a,

Yes, tagging itself is not the problem. After getting the list of matched words, I would like to have the whole sentences that contain them instead of just the matching words.

Hi @mateenraj,

One thing you could do is use the Sentence Extractor to get all sentences. Afterwards, convert the sentences into documents using the Strings To Document node, apply the tags, and count the number of tags per document (that is, per sentence) using Bag Of Words, Tags To String and a GroupBy. Then filter out all the documents that have no tags at all. In the next step, you could extract the strings from the documents again using a Document Data Extractor and, if needed, combine the remaining sentences into one string and/or document.
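
To make the idea concrete outside of KNIME, here is a minimal plain-Python sketch of the same logic: split the text into sentences, count dictionary hits per sentence, and keep only the sentences with at least one hit. It does not use any KNIME API. The function names, the naive sentence splitter, and the sample text are assumptions made up for illustration.

```python
import re


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def count_tags(sentence, dictionary):
    """Count whitespace-separated tokens that occur in a lowercased dictionary."""
    tokens = [tok.strip(".,;:!?()\"'").lower() for tok in sentence.split()]
    return sum(1 for tok in tokens if tok in dictionary)


def sentences_with_tags(text, dictionary):
    """Keep only the sentences that contain at least one dictionary word."""
    dictionary = {word.lower() for word in dictionary}
    return [s for s in split_sentences(text) if count_tags(s, dictionary) > 0]


# Hypothetical sample text and dictionary, for illustration only.
text = (
    "KNIME is an analytics platform. "
    "The weather is nice today. "
    "Text mining works with KNIME!"
)
result = sentences_with_tags(text, {"KNIME", "mining"})
print(result)
```

In the workflow, each function corresponds roughly to one stage: the splitter to the Sentence Extractor, the counting to the tagging plus GroupBy, and the final list comprehension to the row filter that drops untagged sentences.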

I hope this gives you some ideas.

Best regards,
Julian

Hi @julian.bunzel,

Thanks for the suggestion. I tried it, but it does not seem to work for me, and I am not sure where the problem is. Please find the workflow attached.

Best,
Mateen
DictionaryBasedTagging_old.knwf (180.5 KB)

Hi Fellows,

Any suggestions on how to solve this issue?

Best,
Mateen

Hi @mateenraj,

You have to select the same tag type when extracting the terms as the one that was used for tagging. In the modified version of your workflow, I have also changed the tokenizer to the whitespace tokenizer.
DictionaryBasedTagging_old.knwf (39.6 KB)
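
As a rough illustration of why the tag type matters, here is a small plain-Python sketch (not KNIME code). It assumes each tag carries a value and a type, and that the extraction step only keeps terms whose tags have the requested type. The `Tag` tuple, the function name, and the type labels `"NE"` and `"POS"` are placeholders chosen for this example.

```python
from collections import namedtuple

# Hypothetical tag model: a tag has a value and a type.
Tag = namedtuple("Tag", ["value", "type"])

# Terms as they might look after dictionary tagging with type "NE" (assumption).
tagged_terms = [
    ("knime", [Tag("ORGANIZATION", "NE")]),
    ("platform", []),
    ("mining", [Tag("ORGANIZATION", "NE")]),
]


def terms_with_tag_type(terms, tag_type):
    """Return the terms that carry at least one tag of the requested type."""
    return [term for term, tags in terms if any(t.type == tag_type for t in tags)]


same_type = terms_with_tag_type(tagged_terms, "NE")    # type used for tagging
other_type = terms_with_tag_type(tagged_terms, "POS")  # mismatching type
print(same_type)
print(other_type)
```

With the mismatching type the result is empty, which would look downstream as if no sentence contained any tagged word.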

Hi @armingrudd,

I see, thanks.
