Keyword Workflow

Hello, 

Would someone explain to me how to put together a workflow that extracts sentences containing a keyword? I am fairly new to KNIME, so I am not too sure how to create one yet.

Thanks!

Hello Jay,

You can first identify the keywords, e.g. using the Topic Extractor node, or filter terms based on their TF-IDF scores. Then use the Sentence Extractor node to extract all sentences from the documents. The sentences will be extracted as strings. Now use the extracted keywords as a dictionary / filter to filter the sentence strings.
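To make the idea concrete outside of KNIME, here is a rough Python sketch of the same three steps: score terms by TF-IDF, keep the top-scoring ones as keywords, and use them as a dictionary to filter sentence strings. It is only an illustration, not how the KNIME nodes are implemented; the function names (`tokenize`, `top_tfidf_keywords`, `filter_sentences`), the simple regex tokenizer, and the sample texts are all made up for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a crude stand-in for a real tokenizer."""
    return re.findall(r"[a-z']+", text.lower())


def top_tfidf_keywords(documents, k=3):
    """Return the k terms with the highest TF-IDF score in any document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    best = {}
    for tokens in tokenized:
        for term, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log(n_docs / df[term])
            best[term] = max(best.get(term, 0.0), tf * idf)
    # Sort by score (descending), then alphabetically for a stable order.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


def filter_sentences(sentences, keywords):
    """Keep sentences that contain at least one keyword as a whole word."""
    wanted = set(keywords)
    return [s for s in sentences if wanted & set(tokenize(s))]


# Hypothetical sample data, only for illustration.
documents = [
    "The price is too high. The price went up again.",
    "Shipping was fast. The box arrived intact.",
    "The support team was friendly and helpful today.",
]
sentences = [
    "The price is too high.",
    "Shipping was fast.",
    "The price went up again.",
]
keywords = top_tfidf_keywords(documents, k=1)
print(keywords, filter_sentences(sentences, keywords))
```

Terms that occur in every document (such as "the" here) get an IDF of zero, so they never end up as keywords.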

I hope this helps.

For more information about the KNIME Textprocessing extension see:
https://tech.knime.org/knime-text-processing

To get started have a look at the online documentation or the Introduction to the KNIME Textprocessing feature:
https://tech.knime.org/documentation-3
https://tech.knime.org/files/knime_text_processing_introduction_technical_report_120515.pdf

Cheers, Kilian

Hello k_jay,

I assume you successfully imported some data into KNIME. Now you could use the 'Strings To Document' node to create documents. Within documents, the text is tokenized into terms and sentences. To find out which tokens exist in the data, there are the 'Bag Of Words Creator' node and the 'Sentence Extractor' node. The first one creates a 'list' of all terms occurring in the documents. The second one extracts all sentences and returns each of them in a separate row. So in your case, you could use the 'Sentence Extractor' node and then the 'Row Filter' node. With the 'Row Filter' you can keep only those rows in the output table whose sentence contains your specific keyword.
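If it helps to see the logic written out, the 'Sentence Extractor' plus 'Row Filter' combination roughly corresponds to the Python sketch below. This is just an approximation for illustration: the function names (`extract_sentences`, `rows_containing`) are made up, and the sentence splitting is a naive regex, not what KNIME uses internally.

```python
import re


def extract_sentences(text):
    """Split text into sentences at '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def rows_containing(sentences, keyword):
    """Keep sentences containing the keyword as a whole word, ignoring case."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


# Hypothetical input text, only for illustration.
text = "KNIME is a workflow tool. I like open tools! Is KNIME free? Yes."
for sentence in rows_containing(extract_sentences(text), "knime"):
    print(sentence)
```

Matching on word boundaries means that a keyword like "price" will not also match "priceless"; drop the `\b` parts if substring matches are wanted.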

I uploaded a small workflow for you.

Best wishes,

Julian 

Hi Kilian / Julian,

I am new to KNIME and am trying to categorize/group a column of text (see attached CSV). For example, rows 4, 5, 6, and 13 are 'price' related. Row 11 is price related as well, but the exact word is not there. I have tried the Topic Extractor (attached); however, I think my configuration is too simple.

I would greatly appreciate any assistance.

Syed