Sentence Extraction without losing the original comment

mhelmy1950 · August 5, 2013, 11:07am

Hi,

I would like to do sentence extraction without losing the original comment during text preprocessing. I saw a feature that uses the Java Snippet node, but I could not make it work. I would appreciate help on how to do this, or on whether there is another solution. An example would be helpful as well, since I am new to KNIME!

Thanks

Mohamed

kilian.thiel · August 5, 2013, 11:25am

Hi Mohamed,

Could you please explain in a little more detail what you mean by "losing the original comment through the text processing"? I guess you want to end up with a data table that has one column containing the extracted sentences and another column containing the original text the sentences were extracted from. Is that correct?

Cheers, Kilian

mhelmy1950 · August 5, 2013, 11:30am

Hi Kilian,

Yes, this is correct. I have a file that contains review comments. I used the Sentence Extractor to extract sentences, then Extended NER preprocessing, and then the BoW creator, but the output table gives me the sentences and loses the original review comment. Can you please help me generate an output table that has the review comment, the extracted sentences, the terms, etc.?

Thanks

Mohamed

kilian.thiel · August 5, 2013, 1:03pm

Hi Mohamed,

I assume you are using the Strings to Document node to create the documents. After that node, use the Sentence Extractor to extract the sentences. To get the original text the document was created from, use the Document Data Extractor node and specify "Title" and "Document Body Text" as the fields to extract (only available in KNIME Textprocessing 2.8).

Attached you will find an example workflow.

Cheers, Kilian

sentenceextraction.zip

mhelmy1950 · August 9, 2013, 11:03pm

Thanks Kilian, but the only problem here is that the first row in the sentence column contains the full comment. How can I exclude it from the result? See the attached image.

sentence_extraction.jpg

kilian.thiel · August 10, 2013, 2:47am

The title is extracted as a sentence as well. I guess you are using the full comment column as both the title and the text column? One approach would be to create generic titles for each comment, like "Comment 1", "Comment 2". To do this, create another column, e.g. with the Java Snippet node, right beside the full comment text column, and use this column as the title column in the Strings to Document node. Later on you can easily filter out these rows with a Row Filter.
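
To illustrate the idea outside of KNIME, here is a minimal plain-Java sketch. It is not KNIME API code: the class `GenericTitles` and the methods `titleFor` and `dropTitleRows` are hypothetical names made up for this example, and the sample sentences are invented. It only shows why a generic title makes the title row easy to recognize and remove afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: in a workflow these steps are done by nodes, not by this class.
class GenericTitles {

    // Builds a generic title from a zero-based row index, e.g. 0 -> "Comment 1".
    static String titleFor(int rowIndex) {
        return "Comment " + (rowIndex + 1);
    }

    // Keeps only the sentences that differ from the document title.
    // This stands in for the Row Filter step described above.
    static List<String> dropTitleRows(String title, List<String> sentences) {
        List<String> kept = new ArrayList<>();
        for (String sentence : sentences) {
            if (!sentence.equals(title)) {
                kept.add(sentence);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String title = titleFor(0);
        // Assumed extractor output: the title first, then the sentences of the body.
        List<String> extracted = Arrays.asList(
                title, "The room was clean.", "The staff was friendly.");
        System.out.println(dropTitleRows(title, extracted));
    }
}
```

Because the generic title never matches a real sentence from the comment, filtering on it removes only the unwanted title row.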

mhelmy1950 · August 11, 2013, 2:32am

Thanks Kilian. Could you please give me some code, as I am still new to this, for the Java Snippet node that creates a new column that does the following:

if value of column 1 = value of column 2 then T else F

Thanks

kilian.thiel · August 11, 2013, 5:48pm

To create a comment title like "Comment 1", "Comment 2", etc., the code for the Java Snippet node would look like this:

return "Comment " + ($$ROWINDEX$$ + 1);

To create rules like the one you mentioned in your last post, use the Rule Engine node. You do not have to write Java code when using this node.
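
If you would still like to see the comparison written out in Java, a minimal standalone sketch could look like the following. The class `ColumnCompare` and the method `flag` are hypothetical names for this example, and the code does not show how the snippet node reads column values. Treating two missing values (modelled here as `null`) as equal is an assumption, so adjust that if your data needs different handling.

```java
import java.util.Objects;

// Illustration only: a plain-Java version of "if column 1 = column 2 then T else F".
class ColumnCompare {

    // Returns "T" when both values are equal and "F" otherwise.
    // Objects.equals is null-safe, so two nulls compare as equal (an assumption here).
    static String flag(String first, String second) {
        return Objects.equals(first, second) ? "T" : "F";
    }

    public static void main(String[] args) {
        System.out.println(flag("good hotel", "good hotel"));
        System.out.println(flag("good hotel", "bad hotel"));
    }
}
```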
