Text processing - creating a filtering KNIME workflow

Hey,

I am a new user of KNIME and have also just started looking into data mining. I have extracted data from a social media forum and put it in a CSV file.

I want to create a workflow in KNIME where I can filter on a word, for example 'teens', so that all the data related to that keyword is displayed as output. I am struggling with where to start and how to implement this. I have searched online and nothing seems to help. Are there any suggestions you can give me as to where I can start my workflow?

Hi,

Filtering by string can be achieved, for example, with the standard "Row Filter" node: you can apply exact matching, wildcards such as *keyword*, or even regular expressions.
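To make the difference between the three matching modes concrete, here is a minimal sketch in plain Python. It is not how the Row Filter node is implemented; it only mimics the three modes with the standard `fnmatch` and `re` modules. The sample rows and the function names are made up for illustration, and the regular-expression variant here matches anywhere in the text, whereas a given tool may require the pattern to match the whole cell.

```python
import fnmatch
import re

# Hypothetical sample rows standing in for the forum posts in the CSV file.
rows = [
    "teens",
    "Advice for teens starting college",
    "Teenagers and social media",
    "Retirement planning tips",
]

def filter_exact(rows, keyword):
    """Keep rows whose whole text equals the keyword."""
    return [r for r in rows if r == keyword]

def filter_wildcard(rows, pattern):
    """Keep rows matching a wildcard pattern such as *teens* (case-sensitive)."""
    return [r for r in rows if fnmatch.fnmatchcase(r, pattern)]

def filter_regex(rows, pattern):
    """Keep rows where the regular expression matches anywhere in the text."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [r for r in rows if rx.search(r)]

exact = filter_exact(rows, "teens")
wildcard = filter_wildcard(rows, "*teens*")
regex = filter_regex(rows, r"\bteen(s|agers?)?\b")
```

With this sample data, exact matching keeps only the row that is literally "teens", the wildcard also keeps the longer sentence containing "teens", and the case-insensitive regular expression additionally picks up "Teenagers".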

Kind regards,
Philipp

Hi,

Thank you for the reply. Do I load my data with the File Reader node and then use the Row Filter node? Also, which node do I use to view the output data?

I just started with data mining two weeks ago for a university final-year project module, so I have no idea where to start.

Kind regards, Maariya Rashid

Hello Maariya,

I am happy to welcome you to the KNIME community.

There is a Beginner's guide to help you get started. There are also several tutorials on our YouTube channel.

For your specific problem you might want to work through the Text Mining webinar.

Best,
Ferry

Hey,

Thank you for the suggestions, they are really helpful.

Kind regards, Maariya :)

Hi Maariya,

For a somewhat more advanced technique, you could create n-grams and filter them based on the terms you are interested in, to see which words co-occur with them. Alternatively, you can use the Term Co-Occurrence Counter node.
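As a rough illustration of the idea, outside of KNIME, here is a small Python sketch that builds word bigrams, keeps only those containing the term of interest, and separately counts which words appear in the same document as that term. The sample documents, the simple tokenizer, and all function names are assumptions for the example; the KNIME nodes may tokenize and count differently.

```python
import re
from collections import Counter

# Hypothetical sample documents standing in for forum posts.
docs = [
    "teens use social media every day",
    "many teens share photos on social media",
    "parents worry about teens online",
]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def ngrams(tokens, n=2):
    """Return all consecutive n-token tuples from a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngrams_with_term(docs, term, n=2):
    """Count the n-grams that contain the term of interest."""
    counts = Counter()
    for doc in docs:
        for gram in ngrams(tokenize(doc), n):
            if term in gram:
                counts[gram] += 1
    return counts

def cooccurring_words(docs, term):
    """Count, per word, the documents in which it appears together with the term."""
    counts = Counter()
    for doc in docs:
        tokens = set(tokenize(doc))
        if term in tokens:
            counts.update(tokens - {term})
    return counts

bigrams = ngrams_with_term(docs, "teens")
neighbours = cooccurring_words(docs, "teens")
```

Here `bigrams` holds pairs such as ("teens", "use") and ("many", "teens"), while `neighbours` shows that "social" and "media" each share a document with "teens" twice.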

Cheers, Kilian

Hey Kilian,

Thank you, that was really helpful.

Kind regards, Maariya :)
