I’m relatively new to KNIME and I’m facing a few difficulties. I have a table file that I read into my workflow. The file contains tweets from different users in the column “Tweet”, and this column holds the whole text as well as all hashtags. I want to extract these hashtags into a new column and remove all the surrounding text. Is there a simple way to do it? I have tried different nodes but so far without success. Below is an example of my data (I only want to get the marked words):
Thank you in advance
Welcome to the KNIME Community!
Usually (in simple cases), to get a substring from a string you can use the substr() function in the String Manipulation node. However, that won’t work in your case. So I’m sharing with you a workflow from the KNIME Hub that analyzes Twitter data. Take a look and hopefully you’ll find a way to extract the hashtags. If not, or if you have any questions, feel free to come back to this topic and I’m sure someone will give you a hand.
Thank you for your reply. I tried it in my workflow, but unfortunately it couldn’t extract all the hashtags from my tweets. Is there any other way to do it?
Thank you for your help.
You can do it with the “Regex Extractor” node (Palladian extension), something like this:
Regex extract.knwf (13.3 KB)
I chose the Rows option for the output; you can also choose list, columns, …
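For anyone who can’t open the attached workflow, here is a minimal Python sketch of the same regex idea. The pattern `#\w+` and the sample tweets are my own assumptions for illustration; the pattern actually used in the attached workflow may differ. It treats a hashtag as a `#` followed by one or more letters, digits, or underscores, which should also be a reasonable starting point for the pattern field of a regex node.

```python
import re

# Assumed pattern (not taken from the attached workflow):
# '#' followed by one or more word characters (letters, digits, underscore).
HASHTAG_PATTERN = re.compile(r"#\w+")


def extract_hashtags(tweet: str) -> list[str]:
    """Return every hashtag found in the tweet, in order of appearance."""
    return HASHTAG_PATTERN.findall(tweet)


# Made-up sample data standing in for the "Tweet" column.
tweets = [
    "Loving the new release #KNIME #datascience",
    "No tags here",
    "Text mining is fun! #NLP",
]

for tweet in tweets:
    # One list of hashtags per tweet, similar to a "list" style output.
    print(extract_hashtags(tweet))
```

Note that this simple pattern also matches a `#` in the middle of a word (for example `a#b` gives `#b`), so it may need tightening depending on your data.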
You did extract some but not all? There is a Row Filter inside the Extract Hashtags metanode that keeps only the top 100 hashtags by count. Maybe that’s the reason?
Another option is regex, as @andrejz demonstrated, as long as the pattern is written well enough.
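To show why a top-N filter by count can make it look like hashtags are missing, here is a small Python sketch. It is only an illustration of the idea, not the metanode’s actual implementation; the function name `top_n_hashtags` and the sample data are hypothetical, and N is 2 here instead of 100 to keep it short.

```python
from collections import Counter


def top_n_hashtags(hashtags, n):
    """Keep only the n most frequent hashtags (hypothetical helper)."""
    counts = Counter(hashtags)
    return {tag for tag, _ in counts.most_common(n)}


# Made-up hashtags collected from all tweets.
all_tags = ["#a", "#a", "#a", "#b", "#b", "#c"]

kept = top_n_hashtags(all_tags, 2)
dropped = set(all_tags) - kept

print("kept:", sorted(kept))
print("dropped:", sorted(dropped))
```

Any hashtag outside the top N is dropped even though it was extracted correctly, so removing or loosening that filter should bring the rare ones back.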
Thank you, it worked quite well in that case.