Twitter Hashtag Parsing

Hello Community,

I’m attempting to parse Twitter hashtags out of tweets, ideally into separate words. For example, I would like to parse a comment such as “comment is comment and comments and more comments! #WeTalk” into “# We Talk”. I can isolate the hashtag using Wildcard Tagger, and I can separate the # using Regex Split, but I cannot seem to split the terms or match the hashtags with the original tweet. That is, I am having trouble obtaining a final result like the following:

[comment is comment and comments and more comments!] [We] [Talk]

Any assistance would be much appreciated.

Hi @fdickins,

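If there is only one hashtag per tweet and every hashtag is written in CamelCase, then you can use Regex Split with the pattern ([^#]+)#([A-Z][a-z]+)([A-Z][a-z]+). Note that this pattern captures exactly two words, so a hashtag with more words needs one additional ([A-Z][a-z]+) group per word.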
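For anyone who wants to try the idea outside the workflow, here is a minimal sketch in Python using the standard re module. It is not a KNIME node configuration, and the function name split_tweet is made up for illustration. It keeps the same assumptions as above (one hashtag per tweet, CamelCase words), but it finds the words with a second regex, so the hashtag may contain any number of words. Parts of a hashtag that do not look like a capital letter followed by lowercase letters (digits, all-caps abbreviations) are dropped by this sketch.

```python
import re


def split_tweet(tweet):
    """Return the tweet text followed by the words of its first CamelCase hashtag.

    Hypothetical helper: assumes at most one hashtag per tweet.
    """
    # Locate the first hashtag: '#' followed by word characters.
    match = re.search(r"#(\w+)", tweet)
    if match is None:
        # No hashtag found, so return only the cleaned-up text.
        return [tweet.strip()]

    # Everything before the hashtag is treated as the tweet text.
    text = tweet[:match.start()].strip()

    # Split the hashtag body into CamelCase words (capital + lowercase letters).
    words = re.findall(r"[A-Z][a-z]+", match.group(1))
    return [text] + words


print(split_tweet("comment is comment and comments and more comments! #WeTalk"))
# ['comment is comment and comments and more comments!', 'We', 'Talk']
```

The first list element is the original comment and the remaining elements are the hashtag words, which corresponds to the [comment] [We] [Talk] layout asked for in the question.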

Best regards
Andreas
