Extracting words from tweets

helpmeplease · November 16, 2022, 12:24pm

Hi,

I’m working on a project where I have a column named “Twitter Text”.

Besides the text of the tweet, each cell also contains the URL of that tweet. How would I extract the URL from every row?

The logic should be something like this: extract only the word that contains “https://twitter.com*” (where * stands for whatever comes after) and append it as a new column.
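To make the intended logic concrete, here is a minimal sketch in plain Python (not a KNIME workflow). It splits each text on whitespace and keeps the first word containing the prefix. The function name `extract_tweet_url`, the output column name "Tweet URL", and the sample rows are made up for illustration; only the "Twitter Text" column name comes from the post.

```python
def extract_tweet_url(text, prefix="https://twitter.com"):
    """Return the first whitespace-separated word containing `prefix`, or None."""
    for word in text.split():
        if prefix in word:
            return word
    return None


# Hypothetical sample data; real rows would come from the "Twitter Text" column.
rows = [
    {"Twitter Text": "Great news today https://twitter.com/example_user/status/123"},
    {"Twitter Text": "No link in this one"},
]

# Append the extracted URL as a new column (None when no URL is present).
for row in rows:
    row["Tweet URL"] = extract_tweet_url(row["Twitter Text"])
```

Rows without a matching word get `None` in the new column, so they can be filtered out or handled separately later.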

Thank you for the help!

ArjenEX · November 16, 2022, 12:40pm

Hi @helpmeplease

There are several ways to do this. To get the most effective solution, it helps to provide some example inputs and the expected output.

A quick way is the Regex Extractor node, which comes with a template for extracting URLs. It’s available on NodePit (Regex Extractor).

Connect it to your data, select URLs from the template menu, and click Use. Set the output to Columns.

Example output for a string that includes a URL: [screenshot]
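For readers who want to see the regex idea outside KNIME, here is a rough Python sketch. The pattern below is an assumption written for this thread; it is not necessarily the pattern the node's URL template uses. It matches only links starting with "https://twitter.com/", and the names `TWEET_URL` and `find_tweet_urls` are hypothetical.

```python
import re

# Assumed pattern: the Twitter prefix followed by any run of non-whitespace characters.
TWEET_URL = re.compile(r"https://twitter\.com/\S+")


def find_tweet_urls(text):
    """Return all tweet URLs found in `text`, in order of appearance."""
    return TWEET_URL.findall(text)


# Hypothetical sample string containing one tweet URL and one unrelated URL.
sample = "Check this out https://twitter.com/example_user/status/123 and https://example.org/page"
urls = find_tweet_urls(sample)
```

Because `\S+` stops only at whitespace, punctuation glued to the end of a link (such as a trailing comma) would be captured too, so the pattern may need tightening for real data.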

helpmeplease · November 16, 2022, 3:45pm

Thank you very much! Will try that!
