XPath node to extract all user posts on a Facebook page

f.villarroelordenes · April 29, 2014, 5:03pm

Hello!

I am starting a research project about online conversations in social media, and I would like to use KNIME to extract and process information from a social media platform. I am familiar with the text-processing part of KNIME, but now I would like to scrape the data I need using the nodes from Palladian.

I am trying to parse a Facebook page and extract two columns: the first with the original posts of the Facebook page or of other users/friends, and the second with the company's or users'/friends' text replies to the original comment. I have checked the online manual/white paper, which explains how to use the XPath node of Palladian (http://www.knime.org/files/knime_web_knowledge_extraction.pdf), but I don't know which XPath queries, names, or prefixes I could use to parse the Facebook page and obtain the content that I want.

It would be great if somebody could give me some advice on how to write an XPath query. A similar example with a Twitter page would also help.

Thanks!

Francisco

qqilihq · April 30, 2014, 8:58am

Hi Francisco,

I'm not sure whether this approach will work well for Facebook, as the page relies heavily on JavaScript, which pulls in most of the information after the initial page load. However, I haven't tried it.

The general approach for using XPath is to open the page in question in your Web browser and use DOM inspection tools (WebKit-based browsers such as Chrome or Safari already include them) to find the XPath expression, which you can then insert into the XPath node.
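
To illustrate the idea outside of KNIME, here is a minimal Python sketch of how XPath expressions found this way could be turned into the two columns asked about (original post, reply). Everything in it is an assumption for illustration: the markup and the class names `post`, `message`, and `reply` are made up and do not reflect Facebook's real page structure, and the helper `extract_rows` is hypothetical. Note also that Python's `xml.etree.ElementTree` only supports a subset of XPath and needs well-formed markup, so this is not equivalent to the Palladian XPath node.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The class names are invented for this
# sketch; a real Facebook page is far more complex and loaded via JavaScript.
SAMPLE = """
<html>
  <body>
    <div class="post">
      <p class="message">Is the store open on Sunday?</p>
      <div class="reply"><p>Yes, from 10 to 4.</p></div>
      <div class="reply"><p>Thanks!</p></div>
    </div>
    <div class="post">
      <p class="message">Great service yesterday.</p>
    </div>
  </body>
</html>
"""


def extract_rows(markup):
    """Return (original post, reply) pairs; posts without replies get ''."""
    root = ET.fromstring(markup.strip())
    rows = []
    # XPath: every <div class="post"> anywhere below the root element.
    for post in root.findall(".//div[@class='post']"):
        # XPath relative to the post: its direct <p class="message"> child.
        message = post.findtext("p[@class='message']", default="").strip()
        # XPath relative to the post: its direct <div class="reply"> children.
        replies = [
            reply.findtext("p", default="").strip()
            for reply in post.findall("div[@class='reply']")
        ]
        # Emit one row per reply, or a single row with an empty reply.
        for reply_text in replies or [""]:
            rows.append((message, reply_text))
    return rows


rows = extract_rows(SAMPLE)
for row in rows:
    print(row)
```

The same pattern applies in the XPath node: one expression selects the repeating container (here the post), and further expressions relative to it pick out the individual columns.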

Depending on your actual use case, maybe using the WebSearcher node might simplify things. It's able to search on Twitter and Facebook (and many more).

Best,
Philipp

f.villarroelordenes · May 3, 2014, 3:14pm

Hi Philipp, thanks a lot. I will try what you suggest!

Best,

Francisco

marcus · September 29, 2015, 5:44pm

Hi Philipp,

The WebSearcher node doesn't display Facebook as a search engine option.
Is Social Mention the only option available?

Thank you Marcus

marcus@estanislao.com.br

qqilihq · October 1, 2015, 10:07pm

Hi Marcus,

Apologies, I overlooked your post. Facebook unfortunately removed search from their Graph API; that's why it's no longer available in the WebSearcher node. Social Mention would be an alternative, though I haven't used it for a while and I cannot tell you how "complete" the results are.

Another, more recent option would be to build a workflow using our new Selenium nodes, perform a search via a simulated browser, and extract the desired results.

Hope that helps,

Philipp

marcus · October 1, 2015, 10:36pm

Thank you very much, Philipp, I will try.

Best,

Marcus
