TripAdvisor Web scraping for user reviews of Attractions

Hi!
I'm brand new to KNIME.
I'm working on a study of the perceived image of a destination. I need to build a workflow that handles several URLs, all on the TripAdvisor domain. Any ideas?
Thanks in advance.
Mar

Hi,
I assume TripAdvisor has an API you can call to get the data. You will probably need to sign up for a key first. Then you can fetch the data with a GET Request node.
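
For anyone who wants to see the key-plus-GET idea outside KNIME, here is a minimal Python sketch. It only builds the request and does not send it. The base URL, the path layout, and the `key` and `language` parameter names are all hypothetical placeholders, not TripAdvisor's actual API, so check the provider's documentation for the real ones.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_reviews_request(base_url, location_id, api_key):
    """Build (but do not send) a GET request for the reviews of one location.

    The path and the query parameter names are assumptions for illustration.
    """
    query = urlencode({"key": api_key, "language": "en"})
    url = f"{base_url}/location/{location_id}/reviews?{query}"
    return Request(url, headers={"Accept": "application/json"}, method="GET")


# Hypothetical endpoint and key; replace both with real values.
request = build_reviews_request("https://api.example.com/v1", "12345", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen` and decoding the JSON body, which is roughly what the GET Request node does for each input row.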
Best regards


Hi @Daniel_Weikert !

I found several example workflows for Twitter or Google, but none for TripAdvisor.
I'm using the workflow from this conversation:
Datascraping TripAdvisor - #7 by qqilihq
It is pretty much what I need, but it is still not working. I hope @qqilihq can help :wink:

Feel free to share what you have and where you’re having trouble.


Thanks @qqilihq!
That is the workflow I’m using, but it gets stuck at the Row Filter node (surely because of my lack of expertise).
TripAdvisor_Review_Scraping .knar.knwf (56.3 KB)
The idea is to extract information from each review: user name, gender (shown in a pop-up window with more information about the user), country, rating, date, and the review text.

Hi Thalassa,

You’ll need to get the CSS or XPath selectors right – currently the workflow selects one huge div which needs to be narrowed down.

I suggest you look at the DOM structure shown in the Find Elements node’s preview, and/or use the “Select” button to select an element directly in the browser window. Play around with this for a while to develop an understanding of the page structure. The node gives you visual feedback when you edit the selector in the settings (directly in the preview and in the browser window).

.rev_wrap seems like a good candidate for a CSS selector to get the individual review elements.

From there on, try to refine your selectors (the ones in that wrapped, grey node “Extract Review Details”), so that you get the appropriate content.
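
To illustrate the two-stage idea (first select each review wrapper, then pull fields out of it), here is a rough Python sketch using only the standard library. The sample markup and the `username` and `text` class names are made up for the example; only `.rev_wrap` comes from the page discussed here, and the real markup may differ or change.

```python
from html.parser import HTMLParser

# Hypothetical sample markup; the real page structure differs and changes often.
SAMPLE_HTML = """
<div id="REVIEWS">
  <div class="rev_wrap"><span class="username">ana</span><p class="text">Great views.</p></div>
  <div class="rev_wrap"><span class="username">ben</span><p class="text">Too crowded.</p></div>
</div>
"""


class ReviewParser(HTMLParser):
    """Collect one dict per element that carries the class 'rev_wrap'.

    Simplification: assumes every tag inside a review has a closing tag
    (no <br> or <img>), which holds for the sample above.
    """

    def __init__(self):
        super().__init__()
        self.reviews = []     # finished reviews
        self._current = None  # review being filled, or None outside a review
        self._depth = 0       # nesting depth inside the current review
        self._field = None    # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._current is None:
            # Stage 1: narrow down to a single review wrapper.
            if "rev_wrap" in classes:
                self._current = {}
                self._depth = 1
            return
        # Stage 2: inside a wrapper, look for the detail fields.
        self._depth += 1
        for name in ("username", "text"):
            if name in classes:
                self._field = name

    def handle_endtag(self, tag):
        if self._current is None:
            return
        self._depth -= 1
        self._field = None
        if self._depth == 0:  # the wrapper itself was closed
            self.reviews.append(self._current)
            self._current = None

    def handle_data(self, data):
        if self._current is not None and self._field:
            previous = self._current.get(self._field, "")
            self._current[self._field] = previous + data.strip()


def extract_reviews(html):
    parser = ReviewParser()
    parser.feed(html)
    return parser.reviews


reviews = extract_reviews(SAMPLE_HTML)
print(reviews)
```

In the workflow, the first Find Elements node plays the role of stage 1, and the selectors inside “Extract Review Details” play the role of stage 2, applied relative to each review element.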

For the popup windows you’ll probably need to build some custom logic which performs a click, extracts the desired content from the window, closes it, and moves on to the next review.
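
The control flow of that loop could look roughly like the Python sketch below. The three callbacks and the `FakeBrowser` class are hypothetical stand-ins so the example runs without a real page; in the workflow they would correspond to a click, an extract, and a close step inside a loop.

```python
def collect_popup_details(reviews, open_popup, read_popup, close_popup):
    """For each review: open its popup, read it, always close it, then move on."""
    details = []
    for review in reviews:
        open_popup(review)
        try:
            details.append(read_popup(review))
        finally:
            # Close even if reading failed, so the next click is not blocked.
            close_popup(review)
    return details


class FakeBrowser:
    """Stand-in for a real browser session (assumption for illustration)."""

    def __init__(self):
        self.open_count = 0

    def open(self, review):
        self.open_count += 1

    def read(self, review):
        return {"user": review, "country": "unknown"}

    def close(self, review):
        self.open_count -= 1


browser = FakeBrowser()
details = collect_popup_details(["ana", "ben"], browser.open, browser.read, browser.close)
print(details)
```

The `try`/`finally` is the important part: a popup left open after a failed extraction tends to cover the next element and make the following click fail.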

Fingers crossed!

Best,
Philipp


Thanks @qqilihq! You are a lifesaver!
It’s almost working, up to the Click node!
TripAdvisor_Review_Scraping.knar.knwf (55.4 KB)
I will leave out the pop-up extraction, as it would take more time than I have.
Work in progress!
Thanks Philipp!

Mar, Thalassa
