Datascraping TripAdvisor

Dear Team,

Is anyone familiar with how to develop a workflow to scrape reviews and ratings for hotel properties on TripAdvisor?

Hi,

Here’s one.

– Philipp

Great, can you please share a sample workflow that works?

No, because I don’t have one which I could share.

I suggest having a look at the following thread, which explains the fundamental concepts of scraping data from Yelp. At the end of the thread, there is a working example workflow, which you can adapt to TripAdvisor:

In case of specific issues, feel free to ask here.

– Philipp

Thanks.

Hi Philipp,

I worked through the Yelp example without problems, but when I tried adjusting the Find Elements node in the example workflow to fit the TripAdvisor web page, I got empty tables. This is the web page I am trying to scrape for reviews: https://www.tripadvisor.com/Hotel_Review-g616286-d151284-Reviews-The_Crane_Resort-Saint_Philip_Parish_Barbados.html. I hope you can help. Thanks.

Best,
Ryan.

Hi Ryan,

Can you show how you configured the Find Elements node? Probably, it’s just the ‘query’ settings which need to be adapted.

In general, I’d suggest having a look at this awesome tutorial by @armingrudd, which gives a good starting point for the Selenium Nodes:

https://blog.statinfer.com/rule-the-web-with-selenium-nodes-in-knime/

Best,
Philipp

Hi Philipp,

Thanks for the reference. I have tried changing the ‘query’ settings for the Find Elements node, to no avail. I tried entering span, div.hotels, q.hotels, and div.review as the CSS selector, and I also tried an XPath obtained using the method described in your reference. In some instances the workflow ran and the node produced output, but the Quit Webdriver node simply produced an empty table. Please advise.

Best,
Ryan.

I’d suggest posting your workflow and pointing out where exactly you’re having trouble.

– Philipp

Hi Philipp,

I identified where I was going wrong: I didn’t realize I had to make changes in the Extract Review Details node as well. I am now making progress and will post here if I get stuck again. I really appreciate your kind help and patience.

Best,
Ryan.
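
For anyone who runs into the same empty tables: the fix described above comes down to two queries that both have to match the page, one that finds each review container and one that pulls the details out of each container. The sketch below illustrates that idea in plain Python. It is not how the Selenium Nodes are implemented, and it makes several assumptions: the markup and class names (`review`, `rating`, `text`) are invented for illustration and are not TripAdvisor's real ones, `extract_reviews` is a hypothetical helper, and the queries use the limited XPath subset of Python's `xml.etree.ElementTree` rather than CSS selectors.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup. The class names on the real page are
# different and must be looked up with the browser's developer tools.
SAMPLE_MARKUP = """
<div id="reviews">
  <div class="review">
    <span class="rating">5</span>
    <p class="text">Great beach and friendly staff.</p>
  </div>
  <div class="review">
    <span class="rating">3</span>
    <p class="text">Nice rooms, slow service.</p>
  </div>
</div>
"""


def extract_reviews(markup, container_query, rating_query, text_query):
    """Step 1: find the review containers. Step 2: read details from each."""
    root = ET.fromstring(markup)
    rows = []
    for container in root.findall(container_query):
        rating = container.find(rating_query)
        text = container.find(text_query)
        rows.append({
            # A detail query that matches nothing yields None, not an error.
            "rating": rating.text if rating is not None else None,
            "text": text.text if text is not None else None,
        })
    return rows


# Both queries adapted to the markup: every row is filled.
matched = extract_reviews(
    SAMPLE_MARKUP,
    ".//div[@class='review']",
    "./span[@class='rating']",
    "./p[@class='text']",
)

# Container query adapted, detail queries left over from another site:
# rows exist, but every field is empty.
stale_details = extract_reviews(
    SAMPLE_MARKUP,
    ".//div[@class='review']",
    "./span[@class='stars']",
    "./p[@class='body']",
)

# Container query that matches nothing: no rows at all.
no_containers = extract_reviews(
    SAMPLE_MARKUP,
    ".//div[@class='hotels']",
    "./span[@class='rating']",
    "./p[@class='text']",
)

print(matched)
print(stale_details)
print(no_containers)
```

The second and third calls show the two ways the result can come out empty without any error being raised, which is why it helps to check the output of each step separately instead of only the final table.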
