Table Extractor "loses" WebDriver - loopable?

Dear KNIMErs,

I am trying to download some web tables and thought the Table Extractor from the Selenium Nodes would be just perfect for that.

The problem is that my table is split over two pages (80 entries), because a page can show at most 50 entries. When I click the “next page” button (pagination), it shows the next 50 (or 30 in my case).

I can extract the maximum page number and wanted to loop over the pages, but that seems difficult, as the Table Extractor node does not provide a WebDriver at its output.

So basically the flow I had in mind

Loop Start -> Table Extractor -> check current iteration vs number of pages -> next page

does not really work, as I “lose” the WebDriver, which is required in each iteration.

I guess I am overlooking something here; maybe someone can push me in the right direction?

Thank you in advance!!!

Phil

Hi Phil,

thanks for the feedback! The Table Extractor should probably have a fourth output port, which would just pass the input data through as is. You can, however, just take the WebDriver column from the node before the Table Extractor and continue working with that.

I would probably add a Y branch after the loop start, with (1) going into the Table Extractor and (2) going into the “continue iteration?” check. Then recombine the two branches using e.g. flow variables or the Synchronize node.
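
For illustration only, the control flow of that Y branch can be sketched in plain Python. This is not KNIME or Selenium code: `FakeDriver`, `extract_table`, `has_next_page`, `click_next` and `extract_all_pages` are hypothetical names made up for this sketch. The point is simply that the extraction step returns rows only, while the same driver reference taken before it is reused for the "continue?" check and the click.

```python
from dataclasses import dataclass


@dataclass
class FakeDriver:
    """Hypothetical stand-in for a WebDriver showing a paginated table."""
    pages: list       # one sequence of rows per page
    current: int = 0  # index of the page currently displayed

    def extract_table(self):
        # Like the Table Extractor: returns rows only, not the driver.
        return list(self.pages[self.current])

    def has_next_page(self):
        return self.current < len(self.pages) - 1

    def click_next(self):
        self.current += 1


def extract_all_pages(driver):
    """Collect the rows of every page, reusing one driver reference."""
    rows = []
    while True:
        # Branch (1): extract the table of the current page.
        rows.extend(driver.extract_table())
        # Branch (2): the driver taken *before* the extraction step
        # decides whether to continue and performs the click.
        if not driver.has_next_page():
            break
        driver.click_next()
    return rows


# Assumed example data: 80 entries, at most 50 per page.
driver = FakeDriver(pages=[range(50), range(50, 80)])
all_rows = extract_all_pages(driver)
```

In the workflow, the two branches would correspond to the WebDriver column being routed both into the Table Extractor and past it to the pagination check.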

Does this help?

–Philipp
