Selenium Nodes in Knime

This looks like a never-ending issue… My loop worked before I added an error handling node, and the workflow was breaking for the right reason. Now that I use Catch Errors (Data Ports), I get the following error: “Encountered loop-end without corresponding head!”. I am not sure why it even goes to the loop end after the first row in the loop.
I am so close, I need it sorted :slight_smile:

When using the error handling construct in a loop, connect the second input port of the Catch Errors node to a table that has exactly the same structure as the main flow input, or check the “Allow changing table specification” option. In your case, the former is preferred.

Here are two examples in a workflow:
Selenium_Error_Handling.knwf (131.4 KB)

In the top flow, I used the return missing value option.
In the bottom flow, I used the error handling construct. If you check the Table Creator (Node 14) you can see that I have provided the same table structure as it is in the output of the Column Filter (Node 23).
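To make the idea behind the bottom flow concrete outside KNIME, here is a minimal Python sketch of the same pattern: each loop iteration either produces a normal row or, on error, a fallback row with exactly the same columns. The column names, the sample numbers, and `fake_lookup` are made up for illustration and are not taken from the workflow.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical column names; the real workflow's columns are not shown in the thread.
COLUMNS = ["business_number", "status"]

Row = Dict[str, Optional[str]]


def fallback_row(business_number: str) -> Row:
    """Row used when a step fails; it has the same columns as a normal result."""
    return {"business_number": business_number, "status": None}


def run_loop(numbers: List[str], lookup: Callable[[str], str]) -> List[Row]:
    """Process every input; a failure yields the fallback row instead of stopping."""
    rows: List[Row] = []
    for number in numbers:
        try:
            row: Row = {"business_number": number, "status": lookup(number)}
        except Exception:
            # Catch branch: substitute a row with an identical structure
            # so that the collected table keeps one consistent schema.
            row = fallback_row(number)
        assert list(row) == COLUMNS
        rows.append(row)
    return rows


def fake_lookup(number: str) -> str:
    """Stand-in for the Selenium steps; fails for non-numeric input."""
    if not number.isdigit():
        raise ValueError(f"invalid business number: {number}")
    return "valid"


# Made-up sample inputs: the second one triggers the error branch.
results = run_loop(["123456789", "ABC"], fake_lookup)
print(results)
```

The point of the sketch is only the shape of the data: because both branches return the same columns, the step that collects the rows never sees a changed table structure.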

:blush:


I really appreciate your help, Armin! I guess it is becoming a bit more complex for me to follow. I am attaching my workflow.
The Find Elements node next to Node 33 is the one that fails on the Jim Bim record, because the business number is incorrect.
What I am failing to do is: 1. catch the errors, 2. on error, report this business number as an error, and 3. redirect the flow to do something else (in this case, click ‘return to welcome page’ and go through the terms and agreement again using the next account in line). This task seems to be over my head. I am hoping you can help me achieve this or something close to it. Thank you so so much!
Commodity Tax Validator for Forum.knwf (64.5 KB)

I modified your workflow:
Commodity Tax Validator for Forum.knwf (72.2 KB)

:blush:


Thank you for modifying it! It is a tad different though: you send every result through ‘Return to welcome page’. This is not the intent. A successful result needs to invoke New Search, while an unsuccessful one has no other choice but to return to the welcome page. Can this be achieved?

Use a Navigate node and start the loop at that node, or use an If Switch node.

:blush:


I must be dumb. At the Missing Value node I added the Row to Variable port, then Rules Engine for variables, then connected it to the If Switch, and up to this point I get the correct direction of the port. At this point I am trying to connect Find Elements, and the node does not want to connect; it throws an error (screenshot: 2019-06-21_17h08_56).
Why?

Here you are:
Commodity Tax Validator for Forum.knwf (82.5 KB)

:blush:


Pure awesomeness! Thank you very much!!!

Further question: in my original workflow I had a Quit Driver node after the loop, and the browser would close when the loop was done. In the updated workflow, whether I add Quit Driver or not, the web browser does not close. Is it because we used a pooled driver node?


Yes and no! You cannot quit a WebDriver that was started by a Get Pooled WebDriver node, but here we had also removed the WebDriver from the flow before reaching the loop end (to export data to csv).

Here is the new version which quits the browser as well.

Commodity Tax Validator for Forum.knwf (88.7 KB)

:blush:


Hi Armin,
I have been testing it on the data and ran into an issue. Both rows below are intentional errors and both go through the same path, but only the John Smith one fails, and I am not sure why. Can you help?

922897R Jim Bim 877789080 2019/06/24
922897R John Smith R121396592 2019/06/24

This is because the second example produces 2 error messages, so 2 rows are created afterwards and the Click node stops with an error.

Solution:
Check the “Extract first match only” option in the configuration window of the Find Elements node (node 88 in the workflow I provided in my previous reply).
I have updated the last version of the workflow in my previous reply.
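As a rough illustration of why one input row turns into two, here is a small Python sketch that contrasts "return every match" with "return the first match only". The page content and the helper names `find_all` and `find_first` are made up for the example; this is not the Selenium Nodes implementation.

```python
from typing import Dict, List, Optional

Element = Dict[str, str]

# Made-up page content: one failed lookup that shows two error messages.
PAGE: List[Element] = [
    {"css_class": "error", "text": "Invalid business number"},
    {"css_class": "error", "text": "Name does not match"},
    {"css_class": "button", "text": "Return to welcome page"},
]


def find_all(page: List[Element], css_class: str) -> List[Element]:
    """Return every matching element: one output row per match."""
    return [element for element in page if element["css_class"] == css_class]


def find_first(page: List[Element], css_class: str) -> Optional[Element]:
    """Return only the first match, or None when nothing matches."""
    matches = find_all(page, css_class)
    return matches[0] if matches else None


all_errors = find_all(PAGE, "error")      # two rows for a single input record
first_error = find_first(PAGE, "error")   # at most one row per input record

print(len(all_errors))
print(first_error["text"] if first_error else "no error found")
```

With the first-match variant, every input record yields at most one row, so a later step that expects a single element per record does not receive duplicates.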

:blush:


With your timely and knowledgeable help, Armin, I was able to do exactly what I wanted to do. Thank you, thank you, thank you!!

I was able to set up the same-ish workflow for another website of similar nature. However, I have a few things that I am struggling with and hoping you can help :pray: :heart:

The attached workflow is not achieving the following:

  1. not handling the error when TQ number is missing (second row in test table);
  2. writes all variables into the table only for TQ record, not other records;

I have noticed that the website may refresh a lot while I am scraping results, so I am wondering if it can also be done differently (get all [span [3]] values at once and then somehow transpose them into columns)? I tried, but neither collections nor ungrouping seem to work with WebElements (or it is probably my lack of skills).

At the end of the day I want to achieve: entering main number, entering TQ if required, submit the form and if I hit errors just declare “Error” or if I get results, scrape the results and write them into csv. Same as before but with a differently designed website.

Please help!

Commodity Tax Validator - QST.knwf (115.4 KB)

Wow, impressive workflow! And kudos to @armingrudd for his awesome support!

  1. not handling the error when TQ number is missing (second row in test table);
  2. writes all variables into the table only for TQ record, not other records;

If I understood correctly, it wouldn’t make sense to query in case of missing TQ numbers anyways? How about just skipping these rows by filtering them out with a Row Filter node? If I misunderstood things, please let me know.

get all [span [3]] values at once and then somehow transpose them into columns?

At which location in the workflow does this happen? I couldn’t figure this out quickly.

Best,
Philipp

PS: Unrelated to your question, but I noticed you’re using several “Wait” nodes in your workflow and a [x] Wait for … setting in the following “Find Elements” nodes. You can probably get rid of the “Wait” nodes: the “Wait for” setting is smarter, as it only waits until the desired element becomes available (whereas “Wait” always sleeps for the specified amount of time).
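The difference between the two waiting styles can be sketched in plain Python. This is only an illustration of the polling idea, not how the Selenium Nodes implement it; `wait_for`, `probe`, and the timings are hypothetical.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def wait_for(probe: Callable[[], Optional[T]],
             timeout: float = 5.0,
             interval: float = 0.05) -> T:
    """Poll `probe` until it returns a value, or raise once `timeout` has passed."""
    deadline = time.monotonic() + timeout
    while True:
        result = probe()
        if result is not None:
            return result  # return as soon as the element is available
        if time.monotonic() >= deadline:
            raise TimeoutError(f"nothing found within {timeout} seconds")
        time.sleep(interval)


# Simulated page: the element becomes available roughly 0.2 seconds from now.
appears_at = time.monotonic() + 0.2


def probe() -> Optional[str]:
    return "element" if time.monotonic() >= appears_at else None


start = time.monotonic()
found = wait_for(probe, timeout=5.0)
elapsed = time.monotonic() - start
print(found, round(elapsed, 2))
```

A fixed `time.sleep(5.0)` would always cost the full five seconds, while the polling version returns shortly after the element shows up and only uses the full timeout when the element never appears.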


Re: TQ
The main number may be the end of the story, or it may require a TQ code, in which case the website provides a TQ field to be filled in and also validates whether the provided TQ is correct. So, if TQ is required, you need to go through the additional steps; if TQ is not required, the website will not ask for it and will just provide the results. All records are needed.

Re: wait
The website responsiveness is inconsistent so I enhanced the wait time with Wait node. For now, I want the main parts of the flow to work. I will look at playing with Wait for after that. Thank you for your reply!

At Node 123 (3rd node in the scrape results portion) the flow finds each value of the innerHTML separately, so we have many find/extract pairs. I am wondering if we can replace them with one Find and one Extract and then deal with row-to-column transformations?

Currently this is the way of doing it. I know that extracting several values is a bit clumsy, and we have some ideas how to provide a “convenience node” which combines several extraction steps in the future. Stay tuned.
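For reference, the reshaping the question asks about is simple once all values are available as one list. Here is a minimal Python sketch outside KNIME; the labels and values are hypothetical placeholders, and it assumes the values always arrive in the same order.

```python
from typing import Dict, List


def values_to_row(labels: List[str], values: List[str]) -> Dict[str, str]:
    """Turn a list of extracted values into one row keyed by column label."""
    if len(labels) != len(values):
        # A count mismatch usually means the page layout differs from what was expected.
        raise ValueError(f"expected {len(labels)} values, got {len(values)}")
    return dict(zip(labels, values))


# Hypothetical labels and values; the real page fields are not listed in the thread.
labels = ["name", "number", "effective_date"]
values = ["Example Inc.", "1234567890", "2019/06/24"]

row = values_to_row(labels, values)
print(row)
```

The length check matters for a page that refreshes often: a partial read is rejected instead of being written into the wrong columns.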

Looking forward to some changes :slight_smile:
Do you have any feedback on why the workflow does not write variables for all rows, or on how to handle the TQ exceptions and proceed?

If I understood correctly, this is the case in row 2 (zero-based index) of your sample data? The issue in this case lies here:

“Find Elements” returns an empty table, so all following steps are basically skipped. You’ll need to handle the case of an empty table (i.e. no TQ input requested by the site), and in this case make sure that you fall back to the previous table (i.e. the input to the highlighted “Find Elements”).
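The fallback logic can be sketched in a few lines of Python. This is only an illustration of the idea, with made-up table contents and a hypothetical helper name `with_fallback`; in the workflow the same decision would be made with KNIME nodes.

```python
from typing import Dict, List

Row = Dict[str, str]


def with_fallback(previous: List[Row], found: List[Row]) -> List[Row]:
    """Continue with `found` when it has rows, otherwise with the table before the search."""
    return found if found else previous


# Made-up tables: the record as it looks before searching for the TQ input field.
before_search: List[Row] = [{"main_number": "1234567890"}]

# Case 1: the site asks for a TQ code, so the search returns a row.
tq_requested: List[Row] = [{"main_number": "1234567890", "tq_field": "found"}]
# Case 2: the site does not ask for a TQ code, so the search returns nothing.
tq_not_requested: List[Row] = []

continue_with_tq = with_fallback(before_search, tq_requested)
continue_without_tq = with_fallback(before_search, tq_not_requested)

print(continue_with_tq)
print(continue_without_tq)
```

In the second case the record is not lost: the flow carries on with the table it had before the search, so the results for non-TQ records can still be written.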

Does this make sense?


Not really. I am thinking that I am relying on the ‘missing result’ and no exceptions. So, I am assuming the workflow should record the result as missing, not return empty.
I have updated the workflow to handle TQ exception (rather awkwardly judging by the time the workflow hangs around that node before proceeding) but at least it does not halt.
I am still not writing data for non-TQ rows and not sure how to fix it. Please help. Maybe updating the workflow will be easier than explaining it to me, I am fine with that too :slight_smile:

Commodity Tax Validator - QST.knwf (119.9 KB)
