Try-Catch with multiple catches?

Omega · June 2, 2020, 4:50pm

Hi,
this is my first question in this community, and I hope I have chosen the correct place and topic for it. If not, please give me a hint where to place such a question. Thank you very much.

I have a question regarding try-catch constructions, especially w.r.t. multiple catch branches. I have created an example workflow, attached to this post. The basic idea of the workflow is as follows: I want KNIME to list all XML files in a given directory and then loop over them, reading each file and writing some of its content into two separate database tables. All XML files have the following structure:

<example_data>
	<aaa>value1a</aaa>
	<bbb>value1b</bbb>
	<ccc>value1c</ccc>
</example_data>

The workflow works correctly as long as my input XML files are correctly filled. However, errors might occur when trying to write the data into one of the database tables. To simulate this, let us assume the database tables do not accept NULL values. As long as all my values are non-NULL, everything works fine. When aaa is not filled, the upper DB Insert node (for table1) fails, but table2 can be filled correctly. Vice versa, when ccc is not filled, the lower DB Insert node (for table2) leads to an error, but table1 can be filled correctly. I need a catch construction that catches both possible errors.

In the example workflow, I have built my own kind of “error logging”. For each file, some log information is written into a (third) database table. In case of any error, the catch-node should forward the error information, and the subsequent nodes write that information to the log-table.

My problem: In my example workflow, the catch-node only reacts to errors w.r.t the upper branch (table1). Errors w.r.t. the lower branch (table2) are not detected and therefore do not show up in my log-table. I need a way to somehow give both DB insert outputs as input to the catch-node. Or do I need two catch-nodes (for only one try-node)?

How can I construct a try-catch construction for this example workflow that has the following properties:

Independent of whether any error occurs or not, the loop always loops over all files in the directory. (That is, when an error occurs, the workflow does not simply stop, but continues with the next file.)
When an error occurs either w.r.t. table1 or w.r.t. table2, the error is caught by a catch-node such that it gets written into my log-table. If possible, it would be preferable to also log some information about where the error(s) occurred (e.g., by getting their node ID(s)).
When no error occurs, the log-table still gets a new line, but it only shows “-” in the error-columns.

How can I achieve this?

Thank you very much for your help!

try_catch_example.knwf (49.7 KB)
minimal_example_data_1.xml (94 Bytes)

By the way, I’m a KNIME beginner, so my workflow might be overly complicated. If you have any suggestions how to simplify it (for example, w.r.t. the loop-end node), I’m happy to hear about it, too. Thank you.

Omega · June 3, 2020, 7:59am

Here’s a screenshot of my KNIME workflow. Maybe that helps to get a first impression. The full workflow is attached in the original post.

AlexanderFillbrunn · June 3, 2020, 8:03am

Hi,
Can’t you simply connect the bottom DB Insert to the top one with a flow variable connection to execute both in the same branch leading to the Catch node? Since the two DB Inserts share a connection, KNIME will only execute one at a time anyway, I think.
Kind regards,
Alexander

PS: And welcome to the KNIME Forum! This is definitely the right place and topic for this question

Omega · June 3, 2020, 8:48am

Hi Alexander,
thanks a lot for your answer. Do you mean this?

In this case, the lower DB Insert (table2) is always executed before the upper DB Insert (table1). Consequence: If the lower DB Insert fails, the upper DB Insert is no longer executed.

Unfortunately, this does not solve my problem. Maybe I have not been precise enough. What I need is:
a) If both nodes work, there is no error.
b) If only the upper node fails, the lower node should still be executed and the failure w.r.t. the upper node should be caught and logged.
c) Vice versa: If only the lower node fails, the upper node should still be executed and the failure w.r.t. the lower node should be caught and logged.
d) If both nodes fail, both errors should be caught and logged.

Your solution satisfies (a)+(b), but not (c)+(d).
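
For readers who think in code, requirements (a) to (d) correspond to giving each insert its own try block instead of sharing one. The following is only a rough Python analogy, not a KNIME workflow: the in-memory SQLite database, the single-column schema, and the helper names `make_demo_db` and `insert_both` are assumptions made for illustration.

```python
import sqlite3


def make_demo_db():
    # Hypothetical schema: each table holds one NOT NULL column.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (aaa TEXT NOT NULL)")
    conn.execute("CREATE TABLE table2 (ccc TEXT NOT NULL)")
    return conn


def insert_both(conn, record):
    """Try each insert in its own try block and return the errors by table."""
    inserts = [
        ("table1", "INSERT INTO table1 (aaa) VALUES (?)", record.get("aaa")),
        ("table2", "INSERT INTO table2 (ccc) VALUES (?)", record.get("ccc")),
    ]
    errors = {}
    for table, sql, value in inserts:
        try:
            with conn:  # commits on success, rolls back on failure
                conn.execute(sql, (value,))
        except sqlite3.IntegrityError as exc:
            # Caught here, so the next insert still runs.
            errors[table] = str(exc)
    return errors


if __name__ == "__main__":
    conn = make_demo_db()
    print(insert_both(conn, {"aaa": "value1a", "ccc": "value1c"}))  # case (a)
    print(insert_both(conn, {"aaa": None, "ccc": "value1c"}))       # case (b)
    print(insert_both(conn, {"aaa": "value1a", "ccc": None}))       # case (c)
    print(insert_both(conn, {"aaa": None, "ccc": None}))            # case (d)
```

Because the errors are keyed by table name, a failure in one insert neither hides nor overwrites a failure in the other.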

AlexanderFillbrunn · June 3, 2020, 9:01am

Hi,
What if you wrap each into a try-catch-block independently?
Kind regards,
Alexander

ipazin · June 4, 2020, 2:47pm

Hi there @Omega,

DB Insert has an option not to fail on error and to continue instead. It can also append status columns, so maybe you don’t need a Try/Catch sequence in this case. Instead, you could check the status column and, based on it, fill your log DB table with the appropriate content.

Br,
Ivan
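
The idea of recording a status per row instead of failing can be sketched outside KNIME as follows. This is a Python analogy under stated assumptions, not the DB Insert node's implementation: the status strings and the helper names `insert_with_status` and `failed_rows` are hypothetical, and SQLite stands in for the real database.

```python
import sqlite3


def insert_with_status(conn, sql, rows):
    """Insert rows one by one; record a status per row instead of raising."""
    statuses = []
    for row in rows:
        try:
            with conn:  # commits on success, rolls back on failure
                conn.execute(sql, row)
            statuses.append("ok")
        except sqlite3.Error as exc:
            statuses.append(f"failed: {exc}")
    return statuses


def failed_rows(rows, statuses):
    """Keep only the rows whose status reports a failure."""
    return [(row, status) for row, status in zip(rows, statuses) if status != "ok"]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (aaa TEXT NOT NULL)")
    rows = [("value1a",), (None,), ("value2a",)]
    statuses = insert_with_status(conn, "INSERT INTO table1 (aaa) VALUES (?)", rows)
    # Only the failed rows would be forwarded to the log table.
    for row, status in failed_rows(rows, statuses):
        print(row, status)
```

With this shape, the logging step is a plain filter on the status values, and no exception handling is needed downstream.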

Omega · June 5, 2020, 1:59pm

Hi,
thank you very much for both your answers. I have not fully checked your approach yet, Ivan, but I will do so. Maybe that helps. Nevertheless, I would also love to get a better understanding of the try-catch construction in KNIME. I have some general questions:

1.) Does a KNIME workflow always require the number of try-nodes to be equal to the number of catch-nodes? Or is it possible to use one try-node with multiple catch-nodes? Similarly, is it possible to use multiple try-nodes with only one catch-node? If yes, are such constructions advisable or should they be avoided?
2.) Is it possible to nest multiple try-catch constructions within each other? If yes, how does a catch-node know which try-node it belongs to?

Maybe I simply do not understand or use the catch-node correctly:
3.) In my original workflow, I need two auxiliary nodes (between catch and loop end) just for closing the loop. Isn’t there a smarter way to directly close the loop after the catch-node such that the next iteration starts independently of whether there was an error or not in the inner part of the loop?
4.) If I directly connect catch and loop-end via the output port (port 0) of the catch-node, I get a runtime error “Active Scope End node in inactive branch not allowed.” How do I use the output port of the catch-node correctly? The description says “Original input or default if execution failed.” Shouldn’t this imply that the output port can still be used in case of an error caught by the catch-node? Why can’t I connect it to the loop-end node then? Where can I see or specify the mentioned default values?
5.) In my original workflow, my logging will happen for each iteration of the loop. Suppose I only want to log lines in case of an error previously caught by the catch-node. How could I do this? Of course, I might add an additional IF-node behind the catch-node to first check the value of the “FailingNode” variable, for example. However, this seems to be overly complicated, doesn’t it? I guess there is a simpler way to tell the catch-node “if no error occurred, do step A, if an error occurred, do step B”.

Regarding your solution, Alexander, this should technically work. If I understand you correctly, I would need two try-nodes and two catch-nodes. Can you please help me implement your solution? I have started as can be seen in the following screenshot. However, what would be the second input (“default input”) for these catch-nodes? Can I simply use the same input for both input ports of the catch-node, or would this be stupid? How can I finally bring the two catch-nodes together to be able to close the loop properly? And how can I connect both catch-nodes to my logging logic in the yellow box without duplicating all these nodes? (In my actual workflow, the logic in the yellow box contains several more nodes and, moreover, I do not only have two DB Insert nodes, but up to ten of them. Therefore, I’m looking for a simple solution that does not require too many extra nodes, as this would quickly explode…)

Again, let me thank you very much for your helpful comments and suggestions!

AlexanderFillbrunn · June 5, 2020, 2:25pm

Hi,
for question 1: Each try needs a catch, it should always be a 1:1 relation.
Question 2: I am not sure about nesting, but I doubt that it will work. What should happen, anyways? Any error occurring in the inner block will be caught there and cannot be propagated to the outer catch. So instead you can just use multiple blocks for multiple sections of your workflow.
For 3.: Do you need to collect any data in the loop end? If not, you can use the Variable Loop End instead.
4.: I think that’s because the output is inactive if there was an error and the loop end cannot handle it. You can use an End IF node and an Active Branch Inverter to create a fallback for an inactive table port.
5.: I think the IF-solution is the way to go here.

As for my solution, it may be enough to connect the flow variable port of one catch node to the hidden top left port of the next try node. However, you should probably first rename the flow variable with the failing node using a String Manipulation (Variable) so it does not get overwritten in case the second insert also fails. With 10 inserts this gets a bit tedious of course. Maybe you can put the logic into a parameterizable component and call that in a loop? The component has a table input and some parameters such as the table name and outputs a table with results. Inside you can use a try-catch block. In a loop, you go through your list of parameters and call the component every time, collecting all the log output in a normal loop end. Does that make sense? Otherwise I can create a simple example workflow.
Kind regards,
Alexander
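
The parameterizable-component idea can be outlined in code as one guarded routine that is called once per parameter set, with the results collected at the end. The sketch below is a Python analogy only: `insert_component` stands in for the component, `process_file` for the loop around it, and the table and column names are assumptions. Table and column names are interpolated into the SQL text because identifiers cannot be bound as parameters, so they must come from a trusted list.

```python
import sqlite3


def insert_component(conn, table, column, value):
    """Stand-in for the component: one guarded insert, one result row."""
    try:
        with conn:  # commits on success, rolls back on failure
            conn.execute(f"INSERT INTO {table} ({column}) VALUES (?)", (value,))
        return {"target": table, "error": "-"}  # "-" marks "no error"
    except sqlite3.Error as exc:
        return {"target": table, "error": str(exc)}


def process_file(conn, file_name, record, targets):
    """Call the component once per target and collect one log row per call."""
    log = []
    for table, column in targets:
        result = insert_component(conn, table, column, record.get(column))
        log.append({"file": file_name, **result})
    return log


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (aaa TEXT NOT NULL)")
    conn.execute("CREATE TABLE table2 (ccc TEXT NOT NULL)")
    targets = [("table1", "aaa"), ("table2", "ccc")]  # the parameter list
    record = {"aaa": None, "ccc": "value1c"}
    for row in process_file(conn, "minimal_example_data_1.xml", record, targets):
        print(row)
```

Adding a tenth insert then means adding one entry to `targets` rather than another try-catch pair, and each log row names its own target, so a later failure cannot overwrite an earlier one.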

Omega · August 27, 2020, 12:46pm

Thank you for your help and sorry for the late reply!
