I am running KNIME AP 4.3 and Server 4.12 on a distributed executor setup in AWS.
I have a WebPortal workflow which can occasionally fail late in the process. When this happens there seems to be a disconnect between the job status as listed in the workflow view and reality. Specifically, the job is shown as successfully executed even when it has failed.
In this case it is only mildly annoying, but I can imagine it could be quite serious if a failure went undetected and workflows were assumed to have completed.
That is definitely frustrating. Could you share the error that is occurring in the component?
One question that comes to mind: are there any loops inside your component?
Hi Ana, Wali,
@wkhan I believe the error is a result of the job having been cached to disc and a (legacy) temporary directory not being restored, but I haven’t had a chance to look deeper (currently on paternity leave). @ana_ved yes, there is a loop there.
One small hint I noticed right before I closed my work computer last night: the job had a state of SUCCESS when viewed from the home directory of the workflow, but FAILED_WITH_CONTENT or something similar when viewed from the admin/jobs page.
@ana_ved was kind enough to send me a note offline. We think this is a bug where the WebPortal sometimes doesn’t display the job state correctly when loops fail. We’ve opened a ticket for this, so thanks for letting us know. The ticket is WEBP-838 for your records.