Hi,
I am trying to use the Call Remote Workflow (Row Based) node to call another workflow. In this sample example, I am just trying to call a workflow and save the output file from the called workflow in our HDFS. I have added pictures of:
- call_workflow
- called_workflow
After the execution, as shown in call_workflow, when I tried to access that DataFrame, the result was something like this.
I can’t figure out the reason behind this problem. Instead of this empty set, there should be a dataset of 1000 rows.
Hi,
Have you set “From Column” as the input data to be sent in the Call Remote Workflow node? For that option to appear, you need to click the “Load input format” button.
Kind regards,
Alexander
Hi,
I did that, and this time the called workflow got executed. However, I think each row is overwriting the previous one, because the final output file contains only one row, which is the last row of the input file.
Hi,
In your Spark to Parquet node, did you specify what should happen when the file already exists? I think you are looking for the “Append” option here; currently, the workflow is simply overwriting the file on every call.
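For reference, the difference is easy to see in plain PySpark (a minimal sketch of the write modes, not the node’s internals; the HDFS path is a placeholder):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("parquet-append-demo").getOrCreate()

# One batch of rows, standing in for a single call of the workflow
df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# mode("overwrite") replaces the file on every call, so only the last
# batch survives; mode("append") adds the new rows to the existing data.
df.write.mode("append").parquet("hdfs:///tmp/output.parquet")  # placeholder path
```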
Additionally, you might want to have a look at the Call Workflow (Table Based) node instead of the row-based one. You can simply send the whole table and then write it all to the Parquet file at once. Be aware, though, that this is only feasible for data sizes up to around 100MB.
Kind regards,
Alexander
Since I am working with Big Data, does that mean using the Call Workflow nodes won’t work? If that is the case, what is the alternative?
Hi,
I don’t think this is the case. You can just turn on the “Append” option in your Parquet Writer and it should then output more than one row.
Kind regards,
Alexander
I am asking with regard to the Call Workflow nodes. Suppose I want to call a workflow using a Call Workflow node: is there any limitation on the amount of input data that can be passed through the Call Workflow node to the called workflow?
Oh, I see! Sorry for the confusion. Yes, the “soft limit” is around 100MB. If your data is larger than that, it makes more sense to upload it to the server using the normal file handling nodes (e.g. File Upload before KNIME 4.3 and Transfer Files from KNIME 4.3 on). After the upload you can then access the data in your workflow using a KNIME Server Connector node.
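If you ever script this outside of KNIME, the server also exposes workflows over its REST API. Here is a rough Python sketch; the server URL, repository path, and credentials are placeholders, and you should verify the exact :execution endpoint against your KNIME Server’s REST documentation:

```python
import requests

SERVER = "https://your-knime-server:8443/knime"  # placeholder: adjust to your setup
WORKFLOW = "/Examples/called_workflow"           # repository path of the called workflow

# Trigger an execution of the deployed workflow (KNIME Server REST API v4;
# check the exact path and response format in your server's documentation).
resp = requests.post(
    f"{SERVER}/rest/v4/repository{WORKFLOW}:execution",
    auth=("user", "password"),  # placeholder credentials
    timeout=600,
)
resp.raise_for_status()
print(resp.status_code, resp.json())
```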
Kind regards,
Alexander