How to solve this Joiner error in loop operation?

Hello there,

I have two groups of files, each containing several hundred CSV files.
My task is simply to merge pairs of files, one from each group, and export each result to a new file.

I think the solution is to use a Joiner node inside a dual loop, merging on a key column.
But I don’t know how to set up and control a dual loop for this.

As a test, I simply created two separate loops (not a dual loop) to read the files from each group, and then tried the Joiner.
That produces the following error:

WARN Joiner 3:41 Unable to merge flow object stacks: Conflicting FlowObjects: <Loop Context (Head 3:40, Tail unassigned)> - iteration 0 vs. <Loop Context (Head 3:34, Tail unassigned)> - iteration 0 (loops/scopes not properly nested?)

The test workflow looks like this:
(image: WS000000)

Please give me a tip on how to tackle this. Thanks!

Hi there,
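If I’m right, you need to add a Loop End node between each CSV Reader node and the Joiner node. The warning says the Joiner is receiving data from two loops that are both still open (Head 3:40 and Head 3:34, each with no tail assigned), so each loop has to be closed before the join.

That should solve your problem.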

Br
Hermann

Hi @morpheus,

Thank you for your response.

You are right, the error should go away if I end each loop before the Joiner node.
But I have another concern.

Each group contains CSV files larger than 4 GB, so the total data size is huge.
If I end the loop right after the CSV Reader, I think all the files will be collected in memory at that point.
Will that work properly? :sweat_smile:
I’m not sure, so I’m currently looking for a way to do the merge inside the loop, file by file.
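
To make the "file by file" idea concrete outside of KNIME, here is a minimal Python sketch. It assumes, purely for illustration, that the two groups live in two folders and that files belonging together share the same file name; the function name `paired_files` and that pairing rule are hypothetical and not taken from the workflow above.

```python
from pathlib import Path
from typing import Iterator, Tuple


def paired_files(group_a: Path, group_b: Path) -> Iterator[Tuple[Path, Path]]:
    """Yield (file_a, file_b) pairs that share a file name, one pair at a time.

    Assumption: a file in group_a belongs to the file of the same name in
    group_b. Files without a partner are skipped.
    """
    for file_a in sorted(group_a.glob("*.csv")):
        file_b = group_b / file_a.name
        if file_b.is_file():
            # Only paths are yielded here; no file content is read yet.
            yield file_a, file_b
```

Because this is a generator, the caller handles one pair per iteration, so nothing forces the content of all files to be held at the same time.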

Hi @qianyi,

It depends on what you want to join. Do you want to join the contents of the files, or just pair up files that have the same filename, regardless of their content?

Hi @morpheus,

I need to join the content of the files.

The goal is to merge the columns from each file. The merge key is a Date&Time column that has been prepared in each file. I don’t know whether any node other than Joiner can do this.

Maybe ending the loop for only one of the CSV Readers (in a dual loop) would work?
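
As a rough illustration of that last idea (collect one side, stream the other), here is a Python sketch of an inner join on a key column using only the standard `csv` module. It is not how the Joiner node is implemented. It assumes the smaller file fits in memory, that key values are unique in the smaller file, and that the two files share no column names other than the key; the function name `join_on_key` is hypothetical.

```python
import csv


def join_on_key(small_path, large_path, out_path, key):
    """Inner-join two CSV files on `key`, streaming the larger one.

    Assumptions: the smaller file fits in memory, its key values are unique,
    and the only column name the files share is `key`.
    Returns the number of rows written.
    """
    # Load the smaller file into a dict keyed by the join column.
    with open(small_path, newline="") as f:
        reader = csv.DictReader(f)
        extra_cols = [c for c in (reader.fieldnames or []) if c != key]
        lookup = {row[key]: row for row in reader}

    matched = 0
    with open(large_path, newline="") as fin, \
            open(out_path, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        out_cols = list(reader.fieldnames or []) + extra_cols
        writer = csv.DictWriter(fout, fieldnames=out_cols)
        writer.writeheader()
        # Read the larger file row by row; only one row is held at a time.
        for row in reader:
            match = lookup.get(row[key])
            if match is None:
                continue  # inner join: drop rows without a partner
            for col in extra_cols:
                row[col] = match[col]
            writer.writerow(row)
            matched += 1
    return matched
```

If neither file of a pair fits in memory, this sketch does not apply as written.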

Hi @qianyi,

Maybe you can solve it with a workflow configuration like the attached image.

Hi @morpheus and @qianyi,

I don’t know the structure of your CSV files. But if both files have the same number of records and the rows are in the same order on your key column, you can use a Column Appender node instead of a Joiner node. This will speed up the matching step.
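
To show why those two conditions matter, here is a small Python sketch of a positional append that checks both of them. It only illustrates the idea and is not the Column Appender node itself; the function name `append_columns` and the rows-as-dicts representation are assumptions made for the example.

```python
from typing import Dict, List


def append_columns(left: List[Dict[str, str]],
                   right: List[Dict[str, str]],
                   key: str) -> List[Dict[str, str]]:
    """Glue rows together by position, verifying that the keys line up."""
    if len(left) != len(right):
        raise ValueError("row counts differ; a key-based join is needed")
    merged = []
    for i, (l_row, r_row) in enumerate(zip(left, right)):
        if l_row[key] != r_row[key]:
            raise ValueError(f"key mismatch at row {i}; rows are not aligned")
        # No lookup is required: row i of one table pairs with row i of the other.
        merged.append({**l_row, **r_row})
    return merged
```

When either check fails, matching by position would silently pair unrelated rows, which is the case where a join on the key column is needed instead.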

Hi @morpheus,

Thanks for your reply.
I tried what you suggested, but in the end it failed. I think it might still work if I make some changes to the workflow.

I will upload my workflow later, together with a sample of the data (which has been normalized), to help explain the data and what the workflow does.

Hi @HansS,

Thanks for your advice.
The record counts and the key column values differ between the files, so Column Appender will not help with this task.
I will upload the workflow and the data in my next reply.

I realized this problem is not only about the Joiner node; it is a large-file merge issue.
So I uploaded my workflow in a new topic here:
https://forum.knime.com/t/how-to-merge-large-files-without-memory-overflow-in-knime/15386

Please see the additional information there and share your advice.

Thanks again!
