Greetings. I am trying to partition over 2 million rows of data into groups of 100 rows each. The straightforward way I know is to use the Partitioning node (as shown in the attached screenshot), but because my data is huge, that would mean repeating it more than 20k times! Is there any way I can do this efficiently?
Thank you. I have looked at several examples of workflows that use that node. It seems like it needs its partner, the Loop End node, right? Can you help me with the positioning of the nodes within the context of my own workflow?
Hi @badger101 , yes, any loop needs to have a start and end.
For example:
So, that would come either after the Duplicate Row Filter or the Math Formula, depending on why you are giving each of them a count number.
FYI, it's better to use the Counter node to generate a count number, or you can use my component, which generates an auto-increment number in a column for you.
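For readers who want to see the idea outside KNIME: the auto-increment column that the Counter node (or the component) produces is just a running row index. Here is a minimal Python sketch; the `rows` data and the `count` column name are made up for illustration, not part of the actual workflow:

```python
# Hypothetical stand-in for the table reaching the counter step.
rows = [{"name": n} for n in ["a", "b", "c", "d", "e"]]

# Append an auto-increment column, starting at 1, like the Counter node would.
for i, row in enumerate(rows, start=1):
    row["count"] = i
```

With `count` in place, integer-dividing it by the batch size (e.g. `(count - 1) // 100`) gives each row a group number without any looping.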
Component:
Thank you. This is the solution for what I was asking, so I will mark it as Solved. Although I've now noticed that my end goal isn't achieved, because I might have been asking the wrong question, so I'll make a new topic.
@badger101 you could try and adapt this workflow by calculating the number of chunks and exporting the result to separate files instead of one single CSV file.
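To make the "number of chunks, one file per chunk" idea concrete, here is a small Python sketch of the equivalent logic (fixed-size chunks written to separate CSV files). The function name, file-name pattern, and chunk size are illustrative assumptions, not part of the KNIME workflow itself:

```python
import csv
import math

CHUNK_SIZE = 100  # same batch size as in the question

def write_chunks(rows, fieldnames, chunk_size=CHUNK_SIZE, prefix="batch"):
    """Split rows into fixed-size chunks and write each chunk to its own
    CSV file (prefix_00000.csv, prefix_00001.csv, ...). Returns the
    number of chunks written."""
    n_chunks = math.ceil(len(rows) / chunk_size)
    for i in range(n_chunks):
        chunk = rows[i * chunk_size:(i + 1) * chunk_size]
        with open(f"{prefix}_{i:05d}.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(chunk)
    return n_chunks
```

For 2 million rows and a chunk size of 100 this yields `math.ceil(2_000_000 / 100)` = 20,000 files, which matches the scale mentioned in the original question; in KNIME the same effect comes from a Chunk Loop Start / Loop End pair around a CSV Writer.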
Yes, I'm assuming you most probably want to know what to do next.
But as far as creating batches is concerned, it’s been answered here. I look forward to your question in the next thread, and hopefully we can help you there.