Converting rows into columns - but there's a catch!

Hi. I have a file enlisting more than 2 millions of items, where each row/line represents one item. The items are the only data, hence upon converting the list to a table format, the table only have one column.

e.g.
Animals
-Dog
-Cats
-Giraffe
-Orang-utan
-Centipede
-Starfish
and so on .


My end goal is to give it a structure. The end structure should follow this format:

Column 1 | Column 2 | Column 3 | ........ | Last column
  • Dogs - Ants - Bees … …
  • Cats - Milipede - Lion … …
  • Giraffe - Bird - Tigress … …
  • Orang-utan - Duck - Jellyfish … …
  • Starfish - Beetle - Mantis … …

The rules are such that each column must have only 100 items (i.e. 100 rows), with the exception of the last column. Orders are not important. And column names are not categories, since my end goal is to give the data a structure rather than a label.


I have tried to approach my problem by using the Chunk Loop Start Node with a setup of 100 rows. But I ended up with two columns only (Column 1 was the items, and Column 2 was the Iteration Number). From the result, I have tried many nodes such as Pivot, Unpivoting, GroupBy etc. but it seems like I cant get to my end goal with that approach.

Please help. Thank you!

Hi @badger101 ,
hope this workflow can help
KNIME_project.knwf (212.2 KB)

6 Likes

Hi @badger101 , this should work:
image

I defined a variable for max number of rows per column, so you can configure it. In my demo, I set it to 5 since I don’t have a lot of data (you would set it to 100 in your case):
image

Here’s my sample data:
image

It has 16 rows, so configured with max rows as 5 per column, we should end up with 4 columns, with the 4th column containing only 1 row (16 - (3 x 5)).

Result:
image

And if I play around with the setting, for example, setting the max rows to 2:
image

Results:

We get the expected 8 columns, each with 2 rows.

Here’s the workflow: Move rows into columns with defined max rows.knwf (14.7 KB)

EDIT: The Counter Generation is not needed. We can use RowID directly without it:
image

Here’s the updated workflow: Move rows into columns with defined max rows.knwf (13.3 KB)

1 Like

@bruno29a and @duristef … thanks for the replies. I have just downloaded and installed the latest version of KNIME a few minutes ago. Since I have not exported my previous workflow to my desktop, I have no backup. The only saved items I found are the nodes I used in that workflow:

It seems I have to restart over again before I can proceed with testing the solutions you guys proposed. I’ll revisit this project of mine tomorrow. Anyways, really appreciate this!

Nice one @duristef , I like what you did. Nice use of the Column to Grid there

Hi @badger101 , if you make your new version of Knime point to the same workspace where your old version was pointing, you should see your workflows in your new Knime.

1 Like

Thank you, @bruno29a! As usual, different solutions are proposed for the same problem. For me this is a great way to learn fresh approaches and new techniques

1 Like

Oh yes that works like a charm!

It’s already 3 AM here so I’ll be continuing tomorrow. Have a good day!

I’ll definitely look into both solutions tomorrow and will give an update :ok_hand:

Thank you! Both solutions worked. Because @duristef 's workflow doesn’t require a Loop Node, I don’t have to wait for iterations to finish the process, hence I’m choosing it as the Solution. I’ve never used the “grid” node before, so I learned something new today!

2 Likes

Hi @badger101 , yes, I would go for @duristef 's solution. It’s definitely much faster than the loop.

Likewise, I never used the Column to Grid node, but come to think of it, the name of the node is exactly the action you are trying to do (a grid is basically a “table” with multiple columns and multiple rows)

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.