Unpivot varying number of columns

Hi Everyone, I have a distance matrix generated from outside of KNIME and I’m trying to get it into a unpivoted list of pairs. The format for each line is the follows…

Col1 Col2 Col3 Col4 Col5
QueryID1 MatchID1 MatchDist1 MatchID2 MatchDist2
QueryID2 MatchID1 MatchDist1 MatchID2 MatchDist2

and so on where there are multiple rows and columns with an unknown number of “Matches” ahead of time.

I can unpivot If I choose Col1 for Value and Col2 Col3 for Retained then pull a second unpivot node and Choose Col1 for Value and Col4 Col5 for Retained and Concat the results. and so on. This is a large matrix where each query has a different number of matches.

Any idea how I could do this? I thought I could pass columns and unpivot and iterate through the table but I don’t think there is a group loop for columns…

Any help, ideas, suggestions related or unrelated to my thoughts would be greatly appreciated.

Thanks,
Jason

Maybe you could use the Table Row to Variable Loop Start node? I was thinking maybe you could feed it a table of the columns you want to retain in each iteration, and then pass those columns as flow variables to the Unpivot node…

Hi @ScottF ,

Thanks for the reply. I think I see where you are going but I haven’t had luck. I can pass one row with a Chunk loop then into a row to variable loop start but there is no way to index through the columns for choosing in the pivot node. I’ve attached the matrix and my initial attempts. Let me know what you think.

Thanks,
JasonIterate_unpivot.knwf (15.6 KB)

Here’s something I came up with. This does unpivoting for the first two columns of interest - in the first iteration of the loop, that’s column2 and column3 - then renames those columns and appends results from each iteration in the Loop End node.

Setting the flow variables properly is a little tricky. You will have to drill down inside the Unpivoting and Column Rename nodes to see how exactly I changed them.

The end results are a bit messy and can be tightened up, but since I wasn’t 100% sure I was solving your problem effectively, I’ll leave that to you :slight_smile: The other bit that can be improved is automating the detection of the column names - right now they are manually included in the 2nd Table Creator node, which isn’t ideal of course.

Let me know if this helps.

UnpivotingOverColumnLoopExample.knwf (18.9 KB)

1 Like

Hi,
I have a new Idea as I think the solution by dear @ScottF is not fully automated and you have to input a list of paired columns (Table Creator - Column list):

In this workflow I used a “Table Specs” node to find the number of columns (as you mentioned the number of the pairs of MatchID and MatchDist is unknown to you) then I calculated the required number of loops by subtracting 1 from the column count and dividing it by 2 and then this number is fed to the “Counting Loop Start” node.
By using the current iteration number I produced the column names in those “String Manipulation (Variable)” nodes:
1st expression:
join("column", string(2*($${IcurrentIteration}$$+1)))

2nd expression:
join("column", string(2*($${IcurrentIteration}$$+1)+1))

I assumed that the column names are like column1, column2, column3,… but if this is not true you can modify the expressions.
I used the “Unpivoting” node as you expected (the retained columns are determined by the variables) and then renamed the retained column names by using the the “Column Rename” node (again by using the variables for old column names) and closed the loop.

Please Check the workflow and let me know if it’s working fine:
unpivot.knwf (45.7 KB)

Best,
Armin

5 Likes

Thank you @ScottF for the time and effort. Apologies I’m just getting a chance to take a look!

2 Likes

@armingrudd Thank you so much. Going to take a peek very shortly. I really appreciate your and @ScottF 's efforts! :slight_smile:

3 Likes

@armingrudd I am trying this solution but unlike the workflow attached I am not getting array option for included_names in Flow variables tab of unpivoting tool. I already tried adding more than one columns to Retained Columns option in normal tab but that also did not work. How to activate arrays option for included_names in Flow variables tab? Can you advise.

Hi @amitvirat,

Can you attach your workflow here?

:blush:

sure

unpivot-no array.knwf (26.4 KB)

Well, the issue is that you don’t have assigned columns names to retained_coumns > included_names. I know you have followed the approach in my workflow, but since then a new update has come which added array type to flow variables and the option in the node also has changed. So you have to pass included names as an array instead of assigning them one by one. This means the workflow should be reworked. Just to show you how this can be done, here is the modified workflow but it’s not very neat.

unpivot-no array.knwf (57.2 KB)

:blush:

2 Likes

okay… that’s why I saw a lot of solutions having this array thing for flow variables but no matter what i tried I was not getting the option. I thought it might be with version but was not sure and wanted to check. Thanks for your help.

1 Like