From edge list to adjacency matrix

clusty · January 21, 2019, 2:54pm

Hello,

I’m trying to convert an edge list into an adjacency matrix for later use with data mining algorithms. The edge list looks like the following:

ID,value
1,2
1,3
2,1
2,2

And I would like to get:

ID,val_1,val_2,val_3
1,0,1,1
2,1,1,0

I tried the “One to Many” node, but the output still contains several rows per ID, whereas I would like a single row per ID, with ID as the unique identifier.

Any suggestions?

Thanks

armingrudd · January 21, 2019, 3:12pm

Hi,

In the “One to Many” node, include the “value” column and check the “Remove included columns from output” option. Then add a “GroupBy” node, select ID as the grouping column, and choose “Maximum” as the aggregation for all of the generated value columns (you can use “Type Based Aggregation” to apply it to all of them at once).
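For readers who want to check the logic outside KNIME, the same two steps can be sketched in Python. This is only an illustration, assuming pandas is installed; the variable names (`edges`, `indicators`, `adjacency`) are made up for the example, and the data is the sample from the question.

```python
import pandas as pd

# Sample edge list from the question
edges = pd.DataFrame({"ID": [1, 1, 2, 2], "value": [2, 3, 1, 2]})

# Step 1 (like "One to Many"): one indicator column per distinct value,
# still one row per edge
indicators = pd.get_dummies(edges["value"], prefix="val")

# Step 2 (like "GroupBy" with Maximum): collapse to one row per ID;
# the maximum is 1 if the ID has at least one edge to that value
adjacency = indicators.groupby(edges["ID"]).max().astype(int)

print(adjacency)
```

Taking the maximum rather than the sum keeps the result binary even if the same edge appears more than once.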

Best,
Armin

solution

clusty · January 21, 2019, 8:23pm

Thanks Armin, it works. One issue is that my data set is large (e.g. 100k rows, 5k columns), so the “One to Many” node is time consuming. I will use PCA/SVD afterwards to reduce the number of dimensions, but I was wondering whether there is a way to make this step run faster.
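One possible direction, not confirmed anywhere in this thread, is to build the matrix in sparse form directly from the edge list so that the dense 100k x 5k table is never materialised. A minimal sketch outside KNIME, assuming NumPy and SciPy are available; the names `ids`, `values`, `row_labels` and `col_labels` are illustrative only:

```python
import numpy as np
from scipy.sparse import csr_matrix

# Sample edge list from the question, as two parallel arrays
ids = np.array([1, 1, 2, 2])
values = np.array([2, 3, 1, 2])

# Map arbitrary IDs and values to consecutive row/column positions
row_labels, rows = np.unique(ids, return_inverse=True)
col_labels, cols = np.unique(values, return_inverse=True)

# One stored entry per edge; duplicate edges are summed on construction
data = np.ones(len(rows), dtype=np.int8)
adjacency = csr_matrix(
    (data, (rows, cols)), shape=(len(row_labels), len(col_labels))
)

# Reset every stored entry to 1 so repeated edges do not produce counts
adjacency.data[:] = 1

print(adjacency.toarray())
```

Only the nonzero entries are stored, so memory and build time scale with the number of edges rather than with rows times columns. Whether this is faster in practice than the KNIME nodes would need to be measured on the real data.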
