I have a Set column that contains a list of IDs:
My goal is to create a table with all pairs of IDs in each set:
Operation in pseudo code:
for row in table:
    for element in set:
        return (element, element+1)
Is there a way to achieve this with built-in nodes, or do I have to write a Java snippet? It would also be great if someone could share a few ideas on how to do that in a snippet.
Could you please post the data shown in your snapshot in text format? I’ll take it from there and provide a solution.
Thank you @aworker!
Here is a CSV file that contains a comma-separated list of IDs. The first column contains an identifier for the list of IDs.
ids.txt (84.1 KB)
My pleasure. Would this do the trick?
For instance, from this list:
to this result:
20220120 Pikairos Create pairs from a list or set.knwf (1004.2 KB)
Hope it helps.
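For anyone who prefers a snippet to nodes, here is a minimal Python sketch of the pairing step. It is not taken from the attached workflow: it assumes that "all pairs" means every ordered combination of two IDs from the same set, self-pairs included, and the names `all_pairs` and `rows` are hypothetical.

```python
from itertools import product

def all_pairs(ids):
    """Return every ordered pair (a, b) of IDs from one set, self-pairs included."""
    items = list(ids)
    # product(items, repeat=2) behaves like a cross join of the list with itself
    return [(a, b) for a, b in product(items, repeat=2)]

# Hypothetical input: one row identifier mapped to its list of IDs
rows = {"set1": ["5", "23", "34"]}

for key, ids in rows.items():
    for first, second in all_pairs(ids):
        print(key, first, second)
```

Each set of n IDs yields n * n rows this way, so large sets grow quickly.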
Absolutely fantastic, thank you @aworker.
Step by step I learn to think in tables and joins
Is it possible to filter the output table for unique pairs? I.e. delete 5-23 if we have 23-5?
Yes indeed, it is quite easy.
The trick here is to filter out the “doublets” based on alphabetical order. Since every pair appears twice, one of the two copies is in inverse alphabetical order, and hence you can filter out the one that is alphabetically “bigger”, for instance:
11364 34 (is removed)
34 11364 (but this one is kept)
Then you need to make sure that only one instance of each “self-pair” (for instance 34 34) is kept too, because normally there would be two. This is done using the Duplicate Row Filter node.
20220120 Pikairos Create pairs from a list or set without doublets.knwf (3.0 MB)
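Outside KNIME, the same two steps can be sketched in Python. This is only an illustration under assumptions, not the workflow's actual implementation: it follows the example rows above by keeping the copy whose first ID compares greater than or equal to the second as a string, and the name `unique_pairs` is hypothetical.

```python
def unique_pairs(pairs):
    """Keep one copy of each mirrored pair and one copy of each self-pair."""
    seen = set()
    result = []
    for a, b in pairs:
        # Step 1: drop the mirrored copy (string comparison, so "34" > "11364")
        if a < b:
            continue
        # Step 2: drop exact duplicates, as a duplicate row filter would
        if (a, b) not in seen:
            seen.add((a, b))
            result.append((a, b))
    return result

pairs = [("11364", "34"), ("34", "11364"), ("34", "34"), ("34", "34")]
print(unique_pairs(pairs))
```

Which of the two mirrored copies survives does not matter for an undirected graph, as long as the rule is applied consistently.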
Since you are working with gene sequence IDs, I guess your aim is to build an undirected relational graph, and for this you just need one edge between two related genes. But I’m just guessing at what you eventually want to implement.
Thanks for your kind comments and for having validated the answer!
Thank you. The only challenge is that my IDs are not numbers, so I can’t do the Rule-based Row Filter on =>. Any ideas?
That really is a beautiful solution.
They are not numbers here either: they are strings, and a " > " comparison works fine with strings.
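To illustrate why string IDs are not a problem, here is a small Python sketch of a lexicographic comparison. The function name `keep_row` and the gene-style IDs are only illustrative, and Python compares strings by character code, which may differ in detail from how KNIME orders strings.

```python
def keep_row(first, second):
    """Keep the row when the first ID is not alphabetically smaller than the second."""
    return first >= second

# Strings are compared character by character, not by numeric value
print(keep_row("34", "11364"))   # "3" sorts after "1", so this row is kept
print(keep_row("11364", "34"))   # the mirrored row is dropped

# The same rule works for IDs that are not numeric at all (illustrative IDs)
print(keep_row("TP53", "BRCA1"))
print(keep_row("BRCA1", "TP53"))
```

For any two different IDs exactly one of the two mirrored rows passes, which is all the filter needs.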