Building matrix using file information

I have a file with two columns. The first one contains the objects, while the second one contains one or multiple pieces of information (comma-separated).
I need to generate a matrix in which the first column contains the objects and the first row contains all the possible information values. For each object, if it matches the information in the input file, the corresponding score will be 1; otherwise the cell will be empty.

An example:

Input:
input.txt (170 Bytes)

Substarte_name Enzymes
Diazepam GSH
CHEMBL407009 ADH5, AKR1B10, AKR1C1, ALDH1A3, CBR1, CBR3, HSD11B1, NQO1
Valsartan GSH
Donezepil GSH
brenda_212032_CID_667436 UGT

expected output_matrix:
out_matrix.txt (196 Bytes)

                          GSH  ADH5  AKR1B10  AKR1C1  ALDH1A3  CBR1  CBR3  HSD11B1  NQO1  UGT
Diazepam                  1
CHEMBL407009                   1     1        1       1        1     1     1        1
Valsartan                 1
Donezepil                 1
brenda_212032_CID_667436                                                                  1

Could someone suggest a flow to do that?
Thanks in advance!

Hello @tommasopalomba,

A Cell Splitter node on your Enzymes column, with comma as the delimiter, followed by a One to Many node should do the trick. Give it a try!
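
For readers who want to see the same logic outside KNIME, here is a minimal Python sketch of the split-then-encode idea. It is only an analogue and not what the nodes do internally. The `rows` list is hard-coded from the example input in the question, and the helper names `split_cell` and `build_matrix` are made up for illustration.

```python
# Sample rows copied from the example input in the question.
rows = [
    ("Diazepam", "GSH"),
    ("CHEMBL407009", "ADH5, AKR1B10, AKR1C1, ALDH1A3, CBR1, CBR3, HSD11B1, NQO1"),
    ("Valsartan", "GSH"),
    ("Donezepil", "GSH"),
    ("brenda_212032_CID_667436", "UGT"),
]


def split_cell(cell):
    """Split a comma-separated cell and trim the spaces around each value."""
    return [part.strip() for part in cell.split(",") if part.strip()]


def build_matrix(rows):
    """Return (columns, matrix): one matrix row per object, "1" or "" per enzyme."""
    # Collect the distinct enzymes in order of first appearance.
    columns = []
    for _, cell in rows:
        for enzyme in split_cell(cell):
            if enzyme not in columns:
                columns.append(enzyme)

    # Mark "1" where the object lists the enzyme, otherwise leave the cell empty.
    matrix = []
    for name, cell in rows:
        present = set(split_cell(cell))
        matrix.append([name] + ["1" if c in present else "" for c in columns])
    return columns, matrix


columns, matrix = build_matrix(rows)

# Print as tab-separated text; the first header cell is left empty.
print("\t".join([""] + columns))
for line in matrix:
    print("\t".join(line))
```

Trimming each value after the split matters here because the example separates enzymes with a comma plus a space.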

Br,
Ivan

Hi @tommasopalomba

@ipazin's route is a good one. Another route is to unpivot and pivot your data. See
buiding_matrix.knwf (39.4 KB)
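
As a rough illustration of the unpivot and pivot idea in plain Python (an analogue under my own assumptions, not the contents of the attached workflow): first expand the data to one (object, enzyme) pair per record, then pivot those pairs back into a wide table. The `rows` list is taken from the example input, and the function names `unpivot` and `pivot` are hypothetical.

```python
from collections import defaultdict

# Sample rows copied from the example input in the question.
rows = [
    ("Diazepam", "GSH"),
    ("CHEMBL407009", "ADH5, AKR1B10, AKR1C1, ALDH1A3, CBR1, CBR3, HSD11B1, NQO1"),
    ("Valsartan", "GSH"),
    ("Donezepil", "GSH"),
    ("brenda_212032_CID_667436", "UGT"),
]


def unpivot(rows):
    """Long format: one (object, enzyme) pair per record."""
    return [
        (name, enzyme.strip())
        for name, cell in rows
        for enzyme in cell.split(",")
        if enzyme.strip()
    ]


def pivot(pairs):
    """Wide format: {object: {enzyme: 1}} plus the enzyme column order."""
    table = defaultdict(dict)
    enzymes = []
    for name, enzyme in pairs:
        table[name][enzyme] = 1
        if enzyme not in enzymes:
            enzymes.append(enzyme)
    return enzymes, dict(table)


pairs = unpivot(rows)
enzymes, table = pivot(pairs)

# Missing (object, enzyme) combinations are shown as empty cells.
for name, hits in table.items():
    print(name, [hits.get(e, "") for e in enzymes])
```

The long intermediate format is handy if you later want to count or filter the pairs before building the matrix.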
gr. Hans

Tnx @HansS!
You don’t need a Column Filter after the One to Many node, as it has an option to remove the included columns from the output. The Cell Splitter has the same option. Sorry, but I just can’t help it, as it annoys me to see Column Filter being the most used node in KNIME :sweat_smile:
Ivan

Oops, sorry @ipazin for that :pray: You are right (of course).
In that case your solution is the way to go.
KNIME has far too many choices and possibilities to come up with a solution.
gr. Hans
