Hi,
I am trying to adapt the Python code below to the Python Script node in KNIME, but I have run out of ideas since I am not a programmer. Any help will be appreciated:
Code:
import pandas as pd
import numpy as np
import ppscore as pps
Here I am trying to put the whole input table*
df = pd.DataFrame(*)
df.head()
mat = pps.matrix(df)
mat
I want to send the results in “mat” to the output table.
Thanks for the rapid feedback. Actually, there is more than one question there. First, I am not sure how to import the content of the whole table. Normally, on the left side of the Python Script node, you see the columns that are in the table and you can reference the individual columns. However, in this case, I need to reference the whole table and send it to pandas. In the previous post, I added an “asterisk” to indicate where I “think” I need to reference the whole table: df = pd.DataFrame(*). I don’t know what to put in place of this asterisk.
Once I manage to import the whole table into pandas, I will definitely try the line of code you gave me to output the results stored in mat.
Let me know if you understand my question; as mentioned, my programming knowledge is limited.
This sounds more like a Python question. Python has several different data structures, often NumPy arrays, dictionaries and the like. KNIME needs pandas DataFrames (https://pandastutor.com/) or, in later versions, Arrow tables to transfer data directly.
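To illustrate the point about data structures, here is a small sketch of turning a NumPy array or a dictionary into a pandas DataFrame. The column names gene_a and gene_b and the numbers are made up for illustration only.

```python
import numpy as np
import pandas as pd

# A NumPy array has no column names, so they are supplied explicitly
arr = np.array([[1.0, 2.0], [3.0, 4.0]])
df_from_array = pd.DataFrame(arr, columns=["gene_a", "gene_b"])

# A dictionary of equal-length lists maps directly to columns
records = {"gene_a": [1.0, 3.0], "gene_b": [2.0, 4.0]}
df_from_dict = pd.DataFrame(records)

# Both routes give the same table
same = df_from_array.equals(df_from_dict)
```

Either DataFrame could then be handed back to KNIME as an output table.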
You might want to inform us what the status of your workflow is and what you have been trying to do.
Wooaaaoo CONGRATULATIONS the simple example that you made is frankly super cool.
Regarding the PPS score (Predictive Power Score Implementation in Python | by @lee-rowe | Geek Culture | Medium), your presumption is correct. I have a series of genes (columns) and several conditions (rows), and I want to identify non-linear correlations between those genes using PPS, in contrast to linear correlations such as Pearson's or others that I know are available as KNIME nodes.
I was also following the same article that you referenced here. Although I guess I could run the script in a Jupyter notebook, I would like to do it directly in KNIME, since I am doing everything else in KNIME. However, I always get stuck on how to import the data into the Python nodes and how to output the results in a table format so that I can continue working with them in KNIME.
I attached a file, perhaps you could see it more simply in that way.
@VAGR_ISK I will take a look later. You might want to familiarize yourself with the workings of KNIME and Python first. Besides the official guide (KNIME Python Integration Guide), the new Space might be helpful, since it shows small examples of how these things work.
If I understand you correctly, you want to import the entire KNIME table into the pandas script. This is straightforward, as shown in the image below. The input variables in this case show three tables, identified by the indices 0, 1 and 2. The table with index [0] has two columns, A and B.
To import the entire KNIME table on the first input port into the Python DataFrame df you would use the following to import all of the columns and data:
import knime_io as knio
df = knio.input_tables[0].to_pandas()
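Putting reading and writing together, a minimal sketch of a full script body might look like the following. It assumes the knime_io API shown above and that knio.Table.from_pandas is available in your KNIME version (please check the Python Integration Guide for your release). So that the sketch also runs outside KNIME, it falls back to a small made-up table, and it uses a plain pandas correlation purely as a stand-in for the computation; inside the node you would swap in pps.matrix(df).

```python
import pandas as pd

try:
    import knime_io as knio  # only importable inside the KNIME Python Script node
except ImportError:
    knio = None  # running outside KNIME: use the sample table below

if knio is not None:
    # Whole table from the first input port as a pandas DataFrame
    df = knio.input_tables[0].to_pandas()
else:
    # Made-up stand-in data with two numeric columns A and B
    df = pd.DataFrame({"A": [1.0, 2.0, 3.0, 4.0], "B": [2.0, 4.0, 6.0, 8.0]})

# Stand-in computation (Pearson correlation on the numeric columns);
# replace this line with pps.matrix(df) when ppscore is installed
result = df.select_dtypes("number").corr()

if knio is not None:
    # Assumption: Table.from_pandas exists in this knime_io version
    knio.output_tables[0] = knio.Table.from_pandas(result)
```

The important part is the first and last step: the whole input table arrives as one DataFrame, and whatever DataFrame you assign to the first output port becomes the node's output table.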
@VAGR_ISK the results are stored in a dictionary that you will have to convert in order to use it in KNIME. For example, you will have to convert all columns to string, since KNIME does not allow a mixture of types in one column. You can convert them back to double later if you need to.
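import pandas as pd
import numpy as np
import ppscore as pps
# Score how well column "x" predicts column "y"; the result is a dictionary
mat = pps.score(input_table_1, "x", "y")
# One row with one column per dictionary key, all values converted to strings
output_table_1 = pd.DataFrame.from_dict(mat, orient='index', columns=['values']).transpose().applymap(str)
# One row per dictionary key in a single "values" column, all values converted to strings
output_table_2 = pd.DataFrame.from_dict(mat, orient='index', columns=['values']).applymap(str)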
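In case applymap is not available or warns in your pandas version (as far as I know it is deprecated in recent releases), the same two layouts can be sketched with astype(str). The dictionary below is a made-up stand-in, not real ppscore output; its keys and values are only assumptions so that the sketch runs without ppscore installed.

```python
import pandas as pd

# Hypothetical stand-in for the dictionary returned by pps.score(...)
mat = {"x": "x", "y": "y", "ppscore": 0.5, "is_valid_score": True}

# Wide layout: one row, one column per dictionary key, everything as string
wide = pd.DataFrame([mat]).astype(str)

# Long layout: one row per dictionary key, single column "values"
long = pd.Series(mat, name="values").astype(str).to_frame()

# Later, a string value can be turned back into a number
score = float(wide.loc[0, "ppscore"])
```

In KNIME itself the conversion back to double could also be done after the Python node, for example with a String to Number node.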