Python Script adaptation

VAGR_ISK · November 21, 2022, 7:27pm

Hi,
I am trying to adapt the Python code below to the Python Script node in KNIME, but I have run out of ideas since I am not a programmer. Any help will be appreciated:

Code:

import pandas as pd
import numpy as np
import ppscore as pps

Here I am trying to put the whole input table*

df = pd.DataFrame(*)
df.head()
mat = pps.matrix (df)
mat

I want to send the results in “mat” to the output table.

Thanks in advance,

AG.

gonhaddock · November 21, 2022, 7:34pm

Hello @VAGR_ISK
Have you already tried this?

output_table_1 = pd.DataFrame(mat.copy())

BR

mlauber71 · November 21, 2022, 10:15pm

Your mat item is a dictionary. You will have to convert it to a pandas data frame before bringing it back to KNIME.

output_table_1 = pd.DataFrame.from_dict(mat, orient='index')
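For illustration, a minimal sketch of what orient='index' does to a flat dictionary. The dictionary below is made up for the example (the key names and values are assumptions, not real ppscore output), and the column name "value" is arbitrary.

```python
import pandas as pd

# Hypothetical result dictionary; keys and values are invented for illustration.
result = {"x": "gene_a", "y": "gene_b", "score": 0.42}

# orient="index" turns every dictionary key into a row label,
# so this table has one row per key and a single column.
long_table = pd.DataFrame.from_dict(result, orient="index", columns=["value"])

# Transposing gives one row with one column per key instead.
wide_table = long_table.transpose()

print(long_table)
print(wide_table)
```

Either shape is a regular pandas DataFrame, so either one can be assigned to the node's output table.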

VAGR_ISK · November 22, 2022, 6:28am

Hello,

Thanks for the rapid feedback. Actually, there is more than one question here. First, I am not sure how to import the content of the whole table. Normally, on the left side of the Python Script node, you see the columns that are in the table and you can reference the individual columns. In this case, however, I need to reference the whole table and send it to pandas. In the previous post, I added an asterisk to indicate where I think I need to reference the whole table: df = pd.DataFrame(*). I don't know what to put in place of this asterisk.

Once I manage to import the whole table into pandas, I will definitely try the line of code you gave me to output the results stored in the mat item.

Let me know if you understand my question; as mentioned, my programming knowledge is limited.

Thanks,

AG

mlauber71 · November 22, 2022, 7:04am

This sounds more like a Python question. In Python you have different data structures, often numpy arrays, dictionaries and the like. KNIME needs pandas data frames (https://pandastutor.com/) or, in later versions, Arrow tables to transfer data directly.

You might want to inform us what the status of your workflow is and what you have been trying to do.

Here is an example you could adapt.

About KNIME and python in general you could start with the bundled version (https://www.knime.com/whats-new-in-knime-46#bundled-python) and see this example. More examples are in the hub.

VAGR_ISK · November 22, 2022, 9:17am

Wooaaaoo CONGRATULATIONS the simple example that you made is frankly super cool.

Regarding the PPS score (Predictive Power Score Implementation in Python | by @lee-rowe | Geek Culture | Medium), your presumption is correct. I have a series of genes (columns) and several conditions (rows), and I want to identify non-linear correlations between those genes using PPS, in contrast to linear correlations such as Pearson's or others that I know are available as KNIME nodes.

I was also following the same article that you referenced here. Although I guess I could run the script in a Jupyter notebook, I would like to do it directly in KNIME, since I am doing everything else in KNIME. However, I always get stuck on how to import the data into the Python nodes and how to output the results in a table format so I can continue working with them in KNIME.

I attached a file, perhaps you could see it more simply in that way.

Example.knwf (18.3 KB)

Cheers and thanks in advance,

AG.

mlauber71 · November 22, 2022, 9:29am

@VAGR_ISK I will take a look later. You might want to familiarize yourself with the workings of KNIME and Python first. Besides the official guide (KNIME Python Integration Guide), the new Space might be helpful, showing you small examples of how these things work.

For example, how you get data into (Input Data Tables with Python Script (Labs) – KNIME Hub) and out of (Output Data Tables with Python Script (Labs) – KNIME Hub) Python nodes.

DiaAzul · November 22, 2022, 11:15am

@VAGR_ISK

If I understand you correctly, you want to import the entire KNIME table into the Pandas script. This is straightforward, as shown in the image below. The input variables in this case show three tables, identified by the indices 0, 1 and 2. The table with the subscript [0] has two columns, A and B.

Screenshot_20221122_110916

To import the entire KNIME table on the first input port into the Python DataFrame df you would use the following to import all of the columns and data:

import knime_io as knio
df = knio.input_tables[0].to_pandas()
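A rough sketch of the full round trip, reading the whole table, computing something on it, and writing the result back, might look like the following. It assumes the Python Script (Labs) node and that knio.write_table is the matching call for the output port in that version of knime_io; newer KNIME versions use a different module name, so check the integration guide for your release. The correlation step is only a stand-in for whatever calculation you need, and the fallback table exists only so the snippet also runs outside KNIME.

```python
import pandas as pd

try:
    # Only importable inside KNIME's Python Script (Labs) node.
    import knime_io as knio
    df = knio.input_tables[0].to_pandas()  # whole table from the first input port
except ImportError:
    # Outside KNIME: small made-up table so the sketch can be tried standalone.
    knio = None
    df = pd.DataFrame({"A": [1, 2, 3], "B": [2.0, 4.0, 6.5]})

# Placeholder calculation on the whole table: pairwise linear correlation
# of the numeric columns. reset_index() keeps the row labels as a column.
result = df.select_dtypes(include="number").corr().reset_index()

if knio is not None:
    # Assumed output call for this knime_io version; verify against the guide.
    knio.output_tables[0] = knio.write_table(result)
```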

DiaAzul

mlauber71 · November 22, 2022, 5:02pm

@VAGR_ISK the results are stored in a dictionary that you will have to convert in order to use it in KNIME. For example, you will have to convert all columns to string, since KNIME does not allow a mixture of types in one column. You could later convert that back to double if you need to.

import pandas as pd
import numpy as np
import ppscore as pps

mat = pps.score(input_table_1, "x", "y")

output_table_1 = pd.DataFrame.from_dict(mat, orient='index', columns=['values']).transpose().applymap(str)

output_table_2 = pd.DataFrame.from_dict(mat, orient='index', columns=['values']).applymap(str)
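As a small sketch of the "convert to string, then back to double" idea, here is a pure pandas version. The dictionary is a made-up stand-in for a mixed-type result (its keys and values are assumptions), and pd.to_numeric is just one way to do the conversion back; inside KNIME a String to Number node would serve the same purpose.

```python
import pandas as pd

# Hypothetical mixed-type result: text and numbers in the same dictionary.
result = {"x": "gene_a", "y": "gene_b", "ppscore": 0.42, "baseline_score": 1.5}

table = pd.DataFrame.from_dict(result, orient="index", columns=["values"])

# One column with mixed types: store everything as text for the transfer.
as_text = table.astype(str)

# Later, recover the numbers. errors="coerce" turns entries that are not
# numeric (here the gene names) into NaN instead of raising an error.
as_number = pd.to_numeric(as_text["values"], errors="coerce")

print(as_text)
print(as_number)
```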

kn_forum_49209_python_ppscore_pps.knwf (47.1 KB)
