Python Script unsupported DataType issue.

Hi KNIME Support

I am not getting any results from my Python Script node because the output column has an unsupported data type. The same code runs successfully in a Jupyter notebook.

I need to put all the values of a dict into a single column of a DataFrame. When I print the DataFrame inside KNIME's Python Script node, I get the same result as in the Jupyter notebook, but when I execute the node, I get the following error. The KNIME Python Script I used is shown below.

I’d be grateful for an answer.

Hi @JaeHwanChoi,

What format exactly would you expect to be stored inside the KNIME table if there’s a column containing a dict?

If you expect a JSON value, then there’s a simple solution: Upgrade to KNIME 5.1 :wink: We have added support for JSON (= Python dict) and XML (= Python’s xml.etree.ElementTree) cells in Python in the last KNIME release.

Best,
Carsten


Hello @JaeHwanChoi

Here is a solution for KNIME 4.7. Alternatively, you can take a look at this post:

Python Script node:

import json

import knime.scripting.io as knio
import pandas as pd

# Nested dict that should end up in a single table cell
dict_df = {
    'model_1': {'indicator_1': 0.65, 'indicator_2': 0.63, 'indicator_3': 0.88},
    'model_2': {'indicator_1': 0.83, 'indicator_2': 0.76, 'indicator_3': 0.93},
}

# Serialize the dict to a JSON string, which KNIME can store as a String cell
json_string = json.dumps(dict_df)

# One row, one column holding the JSON string
df = pd.DataFrame([{"dict_detail": json_string}])

knio.output_tables[0] = knio.Table.from_pandas(df)
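
As a quick sanity check outside KNIME, the same serialization can be tried with only the standard library. This is a minimal sketch: the names `rows` and `restored` are illustrative and not part of the script above, and the one-row-per-model layout is an assumption about how you might want the table shaped.

```python
import json

dict_df = {
    "model_1": {"indicator_1": 0.65, "indicator_2": 0.63, "indicator_3": 0.88},
    "model_2": {"indicator_1": 0.83, "indicator_2": 0.76, "indicator_3": 0.93},
}

# One JSON string for the whole dict, as in the script above
json_string = json.dumps(dict_df)

# Alternative layout (assumption): one row per model, each with its own JSON string
rows = [
    {"model": name, "dict_detail": json.dumps(indicators)}
    for name, indicators in dict_df.items()
]

# Parsing the string back confirms that no information was lost
restored = json.loads(json_string)
```

The `rows` list can be passed to `pd.DataFrame(rows)` in the same way as the single-row list in the script.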

Adapted from @DiaAzul's solution…

BR


Hi @gonhaddock

Is it possible to code this same situation in PySpark?

Any answers would be appreciated.

Hi @JaeHwanChoi
I’m not a PySpark user, so I don’t think I can help you with this request.
Let’s see if other KNIMErs in the forum can help.

BR

I assume your data volume is too big to just convert it to a pandas DataFrame and then back to PySpark?
br

@gonhaddock

Since the data is large, I don’t think I can convert it to pandas, so I need to put it into one column in JSON form within the PySpark code.

Thanks