Help with Python Script to import file

Quick request for advice / help. I am new to KNIME but have used Alteryx a lot previously.

I am trying to use Python to read a CSV file that has several columns that might be JSON dumps. The Python code loads those columns as Python objects.

Python Code works correctly:

import pandas as pd
import json

# Read the CSV file
df = pd.read_csv('/Users/howetimms/knime-workspace/PII/HI/LMX.csv')

# Load JSON columns as Python objects
json_columns = ['Main API', 'Data Block One', 'Data Block Two', 'Data Block Three', 'Data Block Four', 'Data Block Five']
for column in json_columns:
    df[column] = df[column].apply(lambda x: json.loads(x) if isinstance(x, str) else x)

print(df)

The code runs in Python as expected.
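For anyone who wants to reproduce this without the real file, here is a minimal sketch on dummy data. The CSV text, the `id` column, and the JSON values are made up for illustration; only the `Main API` column name and the `json.loads` guard come from the script above.

```python
import io
import json

import pandas as pd

# Made-up stand-in for the real CSV: one JSON column, one empty cell.
csv_text = (
    'id,Main API\n'
    '1,"{""status"": ""ok"", ""items"": [1, 2]}"\n'
    '2,\n'
)

df = pd.read_csv(io.StringIO(csv_text))

# Same guard as in the script above: only strings are parsed, so the
# empty cell (read by pandas as a missing value) is passed through as-is.
df['Main API'] = df['Main API'].apply(
    lambda x: json.loads(x) if isinstance(x, str) else x
)

print(type(df.loc[0, 'Main API']))
print(df.loc[0, 'Main API']['items'])
print(pd.isna(df.loc[1, 'Main API']))
```

The first row ends up holding a Python dict rather than a string, and the empty cell stays a missing value instead of raising an error in `json.loads`.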

I thought I might be able to use the Python Script node to run this step within KNIME.

Running the code within the Python Script node, it appears to run… but the result is ArrowTable[shape=(7,4)]
And I (obviously) get no table output from the node

Can someone please help or direct me
Appreciated

** Edited to show the output from running the code in VSC as python **

Hi @HoweTimms,

If you print(df), does that give you your desired output? The method knio.Table.from_pandas returns an ArrowTable, which in line 14 is used for the output knio.output_tables[0]. That is correct and behaves as it should.
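As a rough sketch of how the script could look inside the node, assuming the node has no input port and one output table port: `parse_json_columns` is a hypothetical helper name, the column list is shortened, and the `try`/`except` around the import is only there so the snippet can also be run outside KNIME. Whether KNIME displays the parsed objects the way you want may depend on your KNIME version, so treat this as a starting point.

```python
import json

import pandas as pd

try:
    # This module is only available inside KNIME's Python Script node.
    import knime.scripting.io as knio
except ImportError:
    knio = None


def parse_json_columns(df, columns):
    """Hypothetical helper: return a copy with JSON strings in `columns` parsed."""
    out = df.copy()
    for column in columns:
        out[column] = out[column].apply(
            lambda x: json.loads(x) if isinstance(x, str) else x
        )
    return out


if knio is not None:
    df = pd.read_csv('/Users/howetimms/knime-workspace/PII/HI/LMX.csv')
    df = parse_json_columns(df, ['Main API', 'Data Block One'])
    # from_pandas returns an ArrowTable; assigning it to the first output
    # port is what makes the data show up in the node's output table.
    knio.output_tables[0] = knio.Table.from_pandas(df)
```

Seeing `ArrowTable[shape=(7,4)]` printed is therefore expected: it is the object handed to the output port, not an error.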

I do not see why you should not get a table output from the node. If you execute the node, right-click and show the output Table (see screenshot), what does it show?

Best regards
Steffen

Thank you for the super fast response

Yes. If I run print(df) in Python outside of KNIME, I get exactly the response I want. It is exactly the table I would want as the output of the KNIME Python node.

When I execute the KNIME python node I get the following (No result)

No apparent errors in the console
(see a clip of the workflow below)

You will also see that I used a Variable Creator node so that I could run the Python Script node; without it I couldn’t get the node to execute.

Please let me know if there is any more I can try (I can’t easily share a copy of the CSV as the JSON files are large and contain sensitive data… but potentially I could create a few dummy records)

Hi @HoweTimms,

ah, your node is not yet configured. For the overall understanding, have a look here and scroll down to “what is a node status?”

In this specific scenario, your Python Script node expects an input table (via the small triangle at the left of the node; this small triangle is known as the “input port”). Because it gets no input table, the whole node is “not configured” (indicated by the traffic light being red). As you do not have an input table, but read the data from within the Python Script node, you need to remove the input port. Do so by clicking on the three dots and navigating through to remove the input table port. The traffic light should then switch to yellow. Now you can right-click and execute the node. It should then turn green, and you should have your data in the output table.

Does that work?

Best regards
Steffen

It absolutely does! Thank you

The issue I had was that the node was configured with an input port.
Once I right-clicked and removed that input port, I was able to execute the node.

And the result is exactly as per the Python script (redacted, but the JSON is unpacked as I had hoped).

I appreciate the help. Small ‘gotchas’ like this can take a while to learn.
