Python Script in KNIME doesn't work

I have built a script in Google Colab to retrieve additional information for image URLs via the Vision AI API. Unfortunately, I can't get the script to run in KNIME, and the AI assistant hasn't been able to fix the problem yet.

I always get the error message:
KnimeUserError: Output table ‘0’ must be of type knime.api.Table or knime.api.BatchOutputTable, but got None. knio.output_tables[0] has not been populated.

Does anyone have an idea how I can solve this? I'm happy to share the code if that helps.

Can you share the code you are trying to run?

It looks like you are not populating the output table port correctly, but more information is needed to help you out.
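For reference, this is roughly what the node expects by the end of the script - a minimal sketch with placeholder data, just to illustrate the API:

import knime.scripting.io as knio
import pandas as pd

# Minimal sketch: whatever else the script does, knio.output_tables[0]
# must end up holding a knio.Table before the node finishes.
df = pd.DataFrame({"example": [1, 2, 3]})  # placeholder data
knio.output_tables[0] = knio.Table.from_pandas(df)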

Regards,

Martin


Hi, of course, this is the code:

import knime.scripting.io as knio
import pandas as pd
import requests

# Set your API key here
API_KEY = 'KEY'
VISION_API_URL = f'https://vision.googleapis.com/v1/images:annotate?key={API_KEY}'

# Analyze image with Google Vision API
def analyze_image(image_url):
    request_payload = {
        "requests": [
            {
                "image": {
                    "source": {"imageUri": image_url}
                },
                "features": [
                    {"type": "LABEL_DETECTION"},
                    {"type": "OBJECT_LOCALIZATION"},
                    {"type": "FACE_DETECTION"},
                    {"type": "LANDMARK_DETECTION"},
                    {"type": "SAFE_SEARCH_DETECTION"},
                    {"type": "IMAGE_PROPERTIES"}
                ]
            }
        ]
    }

    response = requests.post(VISION_API_URL, json=request_payload)
    if response.status_code == 200:
        return response.json()
    else:
        print(f"Error: {response.status_code}, {response.text}")
        return None

def process_images(image_urls):
    results = []
    for url in image_urls:
        print(f"Processing {url}...")
        response = analyze_image(url)
        if response and 'responses' in response:
            analysis = response['responses'][0]
            result = {
                'ImageURL': url,
                'Labels': ", ".join(label['description'] for label in analysis.get('labelAnnotations', [])),
                'Objects': ", ".join(obj['name'] for obj in analysis.get('localizedObjectAnnotations', [])),
                'Faces': len(analysis.get('faceAnnotations', [])),
                'Landmarks': ", ".join(landmark['description'] for landmark in analysis.get('landmarkAnnotations', [])),
                'SafeSearch_Adult': analysis.get('safeSearchAnnotation', {}).get('adult', 'UNKNOWN'),
                'SafeSearch_Spoof': analysis.get('safeSearchAnnotation', {}).get('spoof', 'UNKNOWN'),
                'SafeSearch_Medical': analysis.get('safeSearchAnnotation', {}).get('medical', 'UNKNOWN'),
                'SafeSearch_Violence': analysis.get('safeSearchAnnotation', {}).get('violence', 'UNKNOWN'),
                'SafeSearch_Racy': analysis.get('safeSearchAnnotation', {}).get('racy', 'UNKNOWN'),
                'ImageProperties': ", ".join(
                    f"RGB({color['color']['red']},{color['color']['green']},{color['color']['blue']})"
                    for color in analysis.get('imagePropertiesAnnotation', {}).get('dominantColors', {}).get('colors', [])
                )
            }
            results.append(result)
        else:
            print(f"No response for {url}")
            results.append({
                'ImageURL': url,
                'Labels': '',
                'Objects': '',
                'Faces': 0,
                'Landmarks': '',
                'SafeSearch_Adult': 'ERROR',
                'SafeSearch_Spoof': 'ERROR',
                'SafeSearch_Medical': 'ERROR',
                'SafeSearch_Violence': 'ERROR',
                'SafeSearch_Racy': 'ERROR',
                'ImageProperties': ''
            })
    return results

# Main function for KNIME
def main():
    # Read input from KNIME
    df = knio.input_tables[0].to_pandas()

    if 'ImageURL' not in df.columns:
        raise KeyError("The column 'ImageURL' is missing in the input table. Available columns are: " + ", ".join(df.columns))

    # Clean and extract URLs
    image_urls = df['ImageURL'].dropna().tolist()
    image_urls = [url for url in image_urls if isinstance(url, str) and url.strip() != ""]

    # Process images
    results = process_images(image_urls)

    # Convert results to DataFrame
    results_df = pd.DataFrame(results)

    # Ensure the output table is populated even if no results are available
    if results_df.empty:
        results_df = pd.DataFrame(columns=[
            'ImageURL', 'Labels', 'Objects', 'Faces', 'Landmarks',
            'SafeSearch_Adult', 'SafeSearch_Spoof', 'SafeSearch_Medical',
            'SafeSearch_Violence', 'SafeSearch_Racy', 'ImageProperties'
        ])

    # Output results to KNIME
    knio.output_tables[0] = knio.Table.from_pandas(results_df)

# Execute the main function
if __name__ == "__main__":
    main()

Can you try to remove this:

if __name__ == "__main__":
    main()

and just do:

main()

I tried your setup with a minimal example and the if __name__ == … guard doesn't seem to fly in the KNIME Python Script node.
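A quick way to confirm this (my assumption, I haven't dug into how the node executes the script): print __name__ to the node's console - if it isn't "__main__", the guarded call to main() never runs.

# Quick check: if this does not print '__main__', an
# `if __name__ == "__main__":` guard will never call main().
print(f"__name__ is {__name__!r}")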

Thank you! I just replaced it with main(), but unfortunately the error message remains the same.

Can you try this:

I obviously don't have any test data or an API key, so I just pass the input table through with this one… for me this runs:

import knime.scripting.io as knio
import pandas as pd
import requests

# Set your API key here
API_KEY = 'KEY'
VISION_API_URL = f'https://vision.googleapis.com/v1/images:annotate?key={API_KEY}'

# Analyze image with Google Vision API
def analyze_image(image_url):
    request_payload = {
        "requests": [
            {
                "image": {
                    "source": {"imageUri": image_url}
                },
                "features": [
                    {"type": "LABEL_DETECTION"},
                    {"type": "OBJECT_LOCALIZATION"},
                    {"type": "FACE_DETECTION"},
                    {"type": "LANDMARK_DETECTION"},
                    {"type": "SAFE_SEARCH_DETECTION"},
                    {"type": "IMAGE_PROPERTIES"}
                ]
            }
        ]
    }

    response = requests.post(VISION_API_URL, json=request_payload)
    if response.status_code == 200:
        return response.json()
    else:
        print(f"Error: {response.status_code}, {response.text}")
        return None

def process_images(image_urls):
    results = []
    for url in image_urls:
        print(f"Processing {url}...")
        response = analyze_image(url)
        if response and 'responses' in response:
            analysis = response['responses'][0]
            result = {
                'ImageURL': url,
                'Labels': ", ".join(label['description'] for label in analysis.get('labelAnnotations', [])),
                'Objects': ", ".join(obj['name'] for obj in analysis.get('localizedObjectAnnotations', [])),
                'Faces': len(analysis.get('faceAnnotations', [])),
                'Landmarks': ", ".join(landmark['description'] for landmark in analysis.get('landmarkAnnotations', [])),
                'SafeSearch_Adult': analysis.get('safeSearchAnnotation', {}).get('adult', 'UNKNOWN'),
                'SafeSearch_Spoof': analysis.get('safeSearchAnnotation', {}).get('spoof', 'UNKNOWN'),
                'SafeSearch_Medical': analysis.get('safeSearchAnnotation', {}).get('medical', 'UNKNOWN'),
                'SafeSearch_Violence': analysis.get('safeSearchAnnotation', {}).get('violence', 'UNKNOWN'),
                'SafeSearch_Racy': analysis.get('safeSearchAnnotation', {}).get('racy', 'UNKNOWN'),
                'ImageProperties': ", ".join(
                    f"RGB({color['color']['red']},{color['color']['green']},{color['color']['blue']})"
                    for color in analysis.get('imagePropertiesAnnotation', {}).get('dominantColors', {}).get('colors', [])
                )
            }
            results.append(result)
        else:
            print(f"No response for {url}")
            results.append({
                'ImageURL': url,
                'Labels': '',
                'Objects': '',
                'Faces': 0,
                'Landmarks': '',
                'SafeSearch_Adult': 'ERROR',
                'SafeSearch_Spoof': 'ERROR',
                'SafeSearch_Medical': 'ERROR',
                'SafeSearch_Violence': 'ERROR',
                'SafeSearch_Racy': 'ERROR',
                'ImageProperties': ''
            })
    return results

# Main function for KNIME
def main():
    # Read input from KNIME
    df = knio.input_tables[0].to_pandas()
    df['ImageURL'] = 1  # dummy values so the script runs without real URLs
    if 'ImageURL' not in df.columns:
        raise KeyError("The column 'ImageURL' is missing in the input table. Available columns are: " + ", ".join(df.columns))

    # Clean and extract URLs
    #image_urls = df['ImageURL'].dropna().tolist()
    #image_urls = [url for url in image_urls if isinstance(url, str) and url.strip() != ""]

    # Process images
    #results = process_images(image_urls)

    # Convert results to DataFrame
    results_df = df.copy()

    # Ensure the output table is populated even if no results are available
    if results_df.empty:
        results_df = pd.DataFrame(columns=[
            'ImageURL', 'Labels', 'Objects', 'Faces', 'Landmarks',
            'SafeSearch_Adult', 'SafeSearch_Spoof', 'SafeSearch_Medical',
            'SafeSearch_Violence', 'SafeSearch_Racy', 'ImageProperties'
        ])

    # Output results to KNIME
    knio.output_tables[0] = knio.Table.from_pandas(results_df)

# Execute the main function

main()

@Juliane what I have seen is that in more complicated Python code the KNIME node does not seem to recognize that an output table will later be filled.

What I have sometimes done is to initialize a dummy output table at the start that (hopefully) will later be filled with the real data. Depending on your task you might have to add some error handling in case the result is not OK (see the sketch after the snippet below).

So maybe before the main code you put:

data = {
  "column1": [1, 2, 3],
  "column2": [4, 5, 6]
}

# Load the dummy data into a DataFrame:
df = pd.DataFrame(data)

knio.output_tables[0] = knio.Table.from_pandas(df)
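And for the error handling mentioned above, a rough sketch (my assumption about how you might wire it in, assuming main() is still defined as in your script): only let the real results overwrite the dummy table when the processing runs through.

# Sketch: keep the dummy table unless main() completes and
# overwrites knio.output_tables[0] with the real results.
try:
    main()
except Exception as exc:
    print(f"Processing failed, keeping the dummy output table: {exc}")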

Thank you, Martin! The original error message is gone, but now I only get a table with a 1 in every row.

Yeah, that is what I did on my end as I can't run your code…

You can change it back where I made comments in all UPPERCASE:

def main():
    # Read input from KNIME
    df = knio.input_tables[0].to_pandas()
    df['ImageURL'] = 1 # REMOVE THIS!!!
    if 'ImageURL' not in df.columns:
        raise KeyError("The column 'ImageURL' is missing in the input table. Available columns are: " + ", ".join(df.columns))

    # Clean and extract URLs
    #image_urls = df['ImageURL'].dropna().tolist() ##UNCOMMENT
    #image_urls = [url for url in image_urls if isinstance(url, str) and url.strip() != ""] ## UNCOMMENT

    # Process images
    #results = process_images(image_urls) ## UNCOMMENT

    # Convert results to DataFrame
    results_df = df.copy() ##CHANGE BACK TO WHAT WAS IN YOUR SCRIPT

Question is: Does it work once you reverse those changes and actually ping your API (also add your API key of course…)

Yes, the script works, but again I have the problem that it only outputs the URLs of the source file, and I don't know whether the API call failed or whether the retrieved content is simply not being output.
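(One hypothetical way to check this, not part of the original script: log the HTTP status per URL and carry it into the result rows, e.g. with a small helper like the one below - analyze_image_debug and its api_url parameter are illustrative names only.)

import requests

# Hypothetical debugging aid: return the HTTP status together with the body,
# so the output table can show whether each Vision API call succeeded.
def analyze_image_debug(image_url, api_url):
    payload = {"requests": [{"image": {"source": {"imageUri": image_url}},
                             "features": [{"type": "LABEL_DETECTION"}]}]}
    response = requests.post(api_url, json=payload)
    return response.status_code, (response.json() if response.ok else response.text)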

Thank you! I have tried it, but then I only get back the dummy input values in the data fields.

OK, sorry about that - based on the code I thought it would not be a problem for you to revert the changes I made so that I could execute it…

Here's your original code with the if __name__ == "__main__" guard removed…

If that doesn't do it, then I'm afraid I'm out of ideas.

import knime.scripting.io as knio
import pandas as pd
import requests

# Set your API key here
API_KEY = 'KEY'
VISION_API_URL = f'https://vision.googleapis.com/v1/images:annotate?key={API_KEY}'

# Analyze image with Google Vision API
def analyze_image(image_url):
    request_payload = {
        "requests": [
            {
                "image": {
                    "source": {"imageUri": image_url}
                },
                "features": [
                    {"type": "LABEL_DETECTION"},
                    {"type": "OBJECT_LOCALIZATION"},
                    {"type": "FACE_DETECTION"},
                    {"type": "LANDMARK_DETECTION"},
                    {"type": "SAFE_SEARCH_DETECTION"},
                    {"type": "IMAGE_PROPERTIES"}
                ]
            }
        ]
    }

    response = requests.post(VISION_API_URL, json=request_payload)
    if response.status_code == 200:
        return response.json()
    else:
        print(f"Error: {response.status_code}, {response.text}")
        return None

def process_images(image_urls):
    results = []
    for url in image_urls:
        print(f"Processing {url}...")
        response = analyze_image(url)
        if response and 'responses' in response:
            analysis = response['responses'][0]
            result = {
                'ImageURL': url,
                'Labels': ", ".join(label['description'] for label in analysis.get('labelAnnotations', [])),
                'Objects': ", ".join(obj['name'] for obj in analysis.get('localizedObjectAnnotations', [])),
                'Faces': len(analysis.get('faceAnnotations', [])),
                'Landmarks': ", ".join(landmark['description'] for landmark in analysis.get('landmarkAnnotations', [])),
                'SafeSearch_Adult': analysis.get('safeSearchAnnotation', {}).get('adult', 'UNKNOWN'),
                'SafeSearch_Spoof': analysis.get('safeSearchAnnotation', {}).get('spoof', 'UNKNOWN'),
                'SafeSearch_Medical': analysis.get('safeSearchAnnotation', {}).get('medical', 'UNKNOWN'),
                'SafeSearch_Violence': analysis.get('safeSearchAnnotation', {}).get('violence', 'UNKNOWN'),
                'SafeSearch_Racy': analysis.get('safeSearchAnnotation', {}).get('racy', 'UNKNOWN'),
                'ImageProperties': ", ".join(
                    f"RGB({color['color']['red']},{color['color']['green']},{color['color']['blue']})"
                    for color in analysis.get('imagePropertiesAnnotation', {}).get('dominantColors', {}).get('colors', [])
                )
            }
            results.append(result)
        else:
            print(f"No response for {url}")
            results.append({
                'ImageURL': url,
                'Labels': '',
                'Objects': '',
                'Faces': 0,
                'Landmarks': '',
                'SafeSearch_Adult': 'ERROR',
                'SafeSearch_Spoof': 'ERROR',
                'SafeSearch_Medical': 'ERROR',
                'SafeSearch_Violence': 'ERROR',
                'SafeSearch_Racy': 'ERROR',
                'ImageProperties': ''
            })
    return results

# Main function for KNIME
def main():
    # Read input from KNIME
    df = knio.input_tables[0].to_pandas()

    if 'ImageURL' not in df.columns:
        raise KeyError("The column 'ImageURL' is missing in the input table. Available columns are: " + ", ".join(df.columns))

    # Clean and extract URLs
    image_urls = df['ImageURL'].dropna().tolist()
    image_urls = [url for url in image_urls if isinstance(url, str) and url.strip() != ""]

    # Process images
    results = process_images(image_urls)

    # Convert results to DataFrame
    results_df = pd.DataFrame(results)

    # Ensure the output table is populated even if no results are available
    if results_df.empty:
        results_df = pd.DataFrame(columns=[
            'ImageURL', 'Labels', 'Objects', 'Faces', 'Landmarks',
            'SafeSearch_Adult', 'SafeSearch_Spoof', 'SafeSearch_Medical',
            'SafeSearch_Violence', 'SafeSearch_Racy', 'ImageProperties'
        ])

    # Output results to KNIME
    knio.output_tables[0] = knio.Table.from_pandas(results_df)

# Execute the main function

main()

Last idea I have is to just remove the main() function and let the script execute the code directly - if it is really about complexity, as @mlauber71 suggests, then not having the return table wrapped inside a function may help…

import knime.scripting.io as knio
import pandas as pd
import requests

# Set your API key here
API_KEY = 'KEY'
VISION_API_URL = f'https://vision.googleapis.com/v1/images:annotate?key={API_KEY}'

# Analyze image with Google Vision API
def analyze_image(image_url):
    request_payload = {
        "requests": [
            {
                "image": {
                    "source": {"imageUri": image_url}
                },
                "features": [
                    {"type": "LABEL_DETECTION"},
                    {"type": "OBJECT_LOCALIZATION"},
                    {"type": "FACE_DETECTION"},
                    {"type": "LANDMARK_DETECTION"},
                    {"type": "SAFE_SEARCH_DETECTION"},
                    {"type": "IMAGE_PROPERTIES"}
                ]
            }
        ]
    }

    response = requests.post(VISION_API_URL, json=request_payload)
    if response.status_code == 200:
        return response.json()
    else:
        print(f"Error: {response.status_code}, {response.text}")
        return None

def process_images(image_urls):
    results = []
    for url in image_urls:
        print(f"Processing {url}...")
        response = analyze_image(url)
        if response and 'responses' in response:
            analysis = response['responses'][0]
            result = {
                'ImageURL': url,
                'Labels': ", ".join(label['description'] for label in analysis.get('labelAnnotations', [])),
                'Objects': ", ".join(obj['name'] for obj in analysis.get('localizedObjectAnnotations', [])),
                'Faces': len(analysis.get('faceAnnotations', [])),
                'Landmarks': ", ".join(landmark['description'] for landmark in analysis.get('landmarkAnnotations', [])),
                'SafeSearch_Adult': analysis.get('safeSearchAnnotation', {}).get('adult', 'UNKNOWN'),
                'SafeSearch_Spoof': analysis.get('safeSearchAnnotation', {}).get('spoof', 'UNKNOWN'),
                'SafeSearch_Medical': analysis.get('safeSearchAnnotation', {}).get('medical', 'UNKNOWN'),
                'SafeSearch_Violence': analysis.get('safeSearchAnnotation', {}).get('violence', 'UNKNOWN'),
                'SafeSearch_Racy': analysis.get('safeSearchAnnotation', {}).get('racy', 'UNKNOWN'),
                'ImageProperties': ", ".join(
                    f"RGB({color['color']['red']},{color['color']['green']},{color['color']['blue']})"
                    for color in analysis.get('imagePropertiesAnnotation', {}).get('dominantColors', {}).get('colors', [])
                )
            }
            results.append(result)
        else:
            print(f"No response for {url}")
            results.append({
                'ImageURL': url,
                'Labels': '',
                'Objects': '',
                'Faces': 0,
                'Landmarks': '',
                'SafeSearch_Adult': 'ERROR',
                'SafeSearch_Spoof': 'ERROR',
                'SafeSearch_Medical': 'ERROR',
                'SafeSearch_Violence': 'ERROR',
                'SafeSearch_Racy': 'ERROR',
                'ImageProperties': ''
            })
    return results

# Main function for KNIME

# Read input from KNIME
df = knio.input_tables[0].to_pandas()

if 'ImageURL' not in df.columns:
    raise KeyError("The column 'ImageURL' is missing in the input table. Available columns are: " + ", ".join(df.columns))

# Clean and extract URLs
image_urls = df['ImageURL'].dropna().tolist()
image_urls = [url for url in image_urls if isinstance(url, str) and url.strip() != ""]

# Process images
results = process_images(image_urls)

# Convert results to DataFrame
results_df = pd.DataFrame(results)

# Ensure the output table is populated even if no results are available
if results_df.empty:
    results_df = pd.DataFrame(columns=[
        'ImageURL', 'Labels', 'Objects', 'Faces', 'Landmarks',
        'SafeSearch_Adult', 'SafeSearch_Spoof', 'SafeSearch_Medical',
        'SafeSearch_Violence', 'SafeSearch_Racy', 'ImageProperties'
    ])

# Output results to KNIME
knio.output_tables[0] = knio.Table.from_pandas(results_df)





Wow, thanks for the great support! The processing now works and the table is output correctly. THANK YOU


Awesome! Glad it worked out!

If you don't mind: mark the post with the code that ended up working as the solution - that way it's shown at the top for anyone who stumbles across this topic when searching for a similar problem :-).

