Several questions about KNIME Python extension development

Hello everyone,

I have a few questions about KNIME extension development and would appreciate any help you can provide. Although I have developed several extensions, I still run into difficulties with some of them.

Is there a way to debug a KNIME extension from PyCharm? Currently, I use the LOGGER.info function, which is not very convenient.

Can a KNIME extension be reloaded without having to reopen the entire KNIME platform?

How can I ensure that a knext.ColumnParameter always has a default value, such as the first value in the list?

I created an extension that accepts an RDKit Mol object and returns another RDKit Mol object, following the new chemical types introduced in 4.7. My declarations look like this:

    output_schema = copy.copy(input_schema)
    output_schema.append(knext.Column(Chem.Mol, "Standardized Molecules (UNIVIE)"))

My dataframe contains an RDKit Mol object, but after execution, I receive a warning that "DataSpec generated by configure does not match spec after execution." How can I debug this issue?

Thank you in advance for your assistance.

Hi pirotex,

  1. Debugging in VS Code:
    To debug Python code in VS Code while it is executed from KNIME, perform the following steps:

    1. In the Python environment that you use for running your Python code, install debugpy with pip install debugpy.
    2. Make sure Visual Studio Code and its Python extension are installed.
    3. Use VS Code to open the folder of the plug-in you want to develop (e.g. org.knime.python3.arrow.types.tests).
    4. Open the Python file you want to debug by pressing Ctrl+P and typing the file name.
    5. Insert the following lines of code where you want to start debugging:
    import debugpy
    debugpy.listen(5678)
    print("Waiting for debugger attach")
    debugpy.wait_for_client()
    debugpy.breakpoint()
    
    6. Execute your Python code, e.g. by running the Python Script node or the test you are interested in.
    7. Once you see "Waiting for debugger attach" in the Eclipse console or the log output, go to VS Code, open the Run and Debug view (Ctrl+Shift+D), click Run and Debug, choose Remote debugging, then localhost (should be the default) and port 5678 (the default). If no Python file is open when you click here, the Python debugging options will not be shown.
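If you would rather not paste those lines in and out each time, one option is to wrap them in a small helper that only waits for a debugger when you ask it to. This is just a minimal sketch: maybe_attach_debugger and the KNIME_PY_DEBUG environment variable are names made up for illustration (they are not part of KNIME or debugpy), and it assumes the variable is visible to the Python process that KNIME starts.

```python
import os


def maybe_attach_debugger(port: int = 5678, env_var: str = "KNIME_PY_DEBUG") -> bool:
    """Wait for a debugger only if the (hypothetical) env var is set to "1".

    Returns True if a debugger was attached, False if the call was skipped.
    """
    if os.environ.get(env_var) != "1":
        return False  # normal runs are never blocked

    # Imported lazily so debugpy is only required while debugging.
    import debugpy

    debugpy.listen(port)  # same calls as in step 5
    print("Waiting for debugger attach")
    debugpy.wait_for_client()  # blocks until VS Code attaches
    debugpy.breakpoint()  # execution pauses here; step out to reach your code
    return True
```

You would then call maybe_attach_debugger() at the top of the method you want to inspect, for example in execute, and leave the call in place between debugging sessions.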
  2. Debugging in PyCharm: we have not used PyCharm in this context, but we would be very happy if you adapted our VS Code solution and shared a quick guide on how to use it :)

  3. Whether a restart of KNIME Analytics Platform is necessary is described in detail by the bullet points in this section: Create a New Python based KNIME Extension

  4. Defining the default value of the ColumnParameter: this is not possible, and for now we do not intend to implement it. Such autoconfiguration can lead to unexpected behaviour, and we want users to explicitly choose their required column.

  5. If you open the output table after execution, you can compare the spec with what the actual table looks like. This should suffice for some of the DataSpec issues, but you are right, this one is tricky. In general, you can list all available data types by executing knime.extension.supported_value_types() (see the info box in this section: Create a New Python based KNIME Extension) and maybe have a look at the log. According to this example, though, your declaration looks correct. Further up in that example, in the if_smiles method, you can see that some values look identical in the RDKit context for technical reasons but are not completely the same. The aforementioned knime.extension.supported_value_types() should also show these different types. To compare them directly in the execute method, I suggest the following:

    1. Print out the types of the incoming table, e.g. like so:
        def execute(self, exec_context, input_1):
            # Assumes "import pandas as pd" at the top of the module.
            df: pd.DataFrame = input_1.to_pandas()
            for col in df.columns:
                # Type of the first value in each column
                print(col, type(df[col].iloc[0]))
            # print(knext.supported_value_types())
            return input_1

    2. Print out the data types of the outgoing table in the same way.
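To make the comparison of incoming and outgoing types less manual, the two printouts can be collected and diffed. The following is only a sketch in plain Python: first_value_types and type_mismatches are hypothetical helper names, not part of the KNIME API, and the dictionaries below stand in for the DataFrames you would get from to_pandas() (the helpers only need column names to iterate over and values to look at, so a DataFrame should work as well).

```python
from typing import Any, Dict, Iterable, Mapping, Tuple


def first_value_types(table: Mapping[str, Iterable[Any]]) -> Dict[str, str]:
    """Map each column name to the full type name of its first non-missing value."""
    types: Dict[str, str] = {}
    for name in table:
        types[name] = "<all missing>"
        for value in table[name]:
            # Skip None and float NaN (NaN is the only float unequal to itself).
            if value is None or (isinstance(value, float) and value != value):
                continue
            cls = type(value)
            types[name] = f"{cls.__module__}.{cls.__qualname__}"
            break
    return types


def type_mismatches(
    before: Dict[str, str], after: Dict[str, str]
) -> Dict[str, Tuple[str, str]]:
    """Return the columns present in both tables whose value types differ."""
    return {
        name: (before[name], after[name])
        for name in before
        if name in after and before[name] != after[name]
    }


# Stand-in data; in a node these would be the incoming and outgoing DataFrames.
incoming = {"smiles": ["CCO", "c1ccccc1"], "score": [None, 0.5]}
outgoing = {"smiles": ["CCO", "c1ccccc1"], "score": [None, "0.5"]}

print(type_mismatches(first_value_types(incoming), first_value_types(outgoing)))
# prints {'score': ('builtins.float', 'builtins.str')}
```

Printing the module-qualified type name rather than just the class name helps with values that look identical but come from different classes.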

Do all of these points help and answer your questions?

Best regards
Steffen
