I’m using Python to develop a KNIME extension, and I’d like to use the KNIME Base Chemistry types. The node development documentation says you can use it like so:
def configure(self, config_context):  # no input table
    """ This node creates two tables with two columns each """
    ktype1 = knext.string()
    import knime.types.chemistry as cet  # needs the extension `KNIME Base Chemistry Types & Nodes` installed
    ktype2 = cet.SdfValue
    schema1 = knext.Schema([ktype1, ktype2], ["Column with Strings", "Column with Sdf"])
    schema2 = knext.Schema([ktype1, ktype2], ["Another column with Strings", "Another column with Sdf"])
    return schema1, schema2
I see that the knime.types.chemistry package is available in the Python scripting view in KNIME AP itself, but I’m unable to find a package or an API reference outlining this module. Is there somewhere I can find that for the purpose of autocompletions in VS Code, where I’m developing the rest of the extension?
Hi @dranganath,
Thank you for posting the query here. Allow me to do a rundown for you:
Logical Data Types
The KNIME Base Chemistry Types are logical data types that must be installed in KNIME first, and any extension that depends on them should declare that dependency in its knime.yml.
Chemistry Data Types
The KNIME Python-based node development framework has predefined implementations of selected data types, which make it easier to map columns to specific types when going from Java to Python or vice versa.
Configuration and Execution
The reason for defining the column data type in the configure method is to let the user know the expected name and data type of each column after the node executes.
Once the user executes the node and execution enters the execute method, the framework infers the right data type for each column of the returned table and maps it automatically.
Last Resort
If the node is such that the data types of the output schema cannot be determined before execution, you can simply return None in the configure method. The downside is that the user will not be able to configure downstream nodes without first executing the current node.
With all that said, there is currently no proper way to get auto-completion for the specific chemistry data types in VS Code. You can use LOGGER.info() to find the right value factory strings, which are displayed in the KNIME console. For instance, you can use LOGGER.info(column.ktype.logical_type) to print the value factory string of a column's logical data type, where column is a knext.Column in the input_schema.
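To make the logging tip concrete, here is a minimal sketch of a helper that walks over the columns and logs each one's type. The helper name log_column_types is hypothetical, and it assumes that the schema can be iterated column by column, that each column has name and ktype attributes, and that only logical (extension) types carry logical_type. The stand-in objects at the bottom only mimic that shape so the snippet runs outside KNIME; the factory string in them is a placeholder, not a real value factory name.

```python
import logging
from types import SimpleNamespace

LOGGER = logging.getLogger(__name__)


def log_column_types(input_schema, logger=LOGGER):
    """Log each column's name and type; return (name, logical_type) pairs."""
    found = []
    for column in input_schema:
        ktype = column.ktype
        # Assumption: only logical (extension) types expose `logical_type`;
        # for any other type this falls back to None.
        logical = getattr(ktype, "logical_type", None)
        logger.info(
            "column %r: ktype=%s, logical_type=%s", column.name, ktype, logical
        )
        found.append((column.name, logical))
    return found


# Stand-in objects shaped like a schema, only for trying the helper outside
# KNIME. The factory string is a placeholder, not a real KNIME name.
demo_schema = [
    SimpleNamespace(name="Name", ktype=SimpleNamespace()),
    SimpleNamespace(
        name="Molecule",
        ktype=SimpleNamespace(logical_type="<value factory string>"),
    ),
]
demo_result = log_column_types(demo_schema)
```

Inside a node you would pass the real input_schema from configure instead of the stand-ins, and then read the logged strings in the KNIME console.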
Hope it helps.
Best,
Ali
Thank you so much for the reply. This was helpful, and I was able to identify, for example, the MolAdapterCellValueFactory as the type I want to use. However, I’m still not clear on how to actually use this type in my extension, like as part of the schema, or converting a pandas column to this type in the final node output. Do I use the ValueFactory type directly or one of the other classes in the types package, like MolValue? Is there documentation for this somewhere?