KNIME node development chemistry types

I’m using Python to develop a KNIME extension, and I’d like to use the KNIME Base Chemistry types. The node development documentation says you can use them like so:

def configure(self, config_context): # no input table
  """ This node creates two tables with two columns each """
  ktype1 = knext.string()
  import knime.types.chemistry as cet # needs the extension `KNIME Base Chemistry Types & Nodes` installed
  ktype2 = cet.SdfValue
  schema1 = knext.Schema([ktype1, ktype2], ["Column with Strings", "Column with Sdf"])
  schema2 = knext.Schema([ktype1, ktype2], ["Another column with Strings", "Another column with Sdf"])
  return schema1, schema2

I see that the knime.types.chemistry package is available in the Python scripting view in KNIME AP itself, but I’m unable to find a package or an API reference outlining this module. Is there somewhere I can find that for the purpose of autocompletions in VS Code, where I’m developing the rest of the extension?

Hi @dranganath,

Thank you for posting the query here. Allow me to do a rundown for you:

Logical Data Types

The KNIME Base Chemistry Types are logical data types that must be installed in KNIME first, and any extension that depends on them should declare that dependency in its knime.yml.

Chemistry Data Types

The KNIME Python-based node development framework ships pre-defined implementations of selected data types, which make it easier to map columns to specific types when going from Java to Python or vice versa.

Configuration and Execution

The column data types are defined in the configure method so that the user knows which column names and data types to expect after the node is executed.

Once the user runs the node and execution enters the execute method, the framework automatically maps the columns of the returned table to the data types it infers.

Last Resort

If the nature of the node is such that the data types of the output schema cannot be determined before execution, you can simply return None from the configure method. The downside is that the user will not be able to configure downstream nodes without first executing the current node.

With all that said, there is currently no proper way for VS Code to offer auto-completion for the specific chemistry data types. You can use LOGGER.info() to find the right value factory strings, which are displayed in the KNIME console.
For instance, LOGGER.info(column.ktype.logical_type) prints the value factory string of a column's logical data type, where column is a knext.Column in the input_schema.
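
To make the logging tip a bit more concrete, here is a minimal sketch of a helper that walks over a schema and logs the value factory string of every column that has a logical type. It assumes that iterating over the schema yields column objects with name and ktype attributes, and that primitive types may not have a logical_type attribute at all. The helper name collect_logical_types, the stand-in objects, and the placeholder string are hypothetical and only there so the snippet runs outside KNIME; inside a node you would pass the real input schema.

```python
import logging
from types import SimpleNamespace

LOGGER = logging.getLogger(__name__)


def collect_logical_types(schema):
    """Return {column name: value factory string} for columns with a logical type.

    Hypothetical helper: relies only on `name`, `ktype` and `ktype.logical_type`.
    """
    found = {}
    for column in schema:
        # Assumption: primitive types (string, int, ...) may not expose
        # `logical_type`, so fall back to None instead of raising.
        logical_type = getattr(column.ktype, "logical_type", None)
        if logical_type is not None:
            found[column.name] = logical_type
            LOGGER.info("%s -> %s", column.name, logical_type)
    return found


# Stand-in objects so the sketch runs outside KNIME. The value factory
# string below is a placeholder, not the real string KNIME reports.
demo_schema = [
    SimpleNamespace(name="Name", ktype=SimpleNamespace()),
    SimpleNamespace(
        name="Molecule",
        ktype=SimpleNamespace(logical_type="example.MolAdapterCellValueFactory"),
    ),
]
demo_result = collect_logical_types(demo_schema)
```

Calling such a helper once from configure or execute with the input schema should list every logical type that arrives at the node in the KNIME console, which is usually enough to spot the one you need.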

Hope it helps.

Best,
Ali


Thank you so much for the reply. This was helpful, and I was able to identify, for example, the MolAdapterCellValueFactory as the type I want to use. However, I’m still not clear on how to actually use this type in my extension, like as part of the schema, or converting a pandas column to this type in the final node output. Do I use the ValueFactory type directly or one of the other classes in the types package, like MolValue? Is there documentation for this somewhere?