Determine DataType of chemical structure cell (and auto-conversion)

As a preface: I'm new to writing my own nodes.

My question is: how can I make sure that a given column contains a chemical structure in a specific format?

I can check like this:

type.isCompatible(SmilesValue.class);

But I wasn't really able to determine what this would return for, say, a String column.

My next question is about conversions. Initially SMILES were loaded and then a calculation was done with a CDK node. As far as I can tell, this automatically converts a SMILES column to a CDK column (CML). The same happens with other toolkits such as RDKit. In this case, will the above code still return true? (When I copy and paste the cell content, it is still SMILES.)

Ultimately, how can I determine the chemical structure format of a column?

The

 contain a section about this topic ("Chemistry Conventions", pp 15.). This should make it a bit clearer.

It helps but the document is a bit on the technical side and doesn't really answer my question. So I will be more specific.

If I load SMILES and then do something with CDK, which auto-converts it to a CDK cell, what will the result of

isAdaptable(SmilesValue.class)

be? (and for isCompatible?)

And what will be the content of the cell when calling cell.toString()? The original SMILES?

I'm only interested in reading the content of the "molecule cell" and determining its type for further processing.

 

isAdaptable(SmilesValue.class) will be true, because auto-conversion must still retain the original representation. isCompatible(SmilesValue.class) is also true, but this is not strictly required. Every CDK column, no matter what the original representation was, is also compatible with SMILES for historical reasons.

You should never call toString() on a cell to get the contents. Use the methods defined in the value interfaces (e.g. getSmilesValue() for SmilesValue) instead. There is no rule for what toString() should return.
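
To make the difference between "compatible" and "adaptable" more concrete, here is a small self-contained toy model in plain Java. It is only a sketch of the idea and does not use the KNIME classes: StructureTypeSketch, SmilesLike, PlainSmilesCell, AdapterCell and readSmiles are all hypothetical stand-ins invented for this example. A "compatible" cell implements the value interface itself, an "adaptable" cell carries the original representation along and hands it out on request, and in both cases the content is read through the typed accessor rather than toString().

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model only. These types imitate the idea of value interfaces and
// adapter cells; they are NOT the KNIME classes and all names are hypothetical.
final class StructureTypeSketch {

    /** Stand-in for a value interface such as SmilesValue. */
    interface SmilesLike {
        String getSmilesValue();
    }

    /** A cell that implements the value interface directly ("compatible"). */
    static final class PlainSmilesCell implements SmilesLike {
        private final String smiles;

        PlainSmilesCell(String smiles) {
            this.smiles = smiles;
        }

        @Override
        public String getSmilesValue() {
            return smiles;
        }
    }

    /** A cell that carries other representations alongside its own ("adaptable"). */
    static final class AdapterCell {
        private final Map<Class<?>, Object> adapters = new HashMap<>();

        <V> AdapterCell with(Class<V> type, V value) {
            adapters.put(type, value);
            return this;
        }

        boolean isAdaptable(Class<?> type) {
            return adapters.containsKey(type);
        }

        <V> V getAdapter(Class<V> type) {
            return type.cast(adapters.get(type));
        }

        @Override
        public String toString() {
            // Deliberately unrelated to the SMILES: toString() has no contract.
            return "AdapterCell with " + adapters.size() + " representation(s)";
        }
    }

    /** Reads SMILES through the typed accessor, never through toString(). */
    static Optional<String> readSmiles(Object cell) {
        if (cell instanceof SmilesLike) {
            return Optional.of(((SmilesLike) cell).getSmilesValue());
        }
        if (cell instanceof AdapterCell) {
            AdapterCell adapter = (AdapterCell) cell;
            if (adapter.isAdaptable(SmilesLike.class)) {
                return Optional.of(adapter.getAdapter(SmilesLike.class).getSmilesValue());
            }
        }
        return Optional.empty(); // e.g. a plain string: no SMILES accessor available
    }

    public static void main(String[] args) {
        Object original = new PlainSmilesCell("CCO");
        Object converted = new AdapterCell().with(SmilesLike.class, new PlainSmilesCell("CCO"));
        Object plainString = "CCO";

        System.out.println(readSmiles(original));
        System.out.println(readSmiles(converted));
        System.out.println(readSmiles(plainString));
    }
}
```

In a real node the corresponding checks would go through the column's DataType (isCompatible and isAdaptable, as in the question) and the value would be fetched through the value interface or the cell's adapter. Please check the Javadoc of your KNIME version for the exact calls, since the sketch above only mirrors the concept.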