# Cluster distances in hierarchical + SDF output + histogram

Hi there, I'm just testing out KNIME and so far so good. It's a great tool, but I have some questions. I don't believe they have been answered yet, but I apologize in advance if they have.

1 - Let's say I input an SDFile of structures and do some calculations on it (Tanimoto, XlogP, etc.), and I want to write out an SDF file with all those calculations included. How do I do that? I can output the calculations to a CSV file and then parse all the data, but is there a direct way?

2 - In order to convert from the internal CDK mode to SDF, I am changing the column name to SDF. Is that the way one does it? Or is there a CDK to SDF translator that I'm missing?

3 - I'm doing a hierarchical clustering of an n by n Tanimoto matrix which I constructed. I'd like to get access to the distance matrix or the sorted distances between clusters, which is surely in the background somewhere. Agreed, I can probably construct it myself easily enough, but I was wondering if I am missing something.

4 - I have an n by n table of numbers. I'd like to feed all the numbers into a histogram. The histogram node only seems to take 1 column as input. Or am I doing it wrong? What is an "x column" anyway?

Thank you!

ohaq595 wrote:
1 - Let's say I input an SDFile of structures and do some calculations on it (Tanimoto, XlogP, etc.), and I want to write out an SDF file with all those calculations included. How do I do that? I can output the calculations to a CSV file and then parse all the data, but is there a direct way?

Currently, yes, you have to take the CSV route. We do, however, have a node that extracts the properties in an SDF file into table columns, and another node that does it the other way round. These nodes will be available in the next major release.
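Until those nodes are available, the CSV workaround mentioned in the question can be finished outside KNIME. The following is only a minimal sketch in plain Python, not a KNIME node or API: it assumes the SD file uses `\n` line endings and that the calculated values arrive as one dict per record, in record order. The function name `add_properties` and the field name `XLogP` are made up for illustration.

```python
def add_properties(sdf_text, properties):
    """Append data fields to every record of an SD file held in a string.

    properties: list with one dict per record, in record order
    (hypothetical input, e.g. parsed from the exported CSV).
    """
    records = sdf_text.split("$$$$\n")
    # A well-formed SD file ends with "$$$$\n", so the last chunk is empty.
    if records and records[-1].strip() == "":
        records.pop()
    if len(records) != len(properties):
        raise ValueError("need exactly one property dict per record")

    out = []
    for record, props in zip(records, properties):
        if not record.endswith("\n"):
            record += "\n"
        for name, value in props.items():
            # SD data item: header line, value line, blank terminator line.
            record += f">  <{name}>\n{value}\n\n"
        out.append(record + "$$$$\n")
    return "".join(out)


# Toy record with an empty molecule block, just to show the layout.
sdf = (
    "mol1\n"
    "  sketch\n"
    "\n"
    "  0  0  0  0  0  0  0  0  0  0999 V2000\n"
    "M  END\n"
    "$$$$\n"
)
result = add_properties(sdf, [{"XLogP": 1.5}])
print(result)
```

The sketch never touches the molecule block itself, so it should not alter the structures, but it also does not validate them.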

ohaq595 wrote:
2 - In order to convert from the internal CDK mode to SDF, I am changing the column name to SDF. Is that the way one does it? Or is there a CDK to SDF translator that I'm missing?

The same applies here: with 1.3 there will be nodes to convert from the internal CDK representation to SMILES, SDF, or Mol2.

ohaq595 wrote:
3 - I'm doing a hierarchical clustering of an n by n Tanimoto matrix which I constructed. I'd like to get access to the distance matrix or the sorted distances between clusters, which is surely in the background somewhere. Agreed, I can probably construct it myself easily enough, but I was wondering if I am missing something.

You are right, the distance matrix is there, but currently it is not accessible. With 1.3 there will be an improved Hierarchical Clustering node, and maybe we can add an output port that provides the distance matrix. We will discuss this.
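In the meantime, constructing it yourself, as suggested in the question, could look roughly like the sketch below. It is plain Python and does not reproduce what the Hierarchical Clustering node keeps internally: it assumes fingerprints are given as sets of on-bit positions, and it only yields the pairwise distances between single structures, not the linkage distances between merged clusters. All function names are illustrative.

```python
from itertools import combinations


def tanimoto_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two fingerprints given as sets."""
    union = len(a | b)
    if union == 0:
        return 0.0  # assumption: two empty fingerprints count as identical
    return 1.0 - len(a & b) / union


def distance_matrix(fingerprints):
    """Build the symmetric n by n Tanimoto distance matrix."""
    n = len(fingerprints)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = tanimoto_distance(fingerprints[i], fingerprints[j])
        matrix[i][j] = matrix[j][i] = d
    return matrix


def sorted_pair_distances(matrix):
    """List (distance, i, j) for every pair i < j, smallest distance first."""
    n = len(matrix)
    return sorted((matrix[i][j], i, j) for i, j in combinations(range(n), 2))


fps = [{1, 2, 3}, {2, 3, 4}, {7, 8}]  # toy fingerprints
m = distance_matrix(fps)
for distance, i, j in sorted_pair_distances(m):
    print(i, j, distance)
```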

Regards,

Thorsten

ohaq595 wrote:
4 - I have an n by n table of numbers. I'd like to feed all the numbers into a histogram. The histogram node only seems to take 1 column as input. Or am I doing it wrong? What is an "x column" anyway?

Hello,
the x column is the column that is plotted on the x axis of the plot. I will rename it to 'Binning column' for the next release, which describes it better.
Yes, you can select only one column as the x column/binning column. However, you can select multiple columns as aggregation columns. Each selected aggregation column gets its own bar color, and the bars are displayed next to each other.
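Since only one column can be binned, one way to get every value of an n by n table into a single histogram is to reshape the table into one long column first. The sketch below only illustrates that idea in plain Python, with equal-width bins; it is not how the Histogram node is implemented, and the names `flatten` and `histogram` are made up.

```python
def flatten(table):
    """Turn an n by n table (list of rows) into one long column of values."""
    return [value for row in table for value in row]


def histogram(values, n_bins):
    """Count values in n_bins equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width if all values are equal
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


table = [
    [0.0, 0.2, 0.9],
    [0.2, 0.0, 0.5],
    [0.9, 0.5, 0.0],
]
values = flatten(table)
counts = histogram(values, 2)
print(counts)
```

For a symmetric matrix such as a Tanimoto matrix, every pair appears twice and the diagonal is all zeros, so you may prefer to flatten only the upper triangle.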

Best regards,
Tobias