request nodes


I would like to request two series of nodes:

1. 3D generation with the classes present in the CDK and, if possible, another one using the Open Babel obgen module or a Java integration of it (if one exists!).

2. Pharmacophore generation and search. Rajarshi Guha developed a series of Java classes for the CDK that can handle these features.

Is it possible to have such tools as a series of nodes? I think this could be a great improvement in KNIME functionality for the community.


Thanks a lot for any help


Apologies for the delay in replying. I will have to look into the 3D classes that are available in the CDK and how well they work. According to the developers' mailing list, there is some work going on in that area currently.

Same applies for the Pharmacophores. I expect to have some time to look into that next week.

Regarding the search tools: what in particular do you have in mind?



Dear Stephan,

First of all, thanks for your reply. Second, I was thinking of a search node that can find matches for a customizable number and type of features, with a customizable tolerance (1-2 Å or more), and align the results to the query hypothesis.

I know it is a bit long and hard, but I think the CDK is the right tool to do that. What do you think?
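
To make the request more concrete, here is a minimal Python sketch of the kind of tolerance-based feature matching described above. It is not the CDK pharmacophore API: the function name `matches_query`, the `(feature_type, (x, y, z))` data layout, and the brute-force search are all assumptions chosen for illustration, and the alignment step is left out.

```python
from itertools import permutations
from math import dist


def matches_query(query, features, tolerance=1.0):
    """Return a mapping of query features onto molecule features, or None.

    Hypothetical helper, not part of CDK or KNIME.
    query and features are lists of (feature_type, (x, y, z)) tuples.
    The result is a tuple of indices into features, one per query feature.
    """
    n = len(query)
    for candidate in permutations(range(len(features)), n):
        # Feature types must agree position by position.
        if any(query[i][0] != features[j][0] for i, j in enumerate(candidate)):
            continue
        # Every pairwise distance must agree within the tolerance (in Å).
        if all(
            abs(
                dist(query[a][1], query[b][1])
                - dist(features[candidate[a]][1], features[candidate[b]][1])
            )
            <= tolerance
            for a in range(n)
            for b in range(a + 1, n)
        ):
            return candidate
    return None


# Made-up coordinates, for illustration only.
query = [("donor", (0.0, 0.0, 0.0)), ("acceptor", (3.0, 0.0, 0.0))]
molecule = [
    ("acceptor", (0.0, 0.0, 0.0)),
    ("hydrophobe", (1.0, 1.0, 1.0)),
    ("donor", (0.0, 3.5, 0.0)),
]
hit = matches_query(query, molecule, tolerance=1.0)
```

The brute-force search is only reasonable for a handful of features per molecule; a real node would presumably rely on the matching machinery of the underlying toolkit.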



Dear Fab,

That sounds a lot like a workflow to me. Let's assume you want to find all molecules that share two parameters within a given tolerance (the reference is the pharmacophore), e.g., mass and hydrogen donors. In this case you would calculate the properties with the 'molecule properties' node, filter by the parameter bounds using the reference, and then cluster the resulting table using one of the suites available.
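
The filtering step of such a workflow can be sketched in a few lines of Python. This is only an illustration of the logic, not a KNIME node: the function names, the column names `mass` and `h_donors`, and all values are hypothetical, and the clustering step is omitted.

```python
def within_tolerance(reference, candidate, tolerances):
    """True if every listed property of candidate lies within its bound of the reference."""
    return all(
        abs(candidate[name] - reference[name]) <= bound
        for name, bound in tolerances.items()
    )


def filter_by_reference(reference, rows, tolerances):
    """Keep only the rows whose properties are close enough to the reference."""
    return [row for row in rows if within_tolerance(reference, row, tolerances)]


# Made-up property table, standing in for the output of a properties node.
reference = {"mass": 300.0, "h_donors": 2}
rows = [
    {"id": "mol-1", "mass": 310.5, "h_donors": 2},
    {"id": "mol-2", "mass": 450.0, "h_donors": 2},
    {"id": "mol-3", "mass": 295.0, "h_donors": 5},
    {"id": "mol-4", "mass": 280.0, "h_donors": 1},
]
tolerances = {"mass": 25.0, "h_donors": 1}

kept = filter_by_reference(reference, rows, tolerances)
kept_ids = [row["id"] for row in kept]
```

Here each property gets its own absolute tolerance; a relative tolerance would work the same way with a different comparison.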

Anyhow, I am not entirely sure how the CDK deals with pharmacophores internally. I will look into it and get back to you. Perhaps it is more intuitive and simpler to have a separate node for this kind of similarity search.

Best regards,


Hi Fab,

The nightly build now includes a node for 3D coordinate generation. However, the underlying model builder is far from perfect and cannot deal with all types of chemical classes. More details are in the help XML of the node.

In addition, distance moments calculation and distance similarity are available in separate nodes for 3D structures.


Dear Stephan,

Sorry for the delay in my answer. This sounds like a good job. Of course, I agree that a separate node is better. As for the rest, I found that from CDK version 1.4 there is a package, org.openscience.cdk.pharmacophore, added by Guha. It handles searching and matching against a query defined in XML and, I suppose, finds a series of raw ph4. It uses the SMARTS parser definitions.


Thanks. Tomorrow, when the nodes are available, I'll try to use them.


Thanks, and great job!

Hi Stephan,

Are there any plans to use the CDK for NMR prediction? HOSE codes are already available in the atom signatures node. As you are aware, CDK NMR prediction is based on HOSE codes, and the CDK has a method to predict NMR shifts from them.

It's difficult to use the shift + HOSE code value file in KNIME to predict NMR, so any such implementation would be very welcome!


Thanks a lot



I think that's a great idea. It coincides with some of our own efforts and will probably happen rather sooner than later. I will discuss that and post any updates on the topic.



The CDK NMR prediction method is based on Bremser one-sphere HOSE codes, which is probably not the most precise way to predict NMR spectra. Therefore, I don't know whether it would be worth building a node for this prediction. We could implement it if you are interested.

The way I'm using the atom signatures node to predict NMR spectra is by accessing a local database and pulling the atom signatures and corresponding shifts into a KNIME workflow. Then I average the shifts for each signature and join the resulting table with the one containing the atom signatures of the molecule, using the atom signature as the match criterion.
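
Outside KNIME, the averaging and joining steps described above could look roughly like the following Python sketch. The function names and the signature strings (`sig-A`, `sig-B`, `sig-C`) are placeholders, not real atom signatures, and the shift values are made up.

```python
from collections import defaultdict
from statistics import mean


def average_shift_by_signature(records):
    """Group (signature, shift) pairs and average the shifts per signature."""
    grouped = defaultdict(list)
    for signature, shift in records:
        grouped[signature].append(shift)
    return {signature: mean(shifts) for signature, shifts in grouped.items()}


def predict_shifts(atom_signatures, shift_table):
    """Join a molecule's (atom_id, signature) pairs with the averaged shift table.

    Atoms whose signature is missing from the table get None.
    """
    return [
        (atom_id, shift_table.get(signature))
        for atom_id, signature in atom_signatures
    ]


# Placeholder data standing in for rows pulled from a local shift database.
records = [("sig-A", 128.0), ("sig-A", 130.0), ("sig-B", 20.0)]
shift_table = average_shift_by_signature(records)

# Placeholder signatures for the atoms of the molecule to predict.
molecule_atoms = [(1, "sig-A"), (2, "sig-B"), (3, "sig-C")]
predicted = predict_shifts(molecule_atoms, shift_table)
```

Returning None for unknown signatures corresponds to a left outer join; an inner join would simply drop those atoms.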

Which shift + HOSE code value file are you referring to in your post?



Hi Luis,

I was indeed referring to the Bremser method, which, although old, can give a rough idea. nmrshiftdb can be parsed, with some hoop jumping, to obtain shift values and the corresponding C atom numbers in a molecule. But then one needs to loop over all C atoms and match the atom signatures that correspond to the particular C atoms with the given shifts.

I could not find an easy way to do this. It would be great if you could share any workflow that can work with nmrshiftdb to train and predict NMR shifts.




I have attached an example of a workflow for predicting 13C NMR spectra using the data from NMRShiftDB. Note that we are currently using atom signatures instead of HOSE codes. I have also attached a KNIME table file so that you can load the data again in the workflow. Unfortunately, I had to remove a few rows from the input table because of the attachment size limit. But I can send you the original table by email (please contact me at Luis.deFigueiredo at



Super! thanks a lot Luis

Luis, that's a fabulous use of KNIME and open data!

May I encourage you to repost the workflow on with the full dataset?

(the other) Simon

And/Or the knime public server?



I have uploaded the workflow to with the full dataset. You can download it from here:

I have also updated the atom signature node, which now adds a new column with the atom ID from which the signatures were generated. This way one can easily map the shifts into the chemical structure. To see the atom IDs in the chemical structure go to File>Preferences, then in the drop-down list go to KNIME>Chemistry>CDK and select the option to show the numbering of the carbon atoms. The new node is available in the nightly build.




I realized that the workflow was exported without the data. I have uploaded a new version with the NMRShiftDB data, but I had to reset some of the nodes in order to stay under the 20 MB file size limit.