Retrieving 3D co-ordinates from an SD file to calculate inter-atomic distances & bond angles.

 

Hi,

I recently downloaded the KNIME desktop workbench along with the CDK, Indigo and RDKit chemistry extensions, and I am currently getting familiar with them. I would like to read an SD file containing a molecule, retrieve the 3D co-ordinates from it and write them to an output file. I tried an SDF Reader followed by the "Molecule to RDKit", "RDKit Generate Coords" and "XLS Writer" nodes in succession to produce an output file with the co-ordinates of each atom, but couldn't get it to work as needed. Is it possible to extract just the co-ordinates from an SD file into a separate output file, without the other blocks of data such as the header, bond count, etc., or do I have to write some code for it (in a node like the Java Snippet or Python snippet node)?

 

My intention is to retrieve the 3D co-ordinates of all the atoms from an SD file, store them in a separate file and use them to calculate inter-atomic distances and bond angles. Does KNIME have nodes or extensions that can do these calculations? Kindly let me know as soon as possible.

 

Thanks,

Ajit.
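
For reference, once the x, y, z values are available, the two calculations asked about are short. The following is a minimal Python sketch, independent of any KNIME node, that could be adapted for a Python snippet node. The function names and the sample points are made up for illustration, and it needs Python 3.8 or later for `math.dist`.

```python
import math

def distance(a, b):
    """Euclidean distance between two atoms given as (x, y, z) tuples."""
    return math.dist(a, b)

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with b as the central atom."""
    v1 = [a[i] - b[i] for i in range(3)]  # vector from b to a
    v2 = [c[i] - b[i] for i in range(3)]  # vector from b to c
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] so rounding error cannot break acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Hypothetical points chosen so the results are easy to check by hand.
p1 = (3.0, 4.0, 0.0)
p2 = (0.0, 0.0, 0.0)
p3 = (0.0, 1.0, 0.0)

print(distance(p1, p2))                                      # 3-4-5 triangle
print(angle_deg((1.0, 0.0, 0.0), p2, p3))                    # right angle
```

The angle is taken at the middle atom, so for a bond angle the atoms should be passed in the order neighbour, centre, neighbour.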

 

Hi Ajit,

I do not believe this is possible with the Community nodes (CDK, Indigo, RDKit), but I do know the MOE nodes can do this with a node called "Atomic Positions".

However, these nodes are not free.

Simon.

 

Hi Simon,

First of all, thanks a lot for your prompt reply. Can I get an estimate of how much the MOE nodes would cost and what pricing plans exist for them? Alternatively, if I just want to sort out the first part of my problem for now, i.e., retrieve just the x, y, z co-ordinates from an SD file (containing header, atom co-ordinates, bond values, etc.) into an output file (e.g., in separate columns on an Excel sheet), how do you suggest I go about it? Would the RDKit, CDK or Indigo nodes accomplish that?

 

                  Looking forward to your reply.

 

Thanks,

Ajit.

 

Hi Ajit,

To the best of my knowledge, the CDK, Indigo, or RDKit nodes cannot retrieve the 3D coordinates out of the SD file.

If you view the SDF cells as an SDF string (right-click on the column header and choose SDF String), you can see the 3D coordinates there.

Try using the "Column Rename" node to convert the SDF column into a String column. Some combination of the "Cell Splitter" and "Cell Splitter by Position" nodes will then probably let you do what you want, though not very elegantly.
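
If you do end up scripting it instead, the splitting can be sketched in a few lines of Python. This assumes the record is in the common V2000 molfile layout, where the fourth line starts with the atom count and each atom line holds x, y and z in three 10-character fields followed by the element symbol. The function name and the sample record below are made up for illustration.

```python
def read_atom_block(molblock):
    """Return (symbol, x, y, z) tuples from one V2000 SD record."""
    lines = molblock.splitlines()
    # Line 4 is the counts line; its first three characters are the atom count.
    n_atoms = int(lines[3][0:3])
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x = float(line[0:10])
        y = float(line[10:20])
        z = float(line[20:30])
        symbol = line[31:34].strip()
        atoms.append((symbol, x, y, z))
    return atoms

# Hypothetical three-atom record, only to show the layout.
SAMPLE = "\n".join([
    "example",
    "  hypothetical record",
    "",
    "  3  2  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "    0.9572    0.0000    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "   -0.2400    0.9266    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0",
    "  1  3  1  0",
    "M  END",
    "$$$$",
])

for symbol, x, y, z in read_atom_block(SAMPLE):
    print(symbol, x, y, z)
```

Reading by column position rather than splitting on spaces is deliberate: in this layout large or negative values can run together without a space between them.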

As for the MOE nodes, I do not know the pricing as I am not affiliated with them, but they can be contacted at: support@chemcomp.com

Thanks,

Simon.

Hi Simon,

Thanks for replying. I have contacted Chemcomp about the price of the MOE nodes. Meanwhile, I'm trying to get the basics of my problem done within KNIME. I used "Column Rename" to get the contents of the SD file into a table as a String, and then used the "Cell Splitter", "String Manipulation" and "String Replacer" nodes to get rid of the unnecessary data. However, the contents of the SD file are read into a single row and stored in a single cell (the first cell) of the Excel sheet. When I then use the "Cell Splitter", everything ends up in different cells of the first row.

 

Is there any way to have the SD file's data spread over different rows, as it originally was in the file itself, instead of all in the first cell? For the output, I want an Excel sheet with the data in separate rows and columns. Kindly let me know if this is possible within KNIME.

 

Thanks,

Ajit.

It's good to hear that you've got the hard part done.

Now use the Create Collection Column node on all the columns containing the coordinates. Then use the Ungroup node on this collection column, and you should have them all in separate rows.

 

You could, however, have saved a few steps by choosing the "As list" output option in the Cell Splitter node. That way, the only extra node you need is Ungroup at the end.
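
In case it helps to picture what the split-as-list plus ungroup combination does, here is a rough Python sketch of the idea. It only mimics the behaviour described above and is not how the KNIME nodes are implemented; the function and column names are made up.

```python
def split_as_list(rows, column, delimiter="\n"):
    """Replace a string cell with a list of its pieces (like splitting 'as list')."""
    return [{**row, column: row[column].split(delimiter)} for row in rows]

def ungroup(rows, column):
    """Expand a list-valued column so that each element gets its own row."""
    out = []
    for row in rows:
        for item in row[column]:
            out.append({**row, column: item})
    return out

# Hypothetical one-row table with the whole record in a single cell.
table = [{"name": "mol1", "text": "line one\nline two\nline three"}]

for row in ungroup(split_as_list(table, "text"), "text"):
    print(row)
```

The other columns are simply repeated on every new row, which is why the molecule name stays attached to each line.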

 

Hope this helps, 

 

Simon.

Hi Simon,

Thanks a lot for your guidance. I have another query. I have managed to get the data from the SD file written (as String values) to an XLS sheet as a 10-column grid with multiple rows. I now want to generate two separate tables out of it. The current table has 100+ rows with 10 columns each. I want the rows whose 1st cell/column contains a positive integer value (e.g., "1", "2") in one table (of 10 columns), and the rows whose 1st cell/column contains a float value (e.g., "5.4608", "-8.9067") in a second table (of 6 columns). How do you suggest I go about doing this?

 

I'm currently trying to code something for it in the Java Snippet node but keep getting stuck. I would be very grateful if you could give me a few tips on how to accomplish this.

 

Thanks,

Ajit.

 

Hi,

Well, don't ask me about Java and other scripting snippets. I know nothing about them, I'm afraid, as others on the forum would testify :-)

However, you can achieve what you want in 2 nodes.

Use the Math Formula node with the expression:

if(round($Col1$)==$Col1$,1 ,0)

Then use a Row Splitter to split on this new column. You now have two tables.

 

The expression checks whether the rounded value of column 1 equals the original value of column 1, which is the case for integers and not for floating-point values with a fractional part.
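
For anyone who prefers a script, the same test can be sketched in Python. This version assumes the cells hold text rather than numbers, so each value is converted first and anything that cannot be converted goes to the second table; the function names and sample rows are made up.

```python
def is_integer_valued(text):
    """True if the text parses as a number with no fractional part."""
    try:
        return float(text).is_integer()
    except ValueError:
        return False  # not a number at all

def split_rows(rows, column=0):
    """Split rows into (integer-valued, everything else) on one column."""
    matched, unmatched = [], []
    for row in rows:
        target = matched if is_integer_valued(row[column]) else unmatched
        target.append(row)
    return matched, unmatched

# Hypothetical rows, stored as strings like the values read from the sheet.
rows = [["1", "2", "1"], ["5.4608", "-8.9067", "0.0000"], ["2", "3", "1"]]

int_rows, float_rows = split_rows(rows)
print(int_rows)
print(float_rows)
```

One caveat applies to both the expression and this sketch: a coordinate written as "5.0000" has no fractional part, so it would land in the integer table.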

Hope this helps

Simon.

Hi,

Thanks. I was trying to do something similar in the Java Snippet node but was getting an error, because the integer and float values are stored in the XLS sheet as Strings and the math functions work on numeric values only. I wasn't sure about parsing the values as integers, as that would affect the float values as well.

 

However, I will try implementing your suggestions and will let you know how it goes.

 

Best Wishes,

Ajit.
