mol2 writer does not encode molecular title

Hi all -
I am new to Knime, and am generally very happy with it. I am mostly using it to prepare libraries of compounds for virtual screening: taking SMILES strings of millions of compounds, converting them to 3D representations, minimizing, applying Lipinski filtering, etc., and finally writing to mol2 format with the “mol2 writer” node (in the chemistry I/O node collection).

My initial SMILES file contains the structure (“col0”) and a catalog number (aka title; as “col1”) for each molecule. Both are read initially and passed through each node. I have confirmed with the MarvinView tool that “col1” is passed through each node.

From what I can tell, the mol2 writer node has no option to write titles, which is unfortunate for virtual screening, since the initial catalog number is very useful. The SDF writer, however, is more flexible and allows titles to be written. (But mol2 is preferred for virtual screening, since it supports partial atomic charges.)

Currently, if I want a title in the mol2 file, I need to write the mol2 file, write the list of titles (“col1”) as a CSV file, and merge the two with an outside script, which is tedious.
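For anyone doing the same merge, here is a minimal sketch of what such an outside script might look like. It assumes the standard Tripos record header `@<TRIPOS>MOLECULE`, that the line right after it is the (blank) title line, that the CSV has no header row, and that its rows are in the same order as the molecules in the mol2 file. The function names are just illustrative.

```python
import csv
import io

MOLECULE_TAG = "@<TRIPOS>MOLECULE"  # assumed standard Tripos record header


def read_titles(csv_text, column=0):
    """Return the titles from one column of CSV text, in row order."""
    return [row[column] for row in csv.reader(io.StringIO(csv_text)) if row]


def merge_titles(mol2_lines, titles, tag=MOLECULE_TAG):
    """Replace the line following each molecule tag with the next title.

    Assumes molecules and titles appear in the same order.
    """
    merged = []
    remaining = iter(titles)
    replace_next = False
    for line in mol2_lines:
        if replace_next:
            title = next(remaining, None)
            if title is None:
                raise ValueError("more molecules than titles")
            merged.append(title)
            replace_next = False
        else:
            merged.append(line)
            replace_next = line.strip() == tag
    return merged


# Example with one record shaped like the Knime output described in this thread.
record = ["@<TRIPOS>MOLECULE", "", "45 45 1"]
print(merge_titles(record, read_titles("2386-0094\n")))
```

For real files, the line lists would come from reading the mol2 file with `splitlines()` and the CSV from wherever the titles were exported. For millions of compounds it would be better to stream the file rather than hold it all in memory.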

I guess I am wondering if the mol2 writer can be modified to act like the SDF writer, and allow incorporation of the title? I think this would improve the usefulness of the mol2 writer. Thanks!!

=mxasf=

Hi mxasf,
As far as I can see, the mol2 writer node by default assigns the RowID as the “molecule title”. So, to get your titles (in “col1”) into the mol2 file as molecule titles, all you have to do is use the RowID node beforehand and replace the RowID with your “col1”. When you then generate the mol2 files, they will carry those values as “molecule titles”.
I hope this helps.
Gio

P.S.: By the way, it would be good to have such an option directly in the mol2 writer node, as you suggested.

@mxasf, can you please mark this answer as the solution if it solved your problem? If you need other suggestions, just ask.
Thanks

Hi Gio
Thanks for this idea. I was very excited to try it, but sadly it didn’t seem to work. Here is what I did (as you suggested):
a) Prior to the Mol2 Writer node, I inserted the “RowID” node and configured it to use “col1” (the title) as the new RowID.
b) Using “MarvinView”, I confirmed that “RowID” was doing the correct thing.
c) I looked at the output of the Mol2 Writer (node reset so that it ran fresh) in a text editor and, sadly, the title is still not filled in; there is just a blank line (see below).
d) As a further test, I put an SDF writer node after “RowID” and told it to use the RowID as the molecular title. This worked as it should.

Looking at the mol2 file created by Knime (as above):
@<TRIPOS>MOLECULE

45 45 1
… etc.

Versus a mol2 file created by Pipeline Pilot (same sort of protocol), which encodes the title 2386-0094:
@<TRIPOS>MOLECULE
2386-0094
45 45 1 0 0 0
… etc
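
To check a whole file rather than eyeballing it in a text editor, something like the following sketch could count the records whose title line is blank. It assumes the standard `@<TRIPOS>MOLECULE` header with the title on the very next line. The function name is made up for illustration.

```python
def count_blank_titles(mol2_lines, tag="@<TRIPOS>MOLECULE"):
    """Return (total_records, records_with_blank_title)."""
    total = 0
    blank = 0
    for i, line in enumerate(mol2_lines):
        if line.strip() == tag:
            total += 1
            # The title is expected on the line right after the tag.
            title = mol2_lines[i + 1] if i + 1 < len(mol2_lines) else ""
            if not title.strip():
                blank += 1
    return total, blank


# The two record shapes shown above.
knime_style = ["@<TRIPOS>MOLECULE", "", "45 45 1"]
pilot_style = ["@<TRIPOS>MOLECULE", "2386-0094", "45 45 1 0 0 0"]
print(count_blank_titles(knime_style))
print(count_blank_titles(pilot_style))
```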

Thanks for your help
=mxasf=