How to find corresponding structural fragment of RDKit fingerprint bit

Hi all,

I have generated a random forest classification model using RDKit Morgan fingerprints (1024 bits).
I have found the most important bits contributing to the model, e.g. bit 709.
Is there an easy way to find out the corresponding structural fragment/atom from the given bit?

Many thanks
San

Hi San,

We don’t currently have a KNIME node for this, but if you are also using Python, I can share a workflow that uses the RDKit Python integration to generate images of molecules with the Morgan bits highlighted.

-greg

Dear Greg,

Thanks for your reply.
Yes, that would be very helpful if you could share this.

San

Hi San,

Here’s an example workflow showing how to use the Python scripting node to get the information about fingerprint bits and pass it along to an RDKit Molecule Highlighting node:
https://kni.me/w/sfrKZIZ6iR4E_av5
Note that you will need to have an RDKit install in the Python environment that KNIME is using.
If you’re using Anaconda Python (and I really, really recommend that you do), you can do this by activating the environment that KNIME is using and then running:
conda install -c rdkit rdkit
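For readers who want to see the idea outside the workflow, here is a minimal sketch of how the bit information can be pulled out with the RDKit Python API. It is not taken from the linked workflow. The function name `bit_environments`, the radius of 2, and the example SMILES are assumptions for illustration; the radius and bit count must match whatever settings were used to build the model's fingerprints. Because the fingerprint is folded to 1024 bits, a single bit such as 709 can correspond to more than one fragment, so it is worth checking several molecules.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def bit_environments(smiles, radius=2, n_bits=1024):
    """Map each set Morgan bit to fragment SMILES of the atom environments behind it."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")

    # RDKit fills bit_info with {bit_id: ((center_atom_idx, env_radius), ...)}
    bit_info = {}
    AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits, bitInfo=bit_info)

    result = {}
    for bit_id, environments in bit_info.items():
        fragments = []
        for center_idx, env_radius in environments:
            if env_radius == 0:
                # A radius-0 environment is just the central atom itself.
                smi = Chem.MolFragmentToSmiles(mol, atomsToUse=[center_idx])
            else:
                # Bonds within env_radius of the central atom define the fragment.
                bond_ids = list(
                    Chem.FindAtomEnvironmentOfRadiusN(mol, env_radius, center_idx)
                )
                atoms = {center_idx}
                for bond_id in bond_ids:
                    bond = mol.GetBondWithIdx(bond_id)
                    atoms.update((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))
                smi = Chem.MolFragmentToSmiles(
                    mol, atomsToUse=sorted(atoms), bondsToUse=bond_ids
                )
            fragments.append(smi)
        result[bit_id] = fragments
    return result


# Hypothetical example molecule; in practice, loop over the training-set molecules
# and look up the bits of interest (e.g. 709) in each returned dictionary.
envs = bit_environments("CCOc1ccccc1")
for bit_id in sorted(envs):
    print(bit_id, envs[bit_id])
```

Newer RDKit releases may print a deprecation warning for `GetMorganFingerprintAsBitVect`; the sketch uses it because it is the long-standing call that accepts a `bitInfo` dictionary.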

I hope this helps,
-greg

Hi Greg,

Many thanks for sharing this.
I am able to run it. Really helpful!

San

Hi Greg

Sorry to bother you again, but how can I extract these images from the table in high resolution for a report or presentation?
I am new to KNIME; maybe there are specific nodes for this?

San

That’s an interesting question, and I don’t know the answer.
The images are SVG, so they can be used at essentially any resolution, but I don’t know how to export them to a file.
Since that’s not RDKit specific, it might be a good idea to ask on the main forum if anyone knows how to export SVG cells to files.
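As a partial workaround, and only as a sketch: if the SVG text is already available as Python strings (for example inside a Python scripting node), writing it to `.svg` files needs nothing beyond the standard library. The helper name `save_svgs`, the file names, and the output folder are hypothetical, and this does not answer how KNIME itself exports SVG cells.

```python
from pathlib import Path


def save_svgs(svg_by_name, out_dir):
    """Write each SVG string to <out_dir>/<name>.svg and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, svg_text in svg_by_name.items():
        path = out / f"{name}.svg"
        # SVG is plain XML text, so it can be written as UTF-8 without conversion.
        path.write_text(svg_text, encoding="utf-8")
        written.append(path)
    return written


# Hypothetical usage with a placeholder SVG string.
paths = save_svgs(
    {"bit709_example": "<svg xmlns='http://www.w3.org/2000/svg'></svg>"},
    "svg_export",
)
print(paths)
```

The resulting files stay vector graphics, so they can be scaled in a report or slide without losing quality.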

-greg

Thanks for the suggestion, I’ll post my query there.

-San