How to find the structural fragment corresponding to an RDKit fingerprint bit

#1

Hi all,

I have built a random forest classification model using RDKit Morgan fingerprints (1024 bits).
I have identified the bits that contribute most to the model, e.g. bit 709.
Is there an easy way to find the structural fragment or atom that corresponds to a given bit?

Many thanks
San

#2

Hi San,

We don’t currently have a KNIME node for this, but if you are also using Python, I can share a workflow that uses the RDKit Python integration to generate images of molecules with the Morgan bits highlighted.

-greg

#3

Dear Greg,

Thanks for your reply.
Yes, that would be very helpful if you could share this.

San

#4

Hi San,

Here’s an example workflow showing how to use the Python scripting node to get the information about fingerprint bits and pass it along to an RDKit Molecule Highlighting node:


Note that you will need to have an RDKit install in the Python environment that KNIME is using.
If you’re using Anaconda Python (and I really, really recommend that you do), you can do this by activating the environment that KNIME is using and then running:
conda install -c rdkit rdkit
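
For readers without the workflow, here is a minimal sketch of the bit lookup in plain Python. It is not the workflow itself: the function name fragments_for_bits, the radius of 2, and the example molecule are assumptions for illustration. It uses the bitInfo argument of GetMorganFingerprintAsBitVect (deprecated in recent RDKit releases in favor of the fingerprint generator API) to record which atom environments set each bit, then writes each environment out as a fragment SMILES.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def fragments_for_bits(mol, radius=2, n_bits=1024):
    """Map each set Morgan bit to the SMILES of the atom environments that set it."""
    bit_info = {}
    # RDKit fills bit_info as {bit_id: ((center_atom_idx, env_radius), ...)}
    AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits, bitInfo=bit_info)

    fragments = {}
    for bit_id, environments in bit_info.items():
        smiles = set()
        for atom_idx, env_radius in environments:
            bond_ids = []
            if env_radius > 0:
                # Bond indices within env_radius bonds of the center atom
                bond_ids = list(
                    Chem.FindAtomEnvironmentOfRadiusN(mol, env_radius, atom_idx)
                )
            # Collect the center atom plus every atom touched by those bonds
            atom_ids = {atom_idx}
            for bond_id in bond_ids:
                bond = mol.GetBondWithIdx(bond_id)
                atom_ids.update((bond.GetBeginAtomIdx(), bond.GetEndAtomIdx()))
            smiles.add(
                Chem.MolFragmentToSmiles(
                    mol,
                    atomsToUse=sorted(atom_ids),
                    bondsToUse=bond_ids or None,
                    rootedAtAtom=atom_idx,
                )
            )
        fragments[bit_id] = sorted(smiles)
    return fragments


# Example input only: aspirin
mol = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
fragments = fragments_for_bits(mol)
for bit_id in sorted(fragments):
    print(bit_id, fragments[bit_id])
```

Because the fingerprint is folded to 1024 bits, the same bit can be set by different fragments in different molecules, so it is worth running this over several molecules from the training set before interpreting an important bit such as 709.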

I hope this helps,
-greg

#5

Hi Greg,

Many thanks for sharing this.
I am able to run it. Really helpful!

San

#6

Hi Greg

Sorry to bother you again, but how can I extract these images from the table in high resolution for a report or presentation?
I am new to KNIME; maybe there are specific nodes for this?

San

#7

That’s an interesting question, and I don’t know the answer.
The images are SVG, so they can be used at essentially any resolution, but I don’t know how to export them to a file.
Since that’s not RDKit specific, it might be a good idea to ask on the main forum if anyone knows how to export SVG cells to files.
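
If the SVG text can be reached from Python (for example in a scripting node, or from RDKit's own drawing code), one possible workaround is simply to write each string to its own .svg file. This is only a sketch under that assumption: the helper name write_svg_files and the name-to-SVG dictionary are hypothetical, and it does not use any KNIME API.

```python
from pathlib import Path


def write_svg_files(svgs, out_dir):
    """Write each SVG string in `svgs` (name -> SVG text) to out_dir/<name>.svg."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for name, svg_text in svgs.items():
        target = out_path / f"{name}.svg"
        # SVG is plain XML text, so writing it as UTF-8 is enough
        target.write_text(svg_text, encoding="utf-8")
        written.append(target)
    return written
```

The resulting files can be opened in a vector graphics editor or placed directly into a report, and they scale without loss of quality.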

-greg

#8

Thanks for the suggestion, I’ll post my query there.

-San
