I have generated a random forest classification model using RDKit Morgan fingerprints (1024 bits).
I have found the most important bits contributing to the model e.g. Bit 709.
Is there an easy way to find out the corresponding structural fragment/atom from the given bit?
We don’t currently have a KNIME node for this, but if you are also using Python, I can share a workflow that uses the RDKit Python integration to generate images of molecules with the morgan bits highlighted.
Thanks for your reply.
Yes, that would be very helpful if you could share this.
Here’s an example workflow showing how to use the Python scripting node to get the information about fingerprint bits and pass it along to an RDKit Molecule Highlighting node:
Note that you will need to have an RDKit install in the Python environment that KNIME is using.
If you’re using Anaconda Python (and I really, really recommend that you do), you can do this by activating the environment that KNIME is using and then running:
conda install -c rdkit rdkit
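For anyone who wants to do the bit lookup directly in Python, here is a minimal sketch of the general idea. It is not the code from the shared workflow. It assumes the older `AllChem.GetMorganFingerprintAsBitVect` call with its `bitInfo` argument (newer RDKit releases may print a deprecation warning for it), a radius of 2, and 1024 bits; the helper names `morgan_bit_info` and `fragments_for_bit` and the example SMILES are made up for illustration.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def morgan_bit_info(smiles, radius=2, n_bits=1024):
    """Return the molecule and a dict: set bit -> ((atom index, radius), ...)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    bit_info = {}
    # bitInfo is filled in place while the fingerprint is computed.
    AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits, bitInfo=bit_info)
    return mol, bit_info


def fragments_for_bit(mol, bit_info, bit):
    """List (center atom index, radius, fragment text) for every environment setting `bit`."""
    fragments = []
    for atom_idx, rad in bit_info.get(bit, ()):
        if rad == 0:
            # A radius-0 environment is just the center atom.
            frag = mol.GetAtomWithIdx(atom_idx).GetSymbol()
        else:
            # Collect the bonds within `rad` of the center atom and cut them out.
            env = Chem.FindAtomEnvironmentOfRadiusN(mol, rad, atom_idx)
            frag = Chem.MolToSmiles(Chem.PathToSubmol(mol, env))
        fragments.append((atom_idx, rad, frag))
    return fragments


# Example molecule (an assumption for illustration): list every set bit and its fragments.
mol, bit_info = morgan_bit_info("CC(=O)Nc1ccc(O)cc1")
for bit in sorted(bit_info):
    print(bit, fragments_for_bit(mol, bit_info, bit))
```

A given bit such as 709 only appears in `bit_info` for molecules that actually set it, and with a folded 1024-bit fingerprint the same bit can correspond to different fragments in different molecules, so it is worth checking several molecules from the training set.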
I hope this helps,
Many thanks for sharing this.
I am able to run it. Really helpful!
Sorry to bother again but how can I extract these images from the table in high resolution for report/presentation?
I am new to KNIME; maybe there are specific nodes for this?
That’s an interesting question, and I don’t know the answer.
The images are SVG, so they can be used at essentially any resolution, but I don’t know how to export them to a file.
Since that’s not RDKit specific, it might be a good idea to ask on the main forum if anyone knows how to export SVG cells to files.
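One possible workaround, if doing the drawing in Python is acceptable, is to let RDKit produce the SVG text and write it to a file yourself. This is only a sketch and it does not export the cells of the KNIME table: it redraws the molecule with `rdMolDraw2D.MolDraw2DSVG`. The helper names `mol_to_svg` and `save_svg`, the image size, the highlighted atom indices, and the output file name are all assumptions for illustration.

```python
from pathlib import Path

from rdkit import Chem
from rdkit.Chem.Draw import rdMolDraw2D


def mol_to_svg(smiles, highlight_atoms=(), width=600, height=450):
    """Draw a molecule as SVG text, optionally highlighting some atoms."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    drawer = rdMolDraw2D.MolDraw2DSVG(width, height)
    # PrepareAndDrawMolecule computes 2D coordinates if needed and then draws.
    rdMolDraw2D.PrepareAndDrawMolecule(
        drawer, mol, highlightAtoms=list(highlight_atoms)
    )
    drawer.FinishDrawing()
    return drawer.GetDrawingText()


def save_svg(svg_text, path):
    """Write SVG text to `path` and return it as a Path object."""
    out = Path(path)
    out.write_text(svg_text, encoding="utf-8")
    return out


# Example: highlight the first three atoms of an illustrative molecule.
svg = mol_to_svg("CC(=O)Nc1ccc(O)cc1", highlight_atoms=[0, 1, 2])
print(len(svg), "characters of SVG")
# save_svg(svg, "bit_highlight.svg")  # uncomment to write the file
```

Because SVG is a vector format, the saved file can be scaled to whatever size a report or slide needs.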
Thanks for the suggestion, I’ll post my query there.