At the upcoming RDKit UGM's hackathon we're going to have a hands-on session on writing KNIME nodes for the RDKit. We plan to have participants in that session actually write some nodes.
We have a few ideas for nodes that would be useful, but I figure that the community here might have a few more. So this is your chance! Please let us know what you'd like to be able to do with the RDKit KNIME nodes that you currently can't.
We'll capture whatever ideas come up here and add them to our list.
Will you be porting the 'BRICS/RECAP' work from RDKit over to KNIME? I guess there is still stuff to do around MMPs with respect to MCSS.
One of the possible nodes on my list was certainly something for BRICS/RECAP. Thanks for reinforcing that.
I would suggest a similarity map node, and if possible MIFs in KNIME would be great, starting with an initial integration of open3dtools.
From my side, I would be interested in being able to do conformer clustering and alignment, with the option to perform it on the whole structure or on a substructure pattern.
Related to your question but probably not qualifying for a hackathon, I think that the formal charge and the number of (defined or undefined) chiral centers could be added to the RDKit descriptor calculation node.
A more intuitive enumeration node (compared to the reaction nodes) for creating virtual libraries.
The user provides scaffolds with R-groups, and a second input takes all the possible R-groups. The node then enumerates all the structures (i.e. no reaction or the like is needed).
The R-group table (or a third input) defines which group can go on which position of the scaffold, e.g. R1 can only be H or methyl, while R2 can also be ethyl.
Something like below from ChemDraw:
and then expanded gives you:
I see this isn't exactly easy, especially due to the different formats (smi, mol,...), but it would be really helpful.
It would also be best if the user could set a limit for when the enumeration should stop. ;)
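To make the idea concrete, here is a rough sketch of the logic in plain Python. It is not how a KNIME node would actually be written: the scaffold is just a SMILES-like string with made-up placeholders ({R1}, {R2}), the function name enumerate_library and its parameters are hypothetical, and a real node would presumably work on molecule objects rather than on text. It only shows the per-position R-group table and the stop limit.

```python
from itertools import islice, product


def enumerate_library(scaffold, allowed, limit=None):
    """Yield scaffold strings with every allowed R-group combination filled in.

    scaffold: template string with placeholders such as {R1}, {R2} (assumed format)
    allowed:  dict mapping each placeholder name to its permitted R-groups
    limit:    stop after this many structures (None means no limit)
    """
    positions = sorted(allowed)
    # Cartesian product over the groups permitted at each position
    combos = product(*(allowed[p] for p in positions))
    for combo in islice(combos, limit):
        yield scaffold.format(**dict(zip(positions, combo)))


# R1 can only be H or methyl, while R2 can also be ethyl
scaffold = "c1ccc({R1})cc1{R2}"
allowed = {"R1": ["[H]", "C"], "R2": ["[H]", "C", "CC"]}

library = list(enumerate_library(scaffold, allowed, limit=4))
for smiles in library:
    print(smiles)
```

Because the combinations are generated lazily, the limit stops the enumeration early instead of building the full library first, which matters once the number of scaffolds and R-groups gets large.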
Fragmentation and Rebuilding node (or as separate nodes)
Something like BRICS but MCS based. Find common substructures (fragments) in a dataset, split up the molecules and recombine these fragments in different ways. Yeah, not very detailed. I guess the MCS part is already available, so it's more the recombination part.
If the generated fragments are made in the same form as BRICS fragments, I guess the rebuilding could be done by the same node for both BRICS and this.