MMP Molecule Fragment (RDKit) yields empty table

evert.homan_scilifelab.se · June 24, 2020, 4:13pm

Hi,

I'm a bit puzzled as to why the MMP Molecule Fragment (RDKit) node yields an empty table from approx. 7000 hERG inhibitors that I pulled from ChEMBL. The node chews on the data set for quite some time, gets stuck at 86%, then finishes… but without any results or error messages. All output is directed to the second port and states ‘No fragmentations generated’ for all compounds. I use the ‘non-functional group single bonds’ setting; the rest is default. As far as I know, all my structures are devoid of counter ions, unique, and have unique identifiers.

The version I have is 1.26.0.v202003171242 running on Windows. I noted that the node itself is plain yellow, i.e. it doesn’t have the red-green-blue diamond graphics, while the succeeding Fragments to MMPs node has this.

Grateful for feedback. Thanks/Evert

s.roughley · June 24, 2020, 4:25pm

Hi Evert,

That certainly sounds like strange behaviour - would you be able to share your workflow so I can take a look at what is happening?

Steve

evert.homan_scilifelab.se · June 24, 2020, 6:21pm

Sure thing, here it comes. I noted that the node is struggling for a long time at 48%, unsure why.

Thanks/Evert

MMPA_problem_200624.knwf (9.3 KB)

elsamuel · June 24, 2020, 7:59pm

Can you include the actual data?
As it stands the Table Reader node is looking for a table that exists on your computer.

evert.homan_scilifelab.se · June 25, 2020, 6:25am

Sorry, please try the attached table as input. Thanks/Evert

herg_MMPA.zip (1.1 MB)

s.roughley · July 2, 2020, 7:51am

Thanks @evert.homan - sorry, I dropped the ball with this one. The failure is because the number of cuts is set to 10:

If you change it to 1 (which I assume is what you wanted), then you get ~180,000 rows in the first output table:
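As a rough illustration of why the cut count matters so much for run time, here is a minimal sketch that only counts how many bond combinations there are to choose from. It assumes the work scales with the number of ways to pick `n_cuts` bonds out of a molecule's cuttable bonds; how the node really enumerates them, and which combinations count as valid fragmentations, is not stated in this thread. The function name and the bond counts (20 and 8) are made up for the example.

```python
from math import comb


def candidate_cut_sets(n_cuttable_bonds: int, n_cuts: int) -> int:
    """Number of ways to pick n_cuts bonds from n_cuttable_bonds.

    This is only an upper bound on the work per molecule (an assumption
    for illustration): it says nothing about which picks are valid.
    """
    return comb(n_cuttable_bonds, n_cuts)


# Hypothetical molecule with 20 cuttable single bonds.
for n_cuts in (1, 2, 3, 10):
    print(n_cuts, candidate_cut_sets(20, n_cuts))

# Hypothetical smaller molecule with only 8 cuttable bonds:
# there is no way to pick 10 of them, so comb() returns 0.
print(candidate_cut_sets(8, 10))
```

The count grows from 20 combinations at 1 cut to 184,756 at 10 cuts for the same hypothetical molecule. This does not by itself explain the empty table, but it is in line with the node spending a long time on individual rows when the cut count is high.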

A couple of points - when the node was hanging at 48%, if you look at the node view (right-click on the node and select View: Fragmentation Progress), you can see that the buffer of processed rows has filled up - this happens when there is a row which is taking a long time to process. In the example below, the pending queue is empty - i.e. all the rows processed have been added to the output table:

Also, you don't actually need the RDKit from Molecule node - the MMP Molecule Fragment (RDKit) node will do the conversion from any of the common formats for you directly.
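Going back to the progress view and the buffer of processed rows: a toy model may help show why one slow row makes the progress bar sit still. It assumes rows are processed in parallel but written to the output table in their original order, so finished rows have to wait behind an unfinished earlier one. That ordering behaviour is an assumption for illustration, and `replay` is a hypothetical helper, not part of the node.

```python
def replay(finish_order):
    """Replay rows finishing in the given order.

    Rows may only be written in index order (0, 1, 2, ...), so a finished
    row waits in a buffer until every earlier row has been written.
    Returns a list of (finished_row, rows_in_buffer, rows_written).
    """
    waiting = set()
    next_to_write = 0
    history = []
    for row in finish_order:
        waiting.add(row)
        # Flush every buffered row whose predecessors are all written.
        while next_to_write in waiting:
            waiting.remove(next_to_write)
            next_to_write += 1
        history.append((row, len(waiting), next_to_write))
    return history


# Row 1 is the slow one: rows 2-5 finish first and pile up behind it.
history = replay([0, 2, 3, 4, 5, 1])
for row, buffered, written in history:
    print(f"row {row} done: {buffered} buffered, {written} written")
```

In this model the written count stays at 1 while the buffer grows to 4, then jumps to 6 once the slow row finishes, which is the same pattern as a progress bar that hangs and then moves on.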

Hope that helps,

Steve

evert.homan_scilifelab.se · July 2, 2020, 11:19am

This certainly helps, I can now reproduce your first output of 180K rows, and feed it further into the workflow.

Much appreciated/Evert
