PDB connector returns Bad Request

Hello,

Firstly, I'd like to thank Vernalis for its nodes. They are very helpful, in particular the PDB connector node.

I'd also like to report an error I get with this very node. My query is:

<orgPdbQuery><queryType>org.pdb.query.simple.ChemSmilesQuery</queryType><smiles>[O-]C(=O)C([NH3+])Cc1ccc(cc1)OC(F)F</smiles><searchType>Similar</searchType><similarity>0.7</similarity><polymericType>Any</polymericType></orgPdbQuery>

The result count should be 92882, but the node fails: ERROR     PDB Connector     Execute failed: Bad Request

If I use a higher similarity:

<orgPdbQuery><queryType>org.pdb.query.simple.ChemSmilesQuery</queryType><smiles>[O-]C(=O)C([NH3+])Cc1ccc(cc1)OC(F)F</smiles><searchType>Similar</searchType><similarity>0.9</similarity><polymericType>Any</polymericType></orgPdbQuery>

I don't get the error, but there are 0 results...

So, I clearly understand that if there are too many results, the node can't return all of them, but you will certainly agree this shouldn't be a problem. There are more and more structures, and queries will tend to return more and more results.

Interestingly, I noticed this error with an old workflow I made 6 months ago. At that time, I did not get any error using the same input.

Do you know of a workaround that would let me avoid deleting some data?

Thank you

 

 

Nico,

Thanks for the query - glad you like the nodes!  There definitely shouldn't be any problem returning large numbers of results with the node.  I have just tried your query too, at 70% similarity, and it failed after about 50% with:

ERROR     PDB Connector     Execute failed: Read timed out

I will investigate further...

How far through the execution were you getting before the node failed?

Also, could I just check: does the old workflow from 6 months ago use the same query and fail now, whereas at the time it was OK with the same query, or have I misunderstood that detail?

Thanks

Steve

Hello Steve,

The node stays at 0% for 4-5 seconds, then goes to 30% and fails.

6 months ago, I used the same query and the same input, and the workflow worked. I can't say how many PDB codes it returned for this specific query because the PDB connector node is part of a loop. I can only say it did not fail.

Thank you

 

OK, thanks.  The 30% part of the progress is the progress allocated to returning and writing the first output table.  I will need to check exactly how the 2nd table is generated - it is a set of calls to the reporting webservice with the results list from the 1st table, as I remember it.

Which version of KNIME are you using with it?  It may be that a setting needs to be adjusted in the knime.ini file, or that a default behaviour has changed (I have a dim recollection of the behaviour relating to URL query requests changing in one of the recent KNIME versions), but I will need to check if that is the case, and also whether this node is using that mechanism - I will look into this more tomorrow.

Steve

 

Hello, I am using KNIME 2.7.4 on a 64-bit CentOS 6.4 machine.

Thank you.

OK.  I've looked into this some more, and I too see the same problem, and it is new!  The code for this node hasn't been changed since its initial release, prior to it becoming part of the community plugins, so I suspect that there is a problem with the PDB webservices it uses.

I'm checking with the developer who originally wrote the node at the moment, and I am also checking with the PDB - particularly as they have just retired their SOAP webservices (the node uses the RESTful webservice so shouldn't be affected by this!).

In the meantime, having looked into the source code, I can confirm the following:

  • There is no limit coded into the node as to the number of records to be returned - if you want, you can retrieve the entire PDB
  • The first ID list table is returned at the 30% executed status - so the query is working (also, this is how the result count is returned in the 'test query' button)
  • The second table is generated by repeated calls to the PDB report webservice - how many times this is called depends on the number of results (the query is sent as a URL of blocks of structure IDs) - the first call is working (I checked with some different queries returning fewer hits), but the next one is timing out - this is what I need to ask the RCSB PDB people about - I'll get back to you when I hear an answer from them
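
To make the block-wise reporting calls described above more concrete, here is a minimal Java sketch. It is not the node's actual code. It splits an ID list into blocks and builds one report URL per block. The endpoint and parameter names are taken from the customReport URL quoted later in this thread; the class name, the method name and the block size are assumptions made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReportUrlBlocks {
    // Endpoint as quoted later in this thread.
    static final String REPORT_URL = "http://www.rcsb.org/pdb/rest/customReport.xml";

    // Builds one report URL per block of at most blockSize structure IDs.
    // blockSize is an assumed parameter; the node's real block size is not stated here.
    static List<String> buildReportUrls(List<String> pdbIds, String columns, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<String> urls = new ArrayList<>();
        for (int start = 0; start < pdbIds.size(); start += blockSize) {
            int end = Math.min(start + blockSize, pdbIds.size());
            String ids = String.join(",", pdbIds.subList(start, end));
            urls.add(REPORT_URL + "?pdbids=" + ids + "&customReportColumns=" + columns);
        }
        return urls;
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("1stp", "2jef", "1cdg");
        // With a block size of 2, three IDs produce two URLs.
        for (String url : buildReportUrls(ids, "structureId,structureTitle", 2)) {
            System.out.println(url);
        }
    }
}
```

With a result list of tens of thousands of IDs, the number of calls grows with the list length, so a failure on any single block is enough to stop the second table from being built.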

In the meantime, I can confirm (experimentally determined!) that the node responds to the change in the knime.ini file recommended here:

http://tech.knime.org/forum/knime-general/xml-reader-times-out

Unfortunately, this is not fixing the problem at the moment.

I will get back to you as soon as I have any more answers, and thanks again for pointing this out to us.

 

Steve

PS - Do you know exactly when it stopped working?

 

Thank you, Steve, for all these details. I've tried the solution given in the thread you mention and I confirm it doesn't work.

About when it stopped working, I'm sure it worked in August, then I didn't use this workflow until now.

We have now added a fix to the nightly build (v1.0.4).  Assuming no new problems appear with this over the next few days we will add the same fix to the stable builds too.

The details of the fix are described in the node description - and shouldn't make any difference to the user (except that the node should be more robust to network and server issues)

Steve

Steve,

Unfortunately, despite your efforts, the node still doesn't work for this query:

<orgPdbQuery><queryType>org.pdb.query.simple.ChemSmilesQuery</queryType><smiles>COC(=O)C([NH3+])Cc1ccc(Cl)cc1</smiles><searchType>Similar</searchType><similarity>0.7</similarity><polymericType>Any</polymericType></orgPdbQuery>

Here are the details I found in the console:

WARN      KNIME-Worker-1 PdbConnectorNodeModel     GET request failed for data block 1 - Waiting 0 seconds before re-trying...
WARN      KNIME-Worker-1 PdbConnectorNodeModel     GET request failed for data block 1 - Waiting 1 seconds before re-trying...
WARN      KNIME-Worker-1 PdbConnectorNodeModel     GET request failed for data block 1 - Waiting 5 seconds before re-trying...
WARN      KNIME-Worker-1 PdbConnectorNodeModel     GET request failed for data block 1 - Waiting 10 seconds before re-trying...
WARN      KNIME-Worker-1 PdbConnectorNodeModel     GET request failed for data block 1 - Waiting 30 seconds before re-trying...
WARN      KNIME-Worker-1 PdbConnectorNodeModel     GET request failed for data block 1 - Waiting 60 seconds before re-trying...
WARN      KNIME-Worker-1 PdbConnectorNodeModel     GET request failed for data block 1 - Waiting 300 seconds before re-trying...
WARN      KNIME-Worker-1 PdbConnectorNodeModel     GET request failed for data block 1 - Waiting 600 seconds before re-trying...
ERROR     KNIME-Worker-1 PdbConnectorNodeModel     Unable to contact the remote server - please try again later!
ERROR     KNIME-Worker-1 PDB Connector     Execute failed: Unable to contact the remote server - please try again later!
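
The waits in the log above (0, 1, 5, 10, 30, 60, 300 and 600 seconds) suggest a fixed retry schedule. A rough sketch of that pattern follows. It is only an illustration reconstructed from the log messages; the names RetrySketch, Request and withRetries are hypothetical and are not taken from the PdbConnectorNodeModel source.

```java
import java.io.IOException;

public class RetrySketch {
    // Delay schedule as it appears in the log above, in seconds.
    static final int[] DELAYS_SECONDS = {0, 1, 5, 10, 30, 60, 300, 600};

    // Hypothetical stand-in for "fetch one data block".
    interface Request<T> {
        T get() throws IOException;
    }

    // Tries the request once per delay, sleeping after each failure,
    // then makes one final attempt before giving up.
    static <T> T withRetries(Request<T> request, int[] delaysSeconds)
            throws IOException, InterruptedException {
        for (int delay : delaysSeconds) {
            try {
                return request.get();
            } catch (IOException e) {
                System.err.println("GET request failed - Waiting " + delay
                        + " seconds before re-trying...");
                Thread.sleep(delay * 1000L);
            }
        }
        try {
            return request.get();
        } catch (IOException e) {
            throw new IOException(
                    "Unable to contact the remote server - please try again later!", e);
        }
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Fake request that fails twice and then succeeds; short delays keep the demo quick.
        String result = withRetries(() -> {
            if (calls[0]++ < 2) {
                throw new IOException("simulated timeout");
            }
            return "report block";
        }, new int[] {0, 0, 1});
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With the full schedule, a block that never succeeds keeps the node busy for roughly 1006 seconds of waiting (plus the request timeouts themselves) before the final error is raised.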
 

I really hope the PDB will change their parameters soon.

Thank you anyway.

Nico,

This is a long shot, but could you tell me what report options you have set?  I can run your query now with the modifications without problem with the report type set to 'Structure', so I wonder if something else is causing the timeouts.

Also, could you try running the custom reports query on this page:

http://www.rcsb.org/pdb/software/wsreport.do

(This calls the same reporting webservice at the PDB) - let's see if that works?

In the meantime, I have applied this patch across to the stable builds in the hope that it fixes problems for other users as it seems to do for us.

Thanks

Steve

Hello Steve.

As report options, I set Ligand ID, Ligand Name and Ligand Smiles.

I also tried with only Structure Summary but I get the same result...

Using the custom report web services, Step 1 and Step 2 work. I get a very loooonnnnnnngggggggg list of PDB ids, then  a very long query with the report field set on Structure.

Unfortunately I can't export the report...

I'd like to add some information about my settings. I use KNIME 2.9.0 and my internet connection goes through a proxy. I have added the proxy information in KNIME.

Thank you

Nicolas

Nicolas,

Do you know if your proxy settings have been changed since the node stopped working?  I'm not sure how the KNIME proxy settings will affect our node - I will look into this and see if I can find out anything.

Steve

Hello Steve,

In fact, we didn't have a proxy when the node worked. :s

Nicolas

Usually proxy usage is transparent to any code that uses basic Java functionality to open a connection (e.g. URL.openConnection).
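
One way to see which proxy plain Java code would pick for the report URL is to ask the JVM's default ProxySelector, as in the sketch below. The class and method names are made up for this example. Run standalone, it only reflects the JVM's own settings (for instance the http.proxyHost system property); it is an assumption, not something confirmed in this thread, that the result inside a running KNIME instance would match the KNIME proxy preferences.

```java
import java.net.Proxy;
import java.net.ProxySelector;
import java.net.URI;
import java.util.List;

public class ProxyCheck {
    // Returns the proxies the JVM would try, in order, for the given address.
    static List<Proxy> proxiesFor(String address) {
        return ProxySelector.getDefault().select(URI.create(address));
    }

    public static void main(String[] args) {
        String address = "http://www.rcsb.org/pdb/rest/customReport.xml";
        for (Proxy proxy : proxiesFor(address)) {
            // Prints DIRECT when no proxy applies, otherwise the proxy type and address.
            System.out.println(proxy);
        }
    }
}
```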

As far as I can see, your initial query is working (the node should be getting to 30% execution), and then failing when it retrieves the results.

The initial query runs using

URLConnection conn = url.openConnection();

whereas the reporting part uses:

HttpURLConnection conn = (HttpURLConnection)url.openConnection();

Maybe this is the source of the difference?

Steve

 

Is there anything I can do to confirm that this is the source of the problem?

The cast in the second case cannot be the root cause. Just because you are casting an object doesn't make it behave differently.
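
A small check of that point, written as a sketch with hypothetical class and method names: for an http URL, openConnection() already returns an HttpURLConnection instance, so the cast only changes the static type the compiler sees. No network traffic happens here, because the connection is not opened until connect() or a read is attempted.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class CastCheck {
    // Opens two (not yet connected) connections, one held as URLConnection and
    // one cast to HttpURLConnection, and compares their runtime classes.
    static boolean sameRuntimeClass(String address) throws IOException {
        URL url = new URL(address);
        URLConnection plain = url.openConnection();
        HttpURLConnection cast = (HttpURLConnection) url.openConnection();
        return plain instanceof HttpURLConnection
                && plain.getClass() == cast.getClass();
    }

    public static void main(String[] args) throws IOException {
        String address = "http://www.rcsb.org/pdb/rest/customReport.xml";
        System.out.println("Same runtime class: " + sameRuntimeClass(address));
    }
}
```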

I have added additional debug output to the warning message. Maybe you can re-try with the trunk/nightly build in about 15 minutes and post the output from the knime.log.

Thanks.

Maybe also you could try using an XML Reader node with the 'Selected file' set to:

http://www.rcsb.org/pdb/rest/customReport.xml?pdbids=1stp,2jef,1cdg&customReportColumns=structureId,structureTitle,experimentalTechnique

and see if that works (this is the same call on the same service, using the XML reader node) - that way hopefully we can see whether the problem is PDBConnector-specific, or more general in KNIME?

Steve

@Steve: I'm not sure I understand what you want me to do. I have saved the content of the link in an XML file and read it with the XML reader node. The content of the node is attached. It seems correct.

I have updated the Vernalis nodes using the nightly build. Here is the message from the console:

ERROR     PdbConnectorNodeModel              Unable to contact the remote server - please try again later!
DEBUG     PDB Connector                      reset
ERROR     PDB Connector                      Execute failed: Unable to contact the remote server - please try again later!
DEBUG     PDB Connector                      Execute failed: Unable to contact the remote server - please try again later!
java.io.IOException: Unable to contact the remote server - please try again later!
    at com.vernalis.pdbconnector.PdbConnectorNodeModel.execute(PdbConnectorNodeModel.java:232)
    at org.knime.core.node.NodeModel.execute(NodeModel.java:713)
    at org.knime.core.node.NodeModel.executeModel(NodeModel.java:556)
    at org.knime.core.node.Node.invokeNodeModelExecute(Node.java:1069)
    at org.knime.core.node.Node.execute(Node.java:924)
    at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:418)
    at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:98)
    at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:182)
    at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:113)
    at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:331)
    at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:207)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
    at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:238)

 

I hope this helps.

Sorry if I wasn't clear what I meant.  I meant you to point the xml reader node directly to the link, and see if the xml reader node worked.  I assume that you mean that you followed the link in a browser, and saved the xml file locally prior to pointing the xml reader at it?

Steve