PDB connector returns Bad Request

Yes, that is what I did!

However, when I set the node with the link, it works too.

Great!  Thanks - so it looks like it is something specific to the PDBConnector node - we'll keep looking and thinking, and hopefully get to a fix sometime soon.

Steve

OK - another update.  We think (I'm loath to say more than that after previous experiences!) that we have fixed the problem...  I have just updated the nightly build. Please could you check and see whether it works?

The details...

We believe the problem is likely to be that, to quote 'the internet', "URLs longer than around 2000 characters in length sometimes cause some proxy servers to choke".  When the node compiles the requests to the PDB reporting service, it builds up a URL containing as many 4-character structure IDs as it can fit into the URL, sends it to the PDB to get the report, and then repeats the process until all the results are processed (that is what is happening for the increments in progress between 30% and 100%).  We had this set to a maximum length of 8000 characters, which is just shy of the 8192 limit set by the remote server, but we have now reduced this to 2000, which appears to work, including via a test proxy server.  This does, however, mean that there will be more calls to the service for data results, which, particularly if they get checked by the robot blocker at the PDB, will almost certainly slow down the node execution for queries with many results.
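For anyone curious, the batching described above can be sketched roughly as follows. This is only an illustration in Python, not the node's actual code: the function name, the example base URL and the comma separator are all assumptions made for the sketch.

```python
from typing import Iterable, List


def build_batched_urls(base_url: str, ids: Iterable[str],
                       max_len: int = 2000, sep: str = ",") -> List[str]:
    """Greedily pack IDs into as few URLs as possible, each at most max_len long.

    base_url and sep are hypothetical; the real service may format IDs differently.
    A single ID that cannot fit even on its own still gets its own (over-long) URL.
    """
    urls: List[str] = []
    current: List[str] = []
    current_len = len(base_url)
    for pdb_id in ids:
        # A separator is only needed if the batch already holds an ID.
        extra = len(pdb_id) + (len(sep) if current else 0)
        if current and current_len + extra > max_len:
            # The batch is full: emit it and start a new one.
            urls.append(base_url + sep.join(current))
            current = []
            current_len = len(base_url)
            extra = len(pdb_id)
        current.append(pdb_id)
        current_len += extra
    if current:
        urls.append(base_url + sep.join(current))
    return urls


# Hypothetical example: 1000 four-character IDs against a made-up endpoint.
example_ids = [f"{i:04d}" for i in range(1000)]
example_base = "https://example.org/report?ids="
urls_2000 = build_batched_urls(example_base, example_ids, max_len=2000)
urls_8000 = build_batched_urls(example_base, example_ids, max_len=8000)
```

With the lower limit the same set of IDs is spread over more URLs, which is why the node now makes more calls to the service than before.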

Thanks

Steve

I think you have finally found a workaround! I don't have the problem anymore with my workflow!!!!

It is just slower than before, but I prefer it slower than broken.

Sincerely, thank you and thank the Vernalis team for resolving this problem.

Phew!  I'm glad we got there!

You might be able to speed it up a little by decreasing the URL timeout in the knime.ini file, as mentioned far up above. Most of the actual reads are pretty quick; the slow bit is the timeout if the server bounces the call to the service. If it is set higher, I would suggest setting it to around 2000 ms, but I think some experimentation will be required to balance the timeout against the delay waiting for another attempt if the call is abandoned too quickly.

To get into the technicalities again, we have asked the PDB whether they would consider opening up this service to POST requests (where we could send all the PDB IDs in a single call) as opposed to the current GET system, where we have to split them into batches limited by URL length.  If they do, it should be much quicker!
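To illustrate the difference, a POST request carries the IDs in the request body rather than in the URL, so the URL length limit no longer applies and one call is enough. The sketch below (Python, standard library only) just builds such a request without sending it; the endpoint URL and the `pdbids` field name are placeholders, not the PDB's actual interface.

```python
from typing import Iterable
from urllib.parse import urlencode
from urllib.request import Request


def build_post_request(service_url: str, ids: Iterable[str],
                       field: str = "pdbids") -> Request:
    """Build (but do not send) one POST request carrying every ID in its body.

    service_url and field are hypothetical placeholders for this sketch.
    """
    body = urlencode({field: ",".join(ids)}).encode("ascii")
    return Request(
        service_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical example: two IDs sent to a made-up endpoint in a single request.
req = build_post_request("https://example.org/report", ["1ABC", "2XYZ"])
```

However many IDs there are, the URL itself stays the same short string, so the batching by URL length is no longer needed.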

I will transfer this to the stable builds in a few days' time.

I should also extend thanks to the external developer who originally wrote this node, who was the person who finally cracked it last night.

Yours,

Steve

The updates are now applied to the stable builds for versions 2.8 & 2.9.

Steve

The node has been updated to use a new POST service, kindly introduced by the PDB following discussions with them on this topic.  See this post for further details.  The new faster service should be used by default with existing workflows.

Steve

Hello Steve,

I have just tried the new POST service and compared it with the GET service. In both cases I used this SMILES: [O-]C(=O)[C@@H]([NH3+])CCCNC(=[NH2+])NCCC=C

POST: 451 s

GET: 283 s

This is totally the opposite of what I expected. :(

Nicolas

That's strange - we've never seen it that way round here.  I just tried your query again here, and got the following:

POST:  106 seconds

GET: - Cancelled execution after ~ 1 hour

(Assuming your query was 70% similarity as above)

Steve

Maybe my proxy is the source of these results.

I have never had so many problems as since we have had to use this proxy.

Nicolas
