Since the EMBL-EBI nodes for extracting data from ChEMBLdb are no longer working, I was referred to the REST nodes. However, I find the examples provided on the example server too complicated. What I need is a simple workflow that collects all compounds (IDs, structures) that have been tested against a certain target, together with their measured activity, regardless of which assay was used (I can filter that out later).
Any help appreciated!
In this blog post we describe how to create a simple REST service to connect to ChEMBL and extract data using its API. The post also explains how to create a set of calls to access various data in ChEMBL. We published these in the workflow 50_Applications/30_RESTful_ChEMBL/01_ChEMBL_REST_Services on the EXAMPLES server. We used the ChEMBL API documentation to create these calls. It’s available here.
To extract compounds tested against a target, you need to perform a bioactivity call, which is chembl/api/data/activity. In the mentioned example workflow, the ChEMBL Bioactivity metanode makes this request. You can feed it a target ID and then extract up to 1000 activities at a time, which is a limit of the ChEMBL API. To get all bioactivities, you just need to wrap this request in a loop.
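Outside KNIME, a single bioactivity call can be sketched in Python as below. This is only an illustration, not the metanode's implementation: the base URL, the `target_chembl_id` and `limit` query parameters, the `activities` key and the field names are assumptions taken from the public ChEMBL web services and should be checked against the API documentation. `CHEMBL240` is a placeholder target ID, and `activity_url` and `extract_rows` are hypothetical helper names.

```python
from urllib.parse import urlencode

# Assumed public base URL for the activity resource (JSON output).
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data/activity.json"


def activity_url(target_id, limit=1000):
    """Build the URL for one page of activities measured against a target."""
    # Parameter names are assumptions; 1000 is the page limit mentioned above.
    query = urlencode({"target_chembl_id": target_id, "limit": limit})
    return f"{BASE_URL}?{query}"


def extract_rows(payload):
    """Keep only compound ID, structure and measured activity per record."""
    rows = []
    for act in payload.get("activities", []):
        rows.append({
            "molecule_chembl_id": act.get("molecule_chembl_id"),
            "canonical_smiles": act.get("canonical_smiles"),
            "standard_type": act.get("standard_type"),
            "standard_value": act.get("standard_value"),
            "standard_units": act.get("standard_units"),
        })
    return rows
```

The URL returned by `activity_url("CHEMBL240")` is the kind of string you would pass to a GET Request node; `extract_rows` corresponds to the JSON parsing done after it.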
Hope it helps.
Thanks for your reply. I read the blog before asking the question; it’s just that I don’t find this example so ‘simple’. Do I really need all these metanodes? And if I build this into a loop, how can I make sure it will not just pull up the same 1000 rows 10 times?
Apologies for that. I was referring to the beginning of the post, which describes how to define the URL in general and then use it in the GET Request node.
I think you don’t need all the metanodes from the example workflow. The easiest way is to adapt the available bits and pieces to your task: the Input ChEMBL ids wrapped metanode and the ChEMBL Bioactivity metanode.
First create a proper URL to get the count of the activities reported for your target. The link would look something like this if you are extracting JSON as the output:
It took some time to figure that one out by trial and error while studying the ChEMBL API and their Web Services specifications and examples.
Once you know the count, you know how many requests you need, and you can specify the offset (the number of rows to skip) in the URL used in the ChEMBL Bioactivity metanode for each of them. E.g. the link below skips the first 1000 rows and extracts the next 1000:
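As a rough sketch of the count step in Python: the example below assumes that the JSON response carries a `page_meta` object with a `total_count` field, and that `limit=1` is accepted to keep the response small. Both are assumptions about the ChEMBL web services rather than something stated in this thread, and `count_url` and `total_count` are hypothetical helper names.

```python
from urllib.parse import urlencode

# Assumed public base URL for the activity resource (JSON output).
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data/activity.json"


def count_url(target_id):
    """Build a URL that returns a single record plus the paging metadata."""
    # limit=1 is an assumption: only the metadata is needed here.
    query = urlencode({"target_chembl_id": target_id, "limit": 1})
    return f"{BASE_URL}?{query}"


def total_count(payload):
    """Read the number of activities from the (assumed) page_meta block."""
    return int(payload["page_meta"]["total_count"])
```

In a KNIME workflow the same value would be pulled out of the GET Request output with a JSON Path node and turned into a flow variable.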
This way you make sure to extract new information each time.
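The loop can be sketched in Python as follows, assuming the same base URL and the `limit` and `offset` query parameters (assumptions to verify against the ChEMBL documentation). Each request gets a different offset, so no page is fetched twice. `page_urls` and `fetch_all` are hypothetical names, and `fetch_all` needs network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public base URL for the activity resource (JSON output).
BASE_URL = "https://www.ebi.ac.uk/chembl/api/data/activity.json"


def page_urls(target_id, total, page_size=1000):
    """Yield one URL per page, stepping the offset by page_size each time."""
    for offset in range(0, total, page_size):
        query = urlencode({
            "target_chembl_id": target_id,
            "limit": page_size,
            "offset": offset,  # rows to skip before this page starts
        })
        yield f"{BASE_URL}?{query}"


def fetch_all(target_id, total):
    """Download every page and collect the activity records in one list."""
    activities = []
    for url in page_urls(target_id, total):
        with urlopen(url) as response:
            activities.extend(json.load(response)["activities"])
    return activities
```

For a count of 2500, `page_urls` yields three URLs with offsets 0, 1000 and 2000. In KNIME, the same list of offsets can drive a loop around the GET Request node.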