Dear KNIME community,
I have a problem, for which I currently find no solution.
We retrieve updates on records from a database provider and I want to extract the changed information only.
Unfortunately with the provided API call one will always receive the complete record, in this case on a drug, instead of only the changes.
The changes are noted in a section called changedField.
To get only the changes, I would need to extract the field names of the fields where changes happened and then retrieve the information from those fields. As far as I understand it, some sort of variable XPath/JSONPath would be needed.
I have never heard of anything like this.
Any idea from you how this could be handled?
Hi @And_Z , it’s difficult to give specific ideas without being able to see the actual structure of your json, as this may dictate the approach that I or others would take.
Would you be able to upload a couple of small (demo) examples of the JSON, ideally as a file? If you paste it as text into the forum, please be sure to highlight it all and press the “preformatted text” button so that the forum software doesn’t try to interpret/modify it.
That way people can then copy and paste it directly into KNIME to try things out and then assist you. Also, please tell us which version of KNIME you are using. Thanks
Hi,
my KNIME version is 5.3.2.
I cannot really share the full record, as it comes from a database provider.
What I can share is the following:
<changedField>
<item>DeliveryRoutes</item>
<item>DrugSupportingUrls</item>
<item>LatestChange</item>
<item>LatestChangeDate</item>
<item>MechanismsOfAction</item>
<item>Nce</item>
<item>Origin</item>
<item>Overview</item>
<item>PreClinical</item>
<item>TherapeuticClasses</item>
</changedField>
This is basically what you get for the changed fields; all of those fields then exist in the rest of the JSON, with detailed information on each.
From this I can extract the field names of the changed fields and run an API call to retrieve them for a given drug.
The problem is just that I do not know the resulting XML structure and therefore cannot retrieve the information in full.
So my question is whether there is a dynamic way to extract the information. Really any idea would help, and it can be independent of my example.
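One way to picture a "variable XPath" is to build the path from the names listed under changedField at run time. The following is only a minimal sketch in plain Python (it could, for example, be adapted to a Python Script node). It assumes that each changed field is a direct child of the record root and that the record root is called `drug`; both are assumptions, since the real structure was not shared, and the sample record is made up.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: apart from changedField/item, the element names and
# the layout are invented for illustration.
SAMPLE = """
<drug>
  <changedField>
    <item>Origin</item>
    <item>DeliveryRoutes</item>
  </changedField>
  <DrugName>Dihydrogenmonoxide</DrugName>
  <Origin>Natural product</Origin>
  <DeliveryRoutes>
    <item>Oral</item>
    <item>Intravenous</item>
  </DeliveryRoutes>
</drug>
"""


def extract_changed(xml_text):
    """Return {field name: serialized XML fragment} for every changed field."""
    root = ET.fromstring(xml_text)
    # Step 1: read the names of the changed fields.
    names = [
        item.text.strip()
        for item in root.findall("./changedField/item")
        if item.text
    ]
    changed = {}
    for name in names:
        # Step 2: build the path dynamically from the field name.
        # Assumption: the field is a direct child of the root element.
        node = root.find(f"./{name}")
        if node is not None:
            # Keep the whole subtree, since its inner structure is unknown.
            changed[name] = ET.tostring(node, encoding="unicode").strip()
    return changed


changed = extract_changed(SAMPLE)
for name, fragment in changed.items():
    print(name, "->", fragment)
```

Keeping each changed field as a whole XML fragment avoids having to know its inner structure up front. If the fields can sit deeper in the record, `.//` instead of `./` would search all levels, at the risk of matching the wrong element when names are not unique.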
Without seeing a complete JSON record, I’d imagine that you could do this by first retrieving all the fields, changed or not, with the JSON Path node, then splitting off the changedField values and feeding those into the second port of the Reference Column Filter node (Reference Column Filter – KNIME Community Hub).
The full JSON table would be fed into the first port of the node, and you can keep/remove only the changed columns.
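The same keep-only-the-referenced-columns idea can be sketched outside KNIME to show the logic. This is a rough illustration in plain Python, not the node's implementation. It assumes a flat JSON record in which changedField is a list of top-level key names; the record itself is invented.

```python
import json

# Hypothetical flat record; the real provider's JSON layout is not known.
RECORD_JSON = """{
  "DrugName": "Dihydrogenmonoxide",
  "Origin": "Natural product",
  "Nce": true,
  "Overview": "Example text",
  "changedField": ["Origin", "Nce"]
}"""


def keep_changed_columns(record):
    """Keep only the keys (columns) that are listed in changedField."""
    reference = set(record.get("changedField", []))
    return {key: value for key, value in record.items() if key in reference}


row = json.loads(RECORD_JSON)
filtered = keep_changed_columns(row)
print(filtered)
```

Here `reference` plays the role of the second input port and `record` the full table on the first port.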
Otherwise, could you retrieve one JSON record and post it here? It’s just a fancy text file, so you could manually remove sensitive data, like changing the drug name to Dihydrogenmonoxide?
(the other)
Simon
Yep,
I fear that is the only way; I had hoped there would be a better one, though.
Unfortunately it would not even detail the changes…
Well, I guess that’s the way to go.
Many thanks
Hi @And_Z , firstly can you confirm if this is json or xml. The example you have given above is xml, so is there also json involved or is this actually purely xml? Whilst it’s possible to convert one into the other for processing it is easier if we know exactly what is expected from the outset.
It’s difficult to approach this problem “generically” without an example because any solution would need to be able to handle your specific case and the complexity of the json/xml would almost certainly make a difference to any solution that could be provided, and the approach taken.
e.g.
- How many levels of nesting are there in the data, or is it a “flat” data structure?
- Are all the field/element names unique?
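Both questions can be answered mechanically for a given record. Below is a small sketch in plain Python, using a made-up record, that reports the nesting depth and any element names that occur more than once. The helper name `describe` is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def describe(xml_text):
    """Return (nesting depth, sorted list of element names used more than once)."""
    root = ET.fromstring(xml_text)

    def depth(node):
        # A leaf element counts as depth 1.
        return 1 + max((depth(child) for child in node), default=0)

    counts = Counter(element.tag for element in root.iter())
    duplicates = sorted(tag for tag, n in counts.items() if n > 1)
    return depth(root), duplicates


# Invented sample; replace with a real (anonymized) record.
SAMPLE = (
    "<drug><Origin>x</Origin>"
    "<DeliveryRoutes><item>Oral</item><item>IV</item></DeliveryRoutes></drug>"
)
levels, repeated = describe(SAMPLE)
print(levels, repeated)
```

If names such as `item` repeat, a lookup by name alone is ambiguous, and the path to each field has to include its parent.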
I would tend to approach this by trying to define the exact problem in terms of how it would be solved if you were having to do it manually. Can it be done? Do you have all the information you need?
If it is not possible to define the process in terms of the clear manual steps, it is unlikely (ignoring machine learning and AI) that we’d be able to find an automated solution.
If you find there are gaps requiring “magic”, it’s going to be even more challenging.
Without seeing or knowing your data, any proposals that we suggest on the forum would have to be based on made-up examples. Clearly you are best placed, based on what you know, to come up with a general example data set for the problem. If you can do that, it may be easier for others to assist.
So if you could post some general example data, even if it is simplified and independent of your actual data then that would help us to assist, as there is little point in other forum members trying to all think up their own example data that is totally different from your scenario. I hope that makes sense.