I have to pcroceed XML files. I found a topic in the forum where the XML file was converted to JSON with XML to JSON Node.with JSON Path Node I can extract my data of interest for one file. In this case for example the data ist in JSON array $[‘VehicleReport’][‘ComponentList’][‘Component’][43][’…’]. No I want to read in several files. My problem is, that the “[43]” index is dynamic so my data could be in index [39] or [67]…
I have a keyword that is on the top level of the nested JSON array. So my questions are
How can I search for the keyword in a JSON table an extract the index of the array?
How can I use this index my workflow found in JSON Path Node to extract my arrays I’m interested in?
The hub repository of @mw seems to contain various examples that deal with the unwrapping of XML and JSON files into tables. Maybe you try to further explore there also:
thank you for your reply. I will try to create some example data that I can share in the forum.
Until I can share some test data I will show you a screenshot of a XPath Node which shows the content of the XML in a better way.
As you can see the sctucture of my XML files they have variouse times the same node names “Component” in the XML Cell Preview. When I extract them the XPath Node give the XPathe Query a numerical index. This index for my keyword “VGSNAG3” can change in every XML file. So I need to know in which “Component” node I find my keyword “VGSNAG3” and then tell the XPath Node only to extract this XML node.
I hope you have now a better understanding of my problem.
Here are 3 sample XML files. Each file has included my keyword “VGSNAG3” but in each file at a different position/index. I hope you can show me a way how to find the different positions of my keyword. Sample1.xml (1.8 MB) Sample2.xml (490.2 KB) Sample3.xml (1.8 MB)
@Brotfahrer I built a demo workflow to show how you might approach this issue.
The key might be to use dynamic searches for certain values in the XML/JSON file and then continue the extraction. KNIME also offers nodes to sort of automatically extract JSON structures but they might run quite long and maybe produce results that can be hard to interpret.
OK so first you search for the cell and the path where your desired object “VGSNAG3” might be found. JSON Path should offer a broaf variety of search options. Unfortunately the sampel syntax with KNIME does not cover them to soem depth and I was not instantly able to adapt more complex search algorithms from other web sites.
You allow the search structure some flexibility. The art is later to only keep the information you want. You can extract precise values of cells, paths to later reuse or collection of these objects that would appear as lists (and you might later have to ungroup etc.).
Not: you also extract the sub-JSON block that would belong to the “ECUShortNames” as -well_ another JSON block you could deal with later.
Then you keep only those lines that have your ID “VGSNAG3”.
Then you can now approach your second (or more) JSON (sub-)object and extract the specific infgormation you want from the (presumably) ECUShortNames-block. These again can be single texts or lists or gaian nested onjects. Here you might try your hand at the other JSON nodes.
thank you for your screenshots! With your input I’m able to extract the data I need.
While it’s tedious, because of the nested “Additional Services” XML node I have to extract each of the outcoming 70 nodes with an XPath node, but it works. And that is the main thing