XML-like structure to groups and lists

Hi,

Firstly, I have to admit I’m a bit lost here… I’ve tried to pull out the settings and figured out a lot, but I have failed to keep the settings grouped together and the elements intact. I’m not that familiar with loops, but have used them a bit. This task seems a bit overwhelming and would require a lot of manual work, or some clever KNIMEing with loops running loops? I don’t know anymore :smiley: Please, Obi-Wan Kenobi…

I have configuration files from CA UIM (a system monitoring system), and I need to parse the settings that are in use.
They contain a lot of different elements, templates and defaults, which are then referenced in the configuration of the actual named entities (servers etc.).

Most of the files have this same structure:

<setup>
setting1 = x
setting2 = y
...
setting n = n
</setup>
<section A>
   something = a
   <subsection a1>
   ...
   </subsection a1>
</section A>
<section B>
   something = a
   <subsection b1>
   ...
   </subsection b1>
</section B>

What I would like is to have all those lines grouped by the different levels of sections, and then to filter some of them out.
Filtering would be best done by element, e.g. dropping “section B” completely from the list.

I tried to group them with “Rule Engine” => “Append Column: GroupID” => “Missing Value: append previous value to GroupID”, but it flattened the whole tree into a single level.
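
To keep every level instead of a single group ID, each line can be labelled with the full chain of sections that enclose it. Below is a minimal sketch of that idea outside KNIME, in Python. It assumes that every opening and closing tag sits alone on its own line, as in the examples in this post. The function name `label_lines` and the `/` path separator are made up for illustration.

```python
import re

OPEN = re.compile(r"^<([^/<>][^<>]*)>$")   # e.g. <section A>
CLOSE = re.compile(r"^</([^<>]+)>$")       # e.g. </section A>

def label_lines(lines):
    """Return (path, line) pairs; path names every enclosing section."""
    stack, labelled = [], []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        opened, closed = OPEN.match(line), CLOSE.match(line)
        if opened:
            stack.append(opened.group(1))      # go one level deeper
        elif closed:
            if stack:
                stack.pop()                    # back to the parent level
        else:
            labelled.append(("/".join(stack), line))
    return labelled

sample = """
<section A>
   something = a
   <subsection a1>
      other = b
   </subsection a1>
</section A>
""".splitlines()

for path, line in label_lines(sample):
    print(path, "|", line)
```

Splitting the path on `/` gives one column per level, so a whole element such as “section B” can be dropped with a single row filter on the first column.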

So what I figured was to group those “sections” or elements by their parent element.

Example of the conf file (I redacted it a bit):
Setup is just one root-level element; there are a lot of them.
The first thing would be to figure out how to group everything under each root-level element, right?
Then how do I handle each group so that it has subgroups under it, then split them up, filter the settings and gather them into lists?

I would like to be able to select any element and list its settings, so that I can separately list any of the templates (a group of elements with subsections) and their settings, and the elements that have used (referenced) them.

rsp.cfg:

<setup>
   loglevel = 2
   logsize = 50000
   plink = plink.exe
   wmi = NA
   threads = 200
   consumer_threads = 100
   db_keep_hrs = 2
...
   <qos>
      <cpu>
         name = QOS_CPU_USAGE
         group = QOS_MACHINE
         description = CPU Usage
         unit = Percent
         short = %
         hasmax = 1
         bool = 0
         dynamic_variables = used_pct,average
      </cpu>
      <cpu_multi>
         name = QOS_CPU_MULTI_USAGE
         group = QOS_MACHINE
         description = Individual CPU Usage
         unit = Percent
         short = %
         hasmax = 1
         bool = 0
         dynamic_variables = cpuid,average
      </cpu_multi>
      <disk_mb>
         name = QOS_DISK_USAGE
         group = QOS_MACHINE
         description = Disk Usage
         unit = Megabytes
         short = MB
         hasmax = 1
         bool = 0
         dynamic_variables = avg_value
      </disk_mb>
      <disk_pct>
         name = QOS_DISK_USAGE_PERC
         group = QOS_MACHINE
         description = Disk Usage (%)
         unit = Percent
         short = %
         hasmax = 1
         bool = 0
         dynamic_variables = avg_value
      </disk_pct>
      <memory>
         name = QOS_MEMORY_USAGE
         group = QOS_MACHINE
         description = Memory Usage
         unit = Megabytes
         short = MB
         hasmax = 1
         bool = 0
      </memory>
...

   </qos>
   <credentials>
   </credentials>
</setup>

Another example: a service under a server that’s being monitored. Here the root element is the server, then the processes, and then the actual process:

<servername>
  active = yes
 <processes>
         active = yes
         <PartialTransferService.exe>
            name = PartialTransferService.exe
            instance = yes
            qos_cpu_usage = no
            qos_instance = no
            qos_mem_usage = no
            qos_state = no
            qos_threads = no
            pid = 1160
            process_desc = Monitoring PartialTransferService.exe
            <process_owner>
               active = no
               level = 3
               msgtoken = process_owner
               threshold = NT AUTHORITY\SYSTEM
               condition = =
            </process_owner>
            <process_cpu_usage>
               active = no
               level = 3
               msgtoken = process_cpu_usage
               threshold = 0.000000
               condition = =
            </process_cpu_usage>
            <process_size>
               active = no
               level = 3
               msgtoken = process_size
               threshold = 872
               condition = =
            </process_size>
            <process_thread_count>
               active = no
               level = 3
               msgtoken = process_thread_count
               threshold = 3
               condition = =
            </process_thread_count>
            <process_instance>
               active = no
               level = 3
               msgtoken = process_instance
               threshold = 
               condition = =
            </process_instance>
            <process_up>
               active = no
               level = 3
               msgtoken = process_up
               threshold = 1
               condition = =
            </process_up>
            <process_down>
               active = yes
               level = 5
               msgtoken = process_down
               threshold = 0
               condition = =
            </process_down>
 </processes>
</servername>

Any advice would be amazing… I’m about to quit on this totally…

oh, the formatting is totally off… I need to fix those lines

Looking at my own post … :expressionless:

But to put it more simply:

In the last conf example, the server and that process are monitored (active = yes), but most of the states or responses for that particular process are disabled (active = no), so those I could ignore.
Then I would like to get all the servers with processes that are monitored, and those actions/states that are active, i.e. monitored.

The same goes for all the rest of the settings… I would think that the same data processing should then work for any group of settings?
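
As a rough illustration of that goal, here is a Python sketch that reads the tag and `key = value` format into nested dictionaries and then lists only the sections marked `active = yes`, skipping everything below a section marked `active = no`. The function names `parse_cfg` and `active_sections` and the shortened sample are assumptions for this example, not part of CA UIM or KNIME. Sibling sections with the same name would overwrite each other in this simple version.

```python
import re

TAG = re.compile(r"^<(/?)([^<>]+)>$")

def parse_cfg(text):
    """Parse the config text into {"settings": {...}, "children": {...}}."""
    root = {"settings": {}, "children": {}}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        tag = TAG.match(line)
        if tag and tag.group(1):                 # closing tag
            if len(stack) > 1:
                stack.pop()
        elif tag:                                # opening tag
            node = {"settings": {}, "children": {}}
            stack[-1]["children"][tag.group(2)] = node
            stack.append(node)
        elif "=" in line:                        # key = value
            key, _, value = line.partition("=")
            stack[-1]["settings"][key.strip()] = value.strip()
    return root

def active_sections(node, path=()):
    """Yield paths of sections with active = yes; skip disabled subtrees."""
    for name, child in node["children"].items():
        state = child["settings"].get("active")
        if state == "no":
            continue
        if state == "yes":
            yield path + (name,)
        yield from active_sections(child, path + (name,))

sample = """
<servername>
   active = yes
   <processes>
      active = yes
      <svc.exe>
         name = svc.exe
         <process_up>
            active = no
            level = 3
         </process_up>
         <process_down>
            active = yes
            level = 5
         </process_down>
      </svc.exe>
   </processes>
</servername>
"""

tree = parse_cfg(sample)
for section in active_sections(tree):
    print(" > ".join(section))
```

Sections without an `active` setting (like the process itself here) are not listed but are still searched, so active states below them are found.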

Hi,
well, I was wondering how to use the cascaded sections with indented subsections… and noticed why my output was so flat.
I had selected “ignore spaces and tabs” in the File Reader! That’s why it grouped everything at the same level.
Now I’m trying again to make some progress in grouping them a little better.

Grouped by RootID, selecting “First” to get a name/ header for the group

Sure, the closing-tag groups could be filtered out, since they have only that one line anyway, but you could count the lines in each group by subtracting the GroupID values :smiley:

Hi @miku ,

Have you tried to convert it to XML and then parse it with the XPath node?

:blush:

Hi,

Yes, I did! But I didn’t know how to combine them.

All I managed to do was to use “Column to XML”, and then I had separate XMLs at different levels of that structure.

Like this.

But I would like to get those values under the parent, this time “<setup>” … and then repeat that at the different levels.

XML Row Combiner

Yes! I tried that one also, but by default it just adds those columns to the same level, like this:

But when I tried to put “setup” as the name / parent for the “values” xml, it gives me an error:
“ERROR XML Column Combiner 3:121:104 Execute failed: java.io.IOException: org.xml.sax.SAXParseException; lineNumber: 2; columnNumber: 2; The markup in the document preceding the root element must be well-formed.”

When trying like this:

I thought I could use this method to make, in this case, “setup” the parent element for the “values”.

The data has multiple levels, so this could be a winning solution… if I could just figure it out :smiley:

Well. Let’s finish this task. Provide me with a sample input and your desired output table (regarding the input).

If I know what exactly the input is and output should be I can create the workflow for you.

:blush:

Wow! Okay, well… I don’t know if this is just a total waste of your time, as I’m still trying to figure out how to present it, how to filter it and so on, but here we go :smile:

Here’s a part of that rsp.cfg: the “setup” element and everything under it. It has multiple levels, as does the rest of the file.

sample.zip (1.4 KB)

As for the output, I was going to decide on that part once (and if) I had the data completely in XML format.

But I would like to get the file into an intermediate XML format first, like:

 <setup>
    <key>value</key>
    <1st_level_sub>
        <key>value</key>
            ...
        <2nd_level_sub>
            <key>value</key>
            ...
        </2nd_level_sub>
    </1st_level_sub>
</setup>
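
One possible way to reach that intermediate format is a single regular expression that turns every `key = value` line into a `<key>value</key>` element and leaves the tag lines alone. The Python sketch below assumes that keys and section names are already valid XML names and that tags sit on their own lines; `cfg_to_xml` is a hypothetical helper name.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# indent, key (no '=', '<' or '>'), then everything after the first '='
KV = re.compile(r"^(\s*)([^=<>\s][^=<>]*?)\s*=\s*(.*?)\s*$")

def cfg_to_xml(text):
    """Rewrite 'key = value' lines as <key>value</key>; keep tag lines."""
    out = []
    for line in text.splitlines():
        m = KV.match(line)
        if m:
            indent, key, value = m.groups()
            # escape() protects &, < and > inside values
            out.append(f"{indent}<{key}>{escape(value)}</{key}>")
        else:
            out.append(line)
    return "\n".join(out)

sample = """<setup>
   loglevel = 2
   <qos>
      <cpu>
         name = QOS_CPU_USAGE
         short = %
      </cpu>
   </qos>
</setup>"""

xml_text = cfg_to_xml(sample)
root = ET.fromstring(xml_text)          # fails if the result is not well-formed
print(root.findtext("qos/cpu/name"))
```

Parsing the result right away is a cheap check: any mismatched or misspelled tag in the source file shows up as a parse error with a line number.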

Then I would like to get some data out of it using XPath, as you suggested. That would give me the flexibility to filter the levels, elements and values I need from the XML file. There are a lot of sub-level settings that could be filtered out because they are not active (active = no).
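
To show what such an XPath filter could look like once the data is XML, here is a small Python sketch using the standard library’s `xml.etree.ElementTree`, which supports a subset of XPath. The XML snippet is a shortened, hand-written stand-in for the converted file, not real output.

```python
import xml.etree.ElementTree as ET

xml_text = """<servername>
  <active>yes</active>
  <processes>
    <active>yes</active>
    <process_up><active>no</active><level>3</level></process_up>
    <process_down><active>yes</active><level>5</level></process_down>
  </processes>
</servername>"""

root = ET.fromstring(xml_text)

# All descendants of the root that have a child <active> with text "yes".
# The root element itself is not part of the ".//*" result.
active = root.findall(".//*[active='yes']")

for el in active:
    # Collect only the plain settings (children without children of their own)
    settings = {child.tag: child.text for child in el if len(child) == 0}
    print(el.tag, settings)
```

This predicate only looks at each element’s own `active` value, so an active element below an inactive parent would still be returned and would need a second filter.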

Just a table would be fine. I think I would pivot the elements into columns… but I didn’t get that far. I wanted to see the data a bit more clearly first.
Anyway, that seemed like a good idea in order to get fewer rows and to group the settings based on the elements. I guess I would make a sheet/page per root element, or per 1st- and 2nd-level element?

I’m really not sure. Any suggestions? What would make the most sense in your opinion?

A thousand thanks for all this, and for any advice (if you still have the energy to help out)!

miku

Just posting what I’m trying to do… or how far I got on my own. :smiley:

I’ve got the key-value pairs into XML, and I have their groups: the level 2 IDs, the level 1 GroupIDs and the root.

Next I would take the column with the key of each level 2 group and send it to a loop, where I would put the group name into a variable and use it in the XML Row Combiner’s root element name field.
I tested it a little and it seemed to work… Then I would have to figure out how to combine levels 1 and 2, and then those into level 0 (the root)? :expressionless:

It’s fun, but slow …

I transformed the file content to XML. Note that there was an error in the “Service_account” start and end tags, so I modified the file before reading it. If this is a routine error, let me know and I will modify the workflow to fix it automatically.

cfg_to_xml.knwf (26.7 KB)

:blush:

Hi, YES!

It’s much simpler than I thought! I tried to do everything with KNIME nodes, but that regex approach is much easier :smiley:

I had some problems with the whole data set: some element names started with a number or had special characters in them, so the XML would not validate, but I managed to fix them.
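
The post does not say which characters were involved or how they were fixed. As one conservative possibility, the Python sketch below replaces everything outside a safe ASCII set with an underscore and prefixes names that do not start with a letter or underscore. `xml_safe_name` is a hypothetical helper; note that the mapping is lossy, so two different original names can end up identical.

```python
import re

def xml_safe_name(name):
    """Return an ASCII-only name that is valid as an XML element name."""
    # Keep letters, digits, underscore, dot and hyphen; replace the rest
    cleaned = re.sub(r"[^A-Za-z0-9_.-]", "_", name.strip())
    # An XML name must not start with a digit, dot or hyphen
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned

for original in ["1st_level_sub", "section A", "PartialTransferService.exe"]:
    print(original, "->", xml_safe_name(original))
```

The same function has to be applied to the opening and the closing tag of a section so that both still match after renaming.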

Now, one more question… the sample had “setup” as its root, but in the whole document there are other elements on the same level, so validation of the whole data fails because there can be only one root element.

What’s the easiest way to add a root element?
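
One simple option, sketched here in Python, is to wrap the whole converted text in an extra pair of tags before parsing it. The wrapper name `config` is an arbitrary choice, and the sketch assumes the text does not begin with an XML declaration (`<?xml ...?>`), which would have to stay in front of the new root.

```python
import xml.etree.ElementTree as ET

def wrap_in_root(xml_fragments, root_name="config"):
    """Enclose several top-level elements in one artificial root element."""
    return f"<{root_name}>\n{xml_fragments}\n</{root_name}>"

fragments = """<setup>
  <loglevel>2</loglevel>
</setup>
<servername>
  <active>yes</active>
</servername>"""

doc = ET.fromstring(wrap_in_root(fragments))
print([child.tag for child in doc])
```

The former root-level elements become children of the wrapper, so XPath queries only need one extra step at the front (or a leading `//`).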

oh, well… I got it.

Thanks @armingrudd !! I owe you a drink or two!

M

I’m glad I could help.

:blush:
