Array values to columns and assign values from second array

After reading an XML file, I am faced with the problem of having to restructure data because the XML file standard has changed, but the database structure cannot change with it.

This is a real example from the music industry. Each song has several identification numbers, which have to be archived in a cleanly structured way together with the title, artist and producer.

After reading the XML file, the structure looks like this:

The ID_TYP_KNZ column contains an array with the names of the individual identifiers and the ID_WERT column contains a second array with the corresponding identification numbers.

Finally, the structure should look like this:

The values from the first array, the TYP_ARRAY, are used as column headings and the values from the second array, the VALUE_ARRAY, are assigned to the respective TYP columns.

But it gets a little more complicated. Unfortunately, the TYP_ARRAYs are not stringent. The order of the individual identifiers varies and almost all identifiers are never supplied. The values of the TYP_ARRAYs must therefore be carefully compared with the columns. If TYP_ARRAY[2] is matched to the second column, as can be seen in the fifth and sixth row with EAN_UPC, for example, VALUE_ARRAY[2] must be written to the cell in the second column.

What does this workflow look like in KNIME?

Could you please share the starting table in a workable format, preferably the actual workflow (alternatively through a table reader ) to maintain the data types. I’m afraid you won’t find a lot of people willing to help if they have to re-create that themselves😉

1 Like

Hey @ArjenEX , you’re right! I hadn’t even thought about it. Here is the table with the first 90 lines

dataTable_organizeIdentifier.xlsx (13.7 KB)

I found a way to solve the problem, but no time to publish it yet. What I would like to do to complete this post.
So this comment is just to make sure the ticket doesn’t get closed.

1 Like