Compare 2 Excel tables- node like "Pivot"?

Hi *,
recently i started working with KNIME and I try to compare 2 Excel-Sheets (old & new planning).

  • Same coloum-structure (incl. header)
  • different number of rows
  • varoius articles in certrain groups have diffeerent amounts in several (coloums)

In Excel I would have given each of the tables a date-attibute, copied them together and use the pivot-function to compare the tables.
As a result i like to know the differences between the tables:

  • which articles were in the last version and have “vanished” in the new one,
  • which new articles are in the new tabel
  • how did the amounts change

==> Is there a easy “one node” solution in Knime or do i have to use different joins (for new entrants …) and math engine (for calculating differences) after joining the tables?

==> Can Knime extract the excel folder name? (Different Folders in one .xls - each named “jan” “Feb” Mar" Apr"…?)

Thanks a lot in advance for your help and please give me a hint if that question already was answered…

Best regards,
fff01

Hi @fff01,
For 1 and 2 I would suggest you use the Reference Row Filter node. The third task is a bit more complicated, but you can also use KNIME’s Pivoting node in the same way you suggested for Excel. Just append a column to each of your tables using the Constant Value Column , then use Concatenate to put both into one table, then pivot accordingly.
Kind regards
Alexander

1 Like

wow - thanks for the quick reply! I am impressed.

1 Like

Hi Alexander,
thanks for the quick reply.
I tried the solution, but encounter the problem, that KNIME doesn’t complete the pivoting.
it gets stuck at 43%. What could be the reason?

Thanks in advance & Best regards,
fff01

Hi @fff01,
I never had the issue before, that is strange. How large is your data? Could you send me the output of your KNIME log (View -> Open KNIME log). Maybe we find something there. If not, it would be really helpful if you could send me your workflow (if the data in it is not confidential).
Kind regards
Alexander

1 Like

Hi Alexander,
thanks for Your help. (un)fortunately Christmas holydays gave a break to my task.

So first “a healthy & successfull new year” to you!

And then an answer to your question: the console reads
“ERROR Pivoting 0:34 Execute failed: GC overhead limit exceeded”.

In the “Memory policy” i chose “keep only small data in Memory”.
The tables I try to pivot have ~3.000 lines.

What could I do to reduce the data?

Thanks a lot in advance & best regards,
fff01

Hi,
3000 rows should generally be no problem at all. I do that all the time. Have you selected the appropriate columns in the Group, Pivot and Aggregation tabs in the Pivot node’s dialog? If there are too many values selected, this might cause the issue (though even that would be strange for only 3000 rows). The error indicates that there was not enough main memory for the processing. Have you tried increasing the memory available to KNIME? The first point in this blog post might help you with that.
Kind regards
Alexander

Thanks! I’ll try the Group / Pivot & Aggregation…

Hi Alexander,
Thanks once more for Your quick, helpfull advice!
My setting for “aggregation” was wrong - now it works.
:slight_smile:

Best regards,
fff01

1 Like

Hi,
Glad it worked out! And a happy new year to you, too!
Alexander

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.