I have two structurally identical tables (let’s call them old and new), both created by the same KNIME workflow but with different input data. They both contain an unique ID column. First, I do a join on this ID and split the data into: IDs which appear in both, and IDs which appear only in the new/old table.
Now I want to compare the rows with identical IDs. They might be identical or they might differ in any or multiple columns. If there is any difference, I want to use the row from the new table. If they are identical, I am not interested in the row. How can I do this?
The rows do contain missing values. Concatenating every cell in a row to a string and comparing that might not be sufficient because of missing vales.
The only way I could find is using a rule-based row splitter which contains a line such as:
NOT (MISSING $column2$ AND MISSING $column2--alt$) AND (NOT ($column2$ = $column2--alt$)) => TRUE
for every column. My table has >50 columns, this is slightly unpractical.
I believe the solution you have described in your message is the right way to go. As you mentioned, one can compare full rows at once by just concatenating them and then comparing the “old” and “new” concatenated strings. The missing values should not be a problem if you make sure that you check the “missing option” when concatenating so that the missing values are also taken into account.
If you do not get it to work, please upload here an example with data so that we can help you from there.
I used the column aggregator to combine the rows. Only problem is, that missing values are represented as ?. But the Data itself might contain ? as string. e.g. there is a column which can either be empty=missing, or a single char while L#?! are the most common chars used. Any Ideas? Can I have missing values represented by something else?
Indeed @masgo, you are providing the solution again lol !
In your case, I would first replace any missing value by a very improbable string, for instance a combination of weird signs such as “%*µ%§@$£” which is certainly known to be impossible to find as value in any of your Table cells and then do the concatenation. This can be done using the -Missing Value- node with a “constant value” configuration equal to the value “%*µ%§@$£”, for instance. This should do the trick
Hi, @masgo .
Have you seen TableDIFF – KNIME Hub?
This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.