Merge sorted tables

I have two tables which are sorted. I would like to merge them using one of the columns as the merge comparison value. This does not seem to be an option in Knime. Is there any way to do this?

Thanks!

I assume, you have already tried the Joiner node and it didn’t really fit?

Do you mean Concatenate and Sorter?

I have 100k lines in each of two tables, both with timestamps. They are already sorted. I would like to merge the two tables, Combinator-style, but with the files interleaved by timestamp. It is the merge phase of a sort-merge algorithm.

Here is an example:
File 1:
2009-10-15T00:00:01Z,queryResultCache
2009-10-15T00:00:03Z,queryResultCache
2009-10-15T00:00:05Z,queryResultCache

File 2:
timestamp,elapsed
2009-10-15T00:00:01Z,5
2009-10-15T00:00:02Z,4
2009-10-15T00:00:04Z,7

Joiner (Full Outer Join) output:
“row ID”,“timestamp”,“elapsed”,“cache”
“2009-10-15T00:00:01Z”,“2009-10-15T00:00:01Z”,5,“queryResultCache”
“2009-10-15T00:00:02Z”,“2009-10-15T00:00:02Z”,4,
“2009-10-15T00:00:04Z”,“2009-10-15T00:00:04Z”,7,
“2009-10-15T00:00:03Z”,“queryResultCache”
“2009-10-15T00:00:05Z”,“queryResultCache”

Both file outputs have RowID nodes to copy the ‘timestamp’ field to the RowID. (Note that the rows unique to the second table have the timestamp ripped out. This is fixable but still seems wrong.)

Note that the table inputs to the join are sorted, but the joined output is not interleaved by the RowID values. This also happens using integer values for the RowID.

Most of my samples are irregularly sampled over time. Another way to solve this would be a ‘time-based resampler’: it would average the irregular samples to a fixed term interval. With this, I would average both tables to a 1-second interval and join that. I would actually prefer this for other reasons.

Again, thanks for your time and attention.

Jack

Hello,

have a similar problem here. Lots of samples (time, float1, float2) in two different files. I would like to merge and sort so that they can be displayed in the same graph as one line.

Thank you in advance

Steffen

As Jack says the current Joiner merges on RowIDs (at least that’s the join column from the second input). Note if you merge time stamp columns with row IDs, you mix types (time stamp vs. strings), so the output could be – and obviously is – undetermined (the times are broken down to strings, which could be a different string than you see in a table view, for instance). If you have both time stamps as row ID, you should be save.

We are currently developing a new Joiner, which does not have the RowID limitation. It also has all sorts of other gimmicks, including composite keys. This node will come with v2.2.

Hope this helps,
Bernd

What about merging 2 tables that have multiple columns BUT some rows have the same unique ID (i.e. row 3)?  I can do this with a full outer join followed by multiple column mergers but is there a simpler solution?

Table 1

Unique ID      Value1     Value2

1                       ABC         DEF

2                       GHI          JKL

3                       MNO        PQR

Table 2

Unique ID      Value1     Value2

3                       MNO        PQR

4                       STU         VWX

5                       YZA         BCD