Data mapping of CSV files with different timestamp information

Dom321 · July 28, 2018, 9:16am

Hi everybody,

I am quite new to KNIME.
As a first use case, I want to merge the data from two CSV files that have different timestamps into one file under a single timestamp.

Does anybody know which node is the right one and how to configure it?

Thank you very much & best regards
Dom

qqilihq · July 28, 2018, 2:51pm

Well, if the timestamps do not line up, you’ll have to round them before joining the tables. (You’ll probably need to experiment to find the best rounding setup; e.g., rounding/binning to half-minutes might make sense.)

You’ll need some processing steps. To start, I’d probably try to convert the date to a timestamp, and then use the Math Formula node for rounding, then join using the Joiner.
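Outside KNIME, the round-then-join idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 30-second step, the helper names `round_to_step` and `join_on_rounded`, and the sample rows are all made up and are not part of any KNIME node.

```python
from datetime import datetime, timedelta

def round_to_step(ts: datetime, step_seconds: int = 30) -> datetime:
    """Round a timestamp to the nearest multiple of step_seconds since midnight."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = (ts - midnight).total_seconds()
    rounded = int(elapsed / step_seconds + 0.5) * step_seconds
    return midnight + timedelta(seconds=rounded)

def join_on_rounded(table_a, table_b, step_seconds=30):
    """Inner-join two lists of (timestamp, value) rows on the rounded timestamp.

    If several rows of table_b fall into the same bin, the last one wins.
    """
    lookup = {round_to_step(ts, step_seconds): value for ts, value in table_b}
    joined = []
    for ts, value in table_a:
        key = round_to_step(ts, step_seconds)
        if key in lookup:
            joined.append((key, value, lookup[key]))
    return joined

# Made-up sample rows whose timestamps are a few seconds apart.
table_a = [(datetime(2018, 7, 28, 9, 0, 2), 1.0),
           (datetime(2018, 7, 28, 9, 0, 31), 2.0)]
table_b = [(datetime(2018, 7, 28, 9, 0, 4), "x"),
           (datetime(2018, 7, 28, 9, 0, 28), "y")]

joined = join_on_rounded(table_a, table_b)
for row in joined:
    print(row)
```

With a 30-second step, both pairs of rows land in the same bin (09:00:00 and 09:00:30) and are joined. A step that is too small leaves rows unmatched, and one that is too large merges rows that do not belong together, which is why the rounding setup needs some experimentation.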

[Edit] Oh! There’s actually a node from @muthmann for that which should do exactly what you’re looking for:

https://nodepit.com/node/de.cyface.timestamp.TimestampAlignerNodeFactory

Dom321 · July 29, 2018, 7:13am

Thanks a lot, that node sounds perfect. I have installed NodePit for KNIME 3.5, but I couldn’t find the Timestamp Aligner in the Node Repository. Do you have any idea what the issue could be?

qqilihq · July 29, 2018, 9:19am

You’ll have to install the corresponding nodes (i.e. the Cyface Nodes) explicitly. The NodePit plugin provides “just” the search functionality on NodePit.com.

On the Timestamp Aligner page scroll down and you’ll find a link to the update site. Click the button beside the link and the installation process should start.

Iris · July 29, 2018, 9:34am

Hi all,
this seems like a really nice node idea!

I just played with it, and it seems it is still under development: it did not assign a time from the second table to every time from the first table. The last row was skipped.

Best, Iris

qqilihq · July 29, 2018, 10:45am

I’m sure @muthmann will be eager to fix this

muthmann · August 6, 2018, 6:43am

@Iris Sorry for the delay; I was on holiday without internet access. Do you have some example data or a screenshot of what you mean by “skipped”? Then I could probably fix the issue you mentioned.

Iris · August 6, 2018, 6:57am

Sorry, I already uninstalled it.

The input table has one more row than the output table. Let me know if you can’t reproduce it; then I can make you a workflow.
Best, Iris

beginner · August 6, 2018, 9:13am

If both files have the same number of records, simply sort on the timestamp column (maybe not even needed if they are already sorted, which looks to be the case) and then join on RowID. (Note: if you sort in KNIME, you will need to create new RowIDs for both tables.)
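As a rough illustration of the sort-then-join-on-RowID idea outside KNIME, here is a Python sketch. The function name `join_by_position` and the sample rows are made up; it assumes both tables have the same number of rows and that the n-th row of one table really belongs to the n-th row of the other.

```python
def join_by_position(table_a, table_b):
    """Sort both tables by timestamp, then pair rows by their new position."""
    if len(table_a) != len(table_b):
        raise ValueError("tables must have the same number of rows")
    sorted_a = sorted(table_a, key=lambda row: row[0])
    sorted_b = sorted(table_b, key=lambda row: row[0])
    # Fresh row IDs are generated after sorting, like resetting RowIDs in KNIME.
    return [
        (f"Row{i}", a_ts, a_val, b_ts, b_val)
        for i, ((a_ts, a_val), (b_ts, b_val)) in enumerate(zip(sorted_a, sorted_b))
    ]

# Made-up sample rows; zero-padded time strings sort correctly as text.
table_a = [("09:00:31", 2.0), ("09:00:02", 1.0)]
table_b = [("09:00:04", "x"), ("09:00:28", "y")]

paired = join_by_position(table_a, table_b)
for row in paired:
    print(row)
```

This only works when the row counts match; a single missing record in either file shifts every later pairing.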

muthmann · August 9, 2018, 9:14am

Hey,

Thanks for your feedback.

I just tried the following:
Table 1 (High Frequency Data):

RowID	High-Freq-Data
Row0	1
Row1	2
Row2	3
Row3	4
Row4	5
Row5	6
Row6	7
Row7	8
Row8	9

Table 2 (Low-Frequency-Data):

RowID	Low-Freq-Data
Row0	1
Row1	5
Row2	7
Row3	12

Results (as expected) in:

RowID	High-Freq-Data	Low-Freq-Data
Row0	1	1
Row1	2	1
Row2	3	1
Row3	4	1
Row4	5	5
Row5	6	5
Row6	7	7
Row7	8	7
Row8	9	7

In a second variant, I changed the high-frequency data to:

RowID	High-Freq-Data
Row0	1
Row1	2
Row2	3
Row3	4
Row4	5
Row5	6
Row6	7
Row7	8
Row8	9
Row9	10
Row10	11
Row11	12
Row12	13
Row13	14
Row14	15

and got as a result (also as expected):

RowID	High-Freq-Data	Low-Freq-Data
Row0	1	1
Row1	2	1
Row2	3	1
Row3	4	1
Row4	5	5
Row5	6	5
Row6	7	7
Row7	8	7
Row8	9	7
Row9	10	7
Row10	11	7
Row11	12	12
Row12	13	12
Row13	14	12
Row14	15	12

Am I doing anything different here than what you did? Do you expect the node to behave differently?
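For readers without the node installed, the "closest previous" matching shown in the tables above can be approximated in plain Python. This is a sketch of the general technique, not the node's actual implementation; the function name `align_closest_previous` is made up, and the integers stand in for timestamps.

```python
from bisect import bisect_right

def align_closest_previous(high, low):
    """For each high-frequency value, pick the largest low-frequency value <= it.

    Returns None as the partner when no such value exists (an assumption of
    this sketch, not confirmed behaviour of the node).
    """
    low_sorted = sorted(low)
    result = []
    for value in high:
        idx = bisect_right(low_sorted, value)  # number of low values <= value
        result.append((value, low_sorted[idx - 1] if idx > 0 else None))
    return result

low = [1, 5, 7, 12]
aligned = align_closest_previous(list(range(1, 10)), low)
for high_value, low_value in aligned:
    print(high_value, low_value)
```

Run on the first variant (high-frequency values 1 to 9), this yields the same pairs as the first result table; with values 1 to 15 it matches the second one, including the switch to 12 from row 12 onwards.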

Iris · August 10, 2018, 6:52am

I missed the “closest previous” terminology, so for some of my times there was no previous value. I would guess the output should then be a missing value rather than a missing input row.
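The difference between the two behaviours can be shown with a small Python sketch. Everything here is hypothetical: the `keep_unmatched` flag and the sample values are made up for illustration and do not describe an option of the Timestamp Aligner node.

```python
from bisect import bisect_right

def align_previous(high, low, keep_unmatched=True):
    """Closest-previous alignment with two ways of handling unmatched rows.

    keep_unmatched=True  -> keep the row and use None as a missing value
    keep_unmatched=False -> drop the row, so the output is shorter than the input
    """
    low_sorted = sorted(low)
    rows = []
    for value in high:
        idx = bisect_right(low_sorted, value)
        if idx > 0:
            rows.append((value, low_sorted[idx - 1]))
        elif keep_unmatched:
            rows.append((value, None))
    return rows

high = [0, 1, 2, 5]  # 0 has no previous value in `low`
low = [1, 5]

with_missing = align_previous(high, low, keep_unmatched=True)
dropped = align_previous(high, low, keep_unmatched=False)
print(with_missing)
print(dropped)
```

Keeping the row with a missing value preserves the input row count, which makes it easier to spot that a time had no earlier partner.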