bring different data sets to the same length

Hello all,

I have multiple data sets, each with a different number of data points. However, they should actually all have the same length.

Since I am interested in the course of the data, it is not possible to simply shorten or extend the data with NaNs.

I am looking for a way to either stretch or compress the data sets so that they all have the same length, without changing the course of the data.

Is this possible in KNIME?

Thank you in advance!

BR Joris

Hi @Jorisr10. Sorry, what do you mean by "I am interested in the course of the data"? Would it be possible to under/oversample your data?
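To illustrate what under/oversampling could mean here, below is a minimal sketch in Python (which could, for example, run inside a KNIME Python Script node). It assumes each data set is a plain 1-D series of evenly spaced values and uses linear interpolation; the function name `resample_to_length` and the sample values are made up for illustration and are not taken from the attached data.

```python
import numpy as np

def resample_to_length(values, n_points):
    """Linearly resample a 1-D series to n_points (hypothetical helper)."""
    values = np.asarray(values, dtype=float)
    # Position of each original sample on a normalized 0..1 axis
    old_x = np.linspace(0.0, 1.0, num=len(values))
    # Positions of the resampled points on the same normalized axis
    new_x = np.linspace(0.0, 1.0, num=n_points)
    # Linear interpolation keeps the first and last value unchanged
    return np.interp(new_x, old_x, values)

# Made-up example series of different lengths
short = [0.0, 2.0, 4.0]            # 3 points
long = [0.0, 1.0, 2.0, 3.0, 4.0]   # 5 points

stretched = resample_to_length(short, 5)   # oversampling: 3 -> 5 points
compressed = resample_to_length(long, 3)   # undersampling: 5 -> 3 points
```

Note that undersampling this way only reads the curve at the new positions, so short peaks between those positions can be lost.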

Hi @iperez

Thanks for your fast reply!

What I meant to say is that the course of the data shouldn't be changed by the operation. For example, when I plot the data points over time, the shape of the graph should stay similar, but all data sets should have the same length.

If I shortened the data, for example, valuable information could be lost.

Is that clearer now?

BR Joris

Hello @Jorisr10,

can you give an example with data? That should help us understand it better and avoid guessing games :wink:

Br,
Ivan

Hello @ipazin,

Sure! Here are two example datasets that I would like to bring to the same length.

Data.xlsx (36.8 KB)

It is not possible to cut one of them off, and it is also not possible to pad one with random numbers or zeros.

It would be nice if additional data points could be interpolated between the existing ones, but I don't know if this is possible in KNIME.

BR Joris

Hi @Jorisr10,

Looking at the data, you have duplicate values in each dataset. Do you have a reference column, e.g. a relative timescale, that identifies each value?
If not, I doubt that you can manipulate this data in a meaningful way.

BR

Hi @morpheus,

Yes, there is a different timestamp for each value.

BR
Joris

Hi Joris,

Could you please provide each dataset from your example together with its timescale?

My suggestion would be, as a first step, to combine both timescales via concatenation and make them unique by removing duplicates or with a GroupBy node. As a second step, I would join both datasets to the unique timescale with a left outer join. Now you have all data on the same scale.
You can interpolate the missing data points, but whether that is valid depends on your data, for example whether they are on a linear scale. Personally, I would be very careful with interpolation.
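The two steps could be sketched roughly like this in Python with pandas, as an illustration of the idea rather than the actual KNIME workflow. The column names `t`, `a` and `b` and all values are made up, and the linear interpolation at the end is exactly the optional step to be careful with.

```python
import pandas as pd

# Made-up example data: two series sampled at different relative times
a = pd.DataFrame({"t": [0.0, 1.0, 2.0, 3.0], "a": [10.0, 12.0, 14.0, 16.0]})
b = pd.DataFrame({"t": [0.0, 1.5, 3.0], "b": [1.0, 4.0, 7.0]})

# Step 1: concatenate both timescales and make them unique
timescale = (
    pd.concat([a["t"], b["t"]])
    .drop_duplicates()
    .sort_values()
    .to_frame()
)

# Step 2: left outer join each data set onto the unique timescale
merged = timescale.merge(a, on="t", how="left").merge(b, on="t", how="left")

# Optional: fill the gaps by interpolating linearly with respect to t
# (assumes the values change roughly linearly between neighbouring samples)
merged = merged.set_index("t").interpolate(method="index")
```

After the join, both columns share the same five time points; before the interpolation step, each column contains missing values wherever its own data set had no sample.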

BR

Hello @morpheus,

The problem is that the timescales are not the same…

Here are the datasets:

Test.knwf (92.8 KB)

Br
Joris

Hi Joris,

Just express the time scale of each dataset relative to its first record; that brings the time scales onto a comparable scale. You can use the Date&Time Difference node for the calculation.
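As a rough illustration of the same idea outside of the node, here is a small pandas sketch. The column names `timestamp`, `value` and `t_rel`, the helper `add_relative_seconds` and the timestamps are all hypothetical, and the rows are assumed to be sorted by time so that the first row is the first record.

```python
import pandas as pd

# Made-up data sets recorded at different absolute times
a = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2021-01-01 10:00:00", "2021-01-01 10:00:02", "2021-01-01 10:00:04"]
    ),
    "value": [1.0, 2.0, 3.0],
})
b = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["2021-03-05 08:30:00", "2021-03-05 08:30:01", "2021-03-05 08:30:04"]
    ),
    "value": [5.0, 6.0, 9.0],
})

def add_relative_seconds(df):
    """Return a copy with the seconds elapsed since the first record."""
    out = df.copy()
    # Assumes the rows are sorted by time, so row 0 is the first record
    first = out["timestamp"].iloc[0]
    out["t_rel"] = (out["timestamp"] - first).dt.total_seconds()
    return out

a_rel = add_relative_seconds(a)
b_rel = add_relative_seconds(b)
```

Both data sets now start at 0 seconds, so their `t_rel` columns can be compared or joined even though the original timestamps never overlap.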

BR

Hi Joris,

and here is the modified example.

BR
Test.knwf (132.8 KB)

@Jorisr10, this also works:

SameLength.knwf (281.2 KB)
