Datetime index for time series data in new Python scripting nodes

mbloechle · December 11, 2022, 1:48pm

The new Python scripting nodes (as of KNIME 4.7 with Apache Arrow) and the columnar table backend work like a charm, since there is virtually no conversion overhead when moving data to and from pandas in the scripting nodes. This matters if you have a large number of rows (7 million in my current case). However, as my data is time series data, I still need to parse a column and set it as the index to get a proper pandas DataFrame with a datetime index. This takes time again and seems inefficient - am I missing something?

I set my datetime index with the following, but it takes a long time (longer than the actual operation, e.g. a resampling). My table contains one “Local Date Time” column and columns with doubles.

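import knime.scripting.io as knio
import pandas as pd

# Convert the first input table to a pandas DataFrame
df = knio.input_tables[0].to_pandas()
# Parse the 'Date' column and use it as the index
df.set_index(pd.to_datetime(df['Date']), inplace=True)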
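
For anyone wanting to test where the time goes, here is a minimal sketch that runs outside KNIME. It assumes the column may already arrive as datetime64 (worth checking with df.dtypes, as this thread does not confirm it), in which case the pd.to_datetime() call can be skipped. The synthetic DataFrame stands in for knio.input_tables[0].to_pandas(), and the values and the 60-minute rule are made up for illustration. It also shows that resample() accepts an on= argument, so a resampling does not strictly need a datetime index.

```python
import pandas as pd

# Stand-in for knio.input_tables[0].to_pandas(); the column name "Date"
# follows the snippet above, the values are invented for illustration.
df = pd.DataFrame({
    "Date": pd.date_range("2022-01-01", periods=6, freq="30min"),
    "value": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

# Parse only if the column is not already a datetime64 column.
if not pd.api.types.is_datetime64_any_dtype(df["Date"]):
    df["Date"] = pd.to_datetime(df["Date"])

# Option A: set the index directly from the (already parsed) column.
indexed = df.set_index("Date")

# Option B: resample on the column without setting an index first.
hourly = df.resample("60min", on="Date").mean()
```

Whether this saves time in the scripting node depends on the dtype that to_pandas() actually produces for a “Local Date Time” column, so treat it as something to measure rather than a guaranteed speed-up.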

According to the KNIME Python API, the function from_pandas() has a RowIDs parameter, but I can’t get it to work as expected.

Alternatively, using a RowID node to replace the default RowIDs (“Row0”, ...) with my datetime column partly works, but the index of the resulting DataFrame is not of type datetime. Is this a limitation or am I missing something?

I would be grateful for any recommendations or experiences - thank you!

aliasghar_marvi · December 12, 2022, 3:25pm

Hello @mbloechle,

Can you share what goes wrong when using the from_pandas() function?
Also, there is no specific data type for RowIDs; all RowIDs are of type “String”.

BR,
Ali

mbloechle · December 15, 2022, 9:39pm

Hi Ali, sorry for the mix-up. When reading the table and creating the pandas DataFrame I use to_pandas(), of course (and not from_pandas() as I wrote). Still, after to_pandas() I need to parse and set my time series index first. There is no quicker way around it, correct?

Also, if KNIME RowIDs are always strings, this means I always have to create the datetime index from strings, so the RowID node does not help me in this regard. Thank you!
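
If the RowID route is used anyway, the string index can be converted in one step. A minimal sketch, assuming the RowIDs look like ISO-style timestamps (the layout below is an assumption, so check your own RowIDs); passing an explicit format to pd.to_datetime() usually avoids per-value format inference, which tends to help on large tables. The small DataFrame stands in for the table coming out of to_pandas().

```python
import pandas as pd

# Stand-in for a table whose RowIDs were set from a date-time column.
# The string layout is assumed for illustration only.
df = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=["2022-12-11T13:48:00", "2022-12-11T13:49:00", "2022-12-11T13:50:00"],
)

# Convert the string index to a DatetimeIndex with an explicit format.
df.index = pd.to_datetime(df.index, format="%Y-%m-%dT%H:%M:%S")
df.index.name = "Date"
```

This still parses strings, so it does not remove the cost described above; it only keeps the parsing as cheap as pandas allows.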
