Serious Error in Date%Time to string node

Hi all,
after i observed multiple issues with the aggregation of some time series data i spent the day researching the poblem What i found was a bit of a shocker for me: The Date%Time to string node transforms in some cases the date wrong.

first a quick overview how the process looks like where the error occured:

the csv data is time series data with a date/time and a corresponding value:
dataformat

It’s a pretty straight forward process the String to Date&Time just transforms the string date to the european format:
String to Date&Time

the data then get sorted in ascending order for date/time and in the end the data gets with modify timezone a timezone shift from european time to UTC and is then again transformed to a more readable format.

The error now occurs in the Date&Time to String node:
Error

in the left column TimeStamp you see the correct date from the modify timezone node which transformed the data to UTC.
The right colum is (from the Date&Time to String) several timestamps are transformed wrongly - by a whole year!

another example of this error occuring:

It seems to be happening at the beginning of the year.

Here is the part of the workflow with data to reproduce the error:

In the Date&Time to String node, you’re using Date Format dd.MM.YYYY HH:mm:ss when instead it should be dd.MM.yyyy HH:mm:ss.

6 Likes

I think @elsamuel has nailed it. Using the week-based year instead of the calendar year can produce seemingly strange results in the rollover from December to January. For excruciating detail see here:

So this is double good news: It’s not a bug, and easy to fix! :slight_smile:

3 Likes

@elsamuel Thanks for your help!! This indeed does solve the error!

@ScottF: Thanks for the link. I understand that this is not an error of KNIME but has its roots in the Java DateTime Format (at least i think that i understood that). And that it was a mistake on my side to use “YYYY”, results look good so i guess it seems OK and go… This has gotten me in some trouble.

The Thread on Stack is 7 Years old. My problem is now that i am scared that something like this happens again. This got me pretty “good” at least 2 Worfklows are affected and have calculated wrong data.

Could you or anyone else of course help me with this. I would need to understand what i need to read up to/study. My ideal would be to have a “map” in my head like: KNIME uses XYZ from Java and this i can read here and there. so that i know what builds up on what and get a better grip on this

  • How could i get an overview generally what Knime is using in regard to Date and Timeformats? Something that give me an overview /a deeper understanding

  • Or how can i Check for the nodes what they use internally? I am asking this since the Date%Time to String and String to Date&Time do Link to the Oracle Page for Java. But other nodes don’t contain that link like: Modify Timezone or UnixTimestamps to Date&Time

  • Is there another method in Knime that is more safe to convert string to dates, a bit more transparent easier to undestand what it does than java?

I am asking this because following a link from the Stack Thread you posted it says that “u” would be safer than “y” for not running “unexpected exceptions”:

It is safer to use “u” instead of “y” because DateTimeFormatter will otherwise insist on having an era in combination with “y” (= year-of-era). So using “u” would avoid some possible unexpected exceptions in strict formatting/parsing. See also this SO-post. Another minor thing which is improved by “u”-symbol compared with “y” is printing/parsing negative gregorian years (in far past).

And what i don’t want after today is more surprises with this.

Thanks for the support from the forum meant a lot to me! Without the forum i would be totally lost
And sorry that my questions are a bit vague. I never was a coder (except small VBA stuff in Excel) let alone spcifically Java but i need to get a better grip on this.

If the text is to long to read:

In short i would like to understand wether in Knime regarding date&time always the Java virtual machine is running in the background and so the defintions from oracle for java 8 standard apply. Or is sometimes something not the java vm running and other definitions apply?

For example the “Unix Timestamp to Date&Time” Node does in the help not link to the java page and one can not chose a dateformat with “yyyy” etc. Does this then automatically chose “YYYY” or how does this work then?

Greetings and have a sunny weekend!

Hi @TotalDataLoss -

You raise many interesting questions, many of which I had not thought of before. Let me see if I can succinctly reply.

The best resource I have found so far for Java datetime is the one I linked you. I think the main takeaway from that link is that generally speaking you should use ‘y’ instead of ‘Y’. (It turns out ‘u’ is also fine, but this is not intuitive so I would avoid it for that reason.) I didn’t know ‘Y’ could cause problems myself, so I wonder if we shouldn’t include a warning to that effect in the configuration dialog somewhere.

All of the “new” (meaning not legacy) Date&Time nodes in KNIME should use the same Java basis under the hood, so internal consistency shouldn’t be a problem. These new nodes are what we recommend for parsing dates and times, understanding that the java methods can be a little bit opaque, as you have found.

The Unix Timestamp node shouldn’t cause any surprises since it just counts seconds from the 1 Jan 1970 epoch, using a standard year ‘yyyy’.

In short, I think the most unpleasant surprise of ‘yyyy’ vs ‘YYYY’ is the one you have already discovered. I can mention to our team that this is something that could use more clarity.

1 Like

Hi Scott thanks for your effort. I hope it’s OK when i give a honest feedback how i feel about this.I read up on the java documentation and played around with this the last day(s) and must say all in all it doesn’t feel very safe, when this is the attitude.

The thing is i don’t want to discover more surprises with this, that you say they will be a little less damaging is not very reassuring!! I found not good how quickly you declared that this is not a bug, and forget about it when in reality Knime directs its users straigth forward to this bug and a lot of people will have wrong calculations because of this.

Generally with data analysis: The software gives you always an output it’s just often wrong, if it gives you an error message be thankful for the money that you saved. KNIME is a bit opaque in this regard it gives the big advantage of “visual programming” to the users, but if KNIME uses jvm internally it kinda has the responsibility to make it as clear as possible what is to expected from it on a node by node basis.

The error that occured changed the date by a whole year and Knime lets the user combine YYYY with an exact date and month, which is simply wrong and shouldn’t even exist.

To say how greate visual executable workflows are (and in this regard Knime is really wonderful!), but when something goes wrong then its java fault is somewhat not to helpful.

It’s not true neither KNIME creates this bug by giving the user a lot of wrong suggestions or even Autoguesses an erroneous format where the year digits created with week-based-year are combined with an exact month and day of month which introduces the bug and creates wrong results…

So if i had as a user a wish: More ownership for error-free calculations and not the approach: of course the results are wrong on page 220 of the java standard edition 8 page documentation you could have guessed why that is, when your are smart.

That when Knime uses Java standard edition 8 for a node (which is rather old, how will it be handled when Knime switches in 2 years will the nodes than just change under the hood without the user noticing?) that there is a spirit of liability for the results of this too. Becaues a lot of the users don’t even know that it uses java DateTimeFormatter (like myself in the beginnig) and what all can go wrong with this and another node might use Python or Panda or something else with their troubles. Visual programming doesn’t make to much sense when you have to be an experienced java/whatever coder to ensure that the results are right.

But since this thread is closed i will open a new thread with more on point description of what i found out…

Ok rant is over - besides this i love Knime… And have a good one!

@TotalDataLoss not sue how familiar you are with other programming languages and systems and their handling of date and time variables. From my experience I can say: they are all different and nearly all have some quirks and special characteristics especially when it comes to time zones and end of year settings, also weeks of year and so on (UNIX Timestamps, Excel, SPSS, ORACLE, Cloudera BigData, R, Python …). So if you have to deal with such variables I would always check twice if I have covered all the quirks and my system does what I want. Sometimes it also depends on extra settings within the database (or system).

Sometimes it can be best to store date and time variables just as strings or ‘readable numbers’ (20220528) and then convert them later in the system you want. And always check the fringe cases.

2 Likes

Just wanted to link to the other thread here: An example what can go wrong with DatesTimes in Knime: the String to Date&Time node

@mlauber71: Have to go now. But looks interesting will check it out later! Thanks. Just a quick reply: Excel i found the benchmark never ran into to big trouble with it.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.