Mess with datetime format

20210710 Pikairos Mess with datetime format.knwf (35.8 KB)

Enjoy your weekend @Karlygash !

Best

Ael

3 Likes

@aworker , @bruno29a thank you :green_heart:

As usually, helped me to solve this problem :star_struck:
@aworker , Lookup table really useful when I have other mess of date format :blush:
@bruno29a thanks to your solution as well :blush:

2 Likes

Hi @bruno29a

My apologies, you are right and I have modified my previous answer so that it doesn’t mislead in future other readers. Your solution is much simpler and hence the best.

Please @Karlygash validate @bruno29a’s answer as the solution. It will help people to find the best one in this case. I’m still leaving mine so that people may have a second option as stated by @Karlygash.

As the saying says: “Take your time if you are in a hurry! :wink:

Thanks @bruno29a & @Karlygash

Best

Ael

2 Likes

@aworker thank you)

Have a very enjoyable weekend :hugs:

1 Like

Hey @aworker , there’s nothing to apologize for :slight_smile:

I just wanted to make sure that users know that MMM supports Jan, Feb, etc, and that in this case, given the data, it made sense to use MMM.

As we always say, there are many ways to do things in Knime, and what you suggested is also a way to do it. Moreover, it also provides an alternative, should other users have other date formats, for example 01-January-19, where they can adapt your solution and map January, February, etc… to either Jan, Feb, etc or to 01, 02, etc… and use MMM or mm.

All good :+1:

2 Likes

Good to know, uff :relieved: !

I feel relieve and I can now sleep as a log haha ha (et même dormir sur mes deux oreilles :laughing:)

Thanks @bruno29a !

@bruno29a & @Karlygash have a nice weekend,

Ael

1 Like

En même temps, ou un côté à la fois? :slight_smile:

1 Like

Haha ha :smile:

Bon weekend Bruno !

Ael

1 Like

Looks like I missed out on all the fun! :wink:

This was my take a couple of months ago (just for a laugh!) on something that could handle a wide variety of date formats provided that the data was presented in a known order. Obviously it is locale dependant and doesn’t internationalise well in its current form, but who knows, maybe one day I’ll try turning this into an actual node :slight_smile:

It successfully converts all the dates in the annotation on the left (and more besides!)

It also includes a US variant:


Well… I needed something to do :wink:

5 Likes

Hi @takbb

Amazing your solution. It is the “Rolls-Royce” node for DateTime format interpretation :smile: :+1: !

I’m always fascinated by what can be done using Regex expressions.

Thanks @takbb !

Have a very nice weekend !

Ael

1 Like

Hi @aworker, haha - I built a Rolls-Royce but I think maybe it still needs some repairs to the bodywork :wink:

My idea though would be to produce a date or datetime node that takes the pain out of date manipulation. The existing java mechanism is far too prescriptive and puts the burden too heavily on the user. There really is no reason why it cannot be simplified a little (within reason) as this (actually quite unintelligent) String Manipulation shows.

My idea is that the user should merely have to specify the order of the date/time elements, and not have to supply the actual mask, so for example, if they say that the order is Day Month Year, it should match
01 FEB 2021
1 Feb 2021
1 feb 2021
1/2/21
01/02/2021
1st February 21
and so on. In the second decade of 21st Century, in a world where computers can recognise faces, why should a user have to perfectly describe the format of date being used? Is the ordering of the elements not sufficient? :thinking: There really is no reason why with a small amount of coding effort that isn’t achievable, so … one day… maybe… (or one date :wink: )

Have a good weekend too! :slight_smile:

4 Likes

thanks, have a very nice weekend ! :hugs:

2 Likes

Good day)
How are you?
I have one case related to the date format.
Format is given in the following way: 01.01.18 (string format). After converting string to date&time it converts to 01.01.2018.
But why knime is taking 2018, instead of 1918? Does it look at the current year?

Sample I have attached in here
KNIME_project4.knwf (18.7 KB)

1 Like

Hi @Karlygash , Knime is probably looking at the current century rather than the current year, and I don’t see what else it can do.

In your case, since it’s more of a log, it means that the dates cannot be in the future. Therefore, you can apply additional logic, such as if new_date > current_date then new_date = new_date - 100 years.

So, “2097-09-21” is > that current_date, then “2097-09-21” should become “2097-09-21” - 100 years = “1997-09-21”

I put something together for you based on this logic. I added these to your workflow:

And it yields these results:
image

Here’s the workflow:
datetime format for 19XXs.knwf (19.6 KB)

5 Likes

great solution as always :green_heart:

1 Like

I must confess to being very surprised by KNIME’s handling of 2 digit years in this regard, as it is a different scheme to Java on which I had assumed it would be based. It is also different to Excel which is similar but subtely different to Java.

Take a look at the table beolw of 2 digit year dates entered in Excel. The second column is Excel’s interpretation of those dates using the Excel formula DATEVALUE()

image

Processing the first column in KNIME using String to Date&Time gives the third column, and the fourth is using a Java snippet node using a simple Date from String conversion:

The java snippet used the following code to simply build a Date from the String:
image

As you can see none of the interpretations are exactly the same, I have marked in red where one of the interpretations is in the minority The java date always agrees with one of the others, and the KNIME handling of dates in the 41-99 range stands out as unusual.

It appears that the rules are:
KNIME - all 2 digit dates are considered to be this century (I’m basing this assumption on the observations above and haven’t seen it documented)

EXCEL - Uses the 2029 rule: All 2 digit dates between 00 and 29 are considered to be 20nn. All two digit dates between 30 and 99 are considered to be 19nn

JAVA - I think uses the 80-20 rule. The date range for 2-digit dates is derived from the set of dates 80 years prior to current date through to 20 years after current date. So a 2 digit date interpreted in 2021 could (on the boundaries) give a different result to one interpreted in 2020 or 2022.In 2021, the 80-20 year span is considered to be 1941 to 2040, so years in the range 41-99 and 00-40 are in the range 1941-1999 and 2000-2040 respectively.

So… the bottom line here I guess is that when it comes to two-digit years, you need to make sure you know what range YOU understand them to mean, and check that they are interpreted correctly in your workflow. Remember that the interpretation can differ according to the node used. Ideally, have the dates corrected at source, and never intentionally create them, so that there is no ambiguity.

And my final thought is that I really don’t envy all the future KNIMERS waking up on
1 Jan 10000 … Fixing the old data for Y2K was bad enough!

2 Likes

Good topic of discussion @takbb . The problem is that there is no standard answer to this. Different systems/applications/sites (not just these 3 you mentioned, but thousands of others) make their own rules about the 2-digit year: which century does it belong to?

The Excel range is quite restrictive. That’s only 8 years from now for future dates. Imagine doing a lease of 10 years, or mortgage of 15 years. Heck, the government has announced that no fuel-propelled cars will be sold starting 2030, you can’t even enter that year as 2-digit year in Excel.

The best way to handle this in my opinion is to use 4-digit year and keep data with 4-digit year.

4 Likes

KNIME actually follows the Java approach. Internally it uses java.time which supersedes the java.util date functionality. From the documentation:

Year : The count of letters determines the minimum field width below which padding is used. If the count of letters is two, then a reduced two digit form is used. For printing, this outputs the rightmost two digits. For parsing, this will parse using the base value of 2000, resulting in a year within the range 2000 to 2099 inclusive. …

This might seem weird, but it is actually the best approach, because it cannot do better. It makes a call and a rather easy to remember one.

Take away: Fix your wonky, ambiguous legacy date format before using it. “Explicit” trumps “implicit” every time.

3 Likes

Indeed. As I said, ideally the dates should be fixed at source. The big problem I think is that for convenience many systems such as excel enable entry of 2 digit years, which obviously have defined meaning in excel terms, and people make assumptions about what it means. Those assumptions aren’t always correct.

It is important that people recognise that different systems will interpret this 2 digit years in different ways and so when communicating the information to anywhere else, they should first be converted to 4 digits so that there is no ambiguity.

What people shouldn’t do is rely on the receiving system to do the translation, because its interpretation may differ. When transferring years between systems this should always be done in 4 digit format…

And in a little under 8,000 years time, people being tasked with solving the Y10K problem will say ‘what were they thinking… only using 4 digits for the year? :rofl::wink:

2 Likes

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.