I am having a date and time column where there were different date formats and this column contains around 110000 rows. I cannot change everything manually in excel and in knime, I cannot make use of legacy date time with multiple date formats.
It will try to guess the date from different formats (you need to check if it is correct date).
Or use all possible formats, concatenate the result from number of tables and filter out missing values.
Above mentioned Date Extractor from Palladian has the advantage, that you don’t need to tediously define parse patterns manually, and that it can handle varying formats automatically.
What problem are you facing when installing the Palladian nodes? Let me know and I’ll be glad to help.
… as you do with running a KNIME Server itself, which is also not “free”.
01-02-2019 which is showing in Knime as 2019-02-01 : works perfect
dd-mm-yyyyThh:mm:ss : this format also works perfect
dd/mm/yyyy: Not working
When I used String to date time, all the cells are converted except the ones in 3rd format which is why I am looking for a solution
Also, I have used “Date Extractor” node: This seperates Date field completely from the table and how can I connect again to the relevant row by using this node?
First:
If the Date Extractor node recognizes all the date formats in your data set, you can join the output with the original table based on row IDs. Notice that the Date Extractor node skips the rows with unrecognized format, so if there are any of them, and as the node refreshes the row IDs, you cannot join the tables.
As an alternative solution, since you have a few number of date formats, I think you could change all the formats to the standard format first and then use the String to Date&Time node. E.g. this expression in the Column Expressions node will convert the mentioned formats to yyyy-MM-dd format:
So the String to Date&Time node can easily convert the column type.
Here is the workflow which contains the application of both options: date_manipulation.knwf (26.3 KB)
P.S. Dear @ngumpena, if you set the output type to “Local Date” in the Column Expressions node, you do not need the String to Date&Time node anymore.
Of course, my pleasure!
As you can see we have two functions: regexMatcher() and regexReplace().
The first one recognizes the format and the second one changes the format to the standard format.
[\d] stands for any digits (we use double backslashes -which is the escape character- one for the node and one for the regex itself) and {n} specifies the number of digits. So [\d]{2} means two digits. (T.*) specifies the time part of the string and the ? character makes it optional. In the regexReplace we use parentheses around each part (day, month and year) so we can reference them for the output as $n.
You can learn more about using regex in KNIME here:
And this website has everything you need to know about regex: