CSV Reader Detecting False Date Format

Hi everyone:

I am building a data workflow based off Adobe Analytics reports.

The first batch of data I received was a backfill and then I was set up to receive weekly emails for the previous week.

Both the original backfill data and the weekly automated reports contain a date field in a M/d/yyyy format. However, when I add the automated reports to my file folder and reset my workflow, KNIME reads the date as ISO yyyy-mm-dd, which breaks my subsequent string to date node. Only if I manually open the file in Excel and re-save will it finally detect the correct date format. Any suggestions how to fix this issue?

Hello @Kit,

could you please share a small workflow where you show which nodes you use and how the data look like? It would help me understand better how to solve the issue.

Thank you and have a nice evening,
Raffaello Barri

1 Like

Hi @Kit , when you say KNIME reads the date as ISO yyyy-mm-dd, what node is it that is doing that attempted conversion?

Given that you have two very distinct and identifiable date formats in the one column, it is possible to configure a single String to Date&Time node to handle both formats, using the “optional” date time format markers (square brackets).

The following format mask will correctly identify the dates for both types in the one column:

[M/d/yyyy][yyyy-M-d]

which allows KNIME to convert both, depending on which one it finds.

e.g.

8 Likes

oh, wow, I love this solution! thank you for sharing this trick. will definitely set up my string to date node to use the optional syntax.

1 Like

@takbb I received my new batch of weekly files today and was hopeful that the ‘optional date’ masks you suggested would pass the new files without having to open them individually. for some reason, the string-to-date node still failed.

Any suggestions on other workarounds? Or figuring out how to get the CSV reader to recognize the correct date formats to start with?

Thank you for any help you are able to provide!

Hi @Kit , I’d need to know the reason why it failed. It might be that there is a simple solution, but without any other information, I cannot make any suggestions… for example, maybe you have a third date format, or maybe you have something in the CSV that isn’t a date at all?

What were the dates it failed on, and what error message did it give?

1 Like

Here are the steps:

  1. CSV Reader - at this point the values in the date column are mixed as it includes both blanks and the column headers that say “Day” since it’s reading from a file folder with multiple files
  2. Joiner (x2) to grab additional columns from similarly structured files
  3. Missing value to remove blanks
  4. Row filter to remove “Day” pattern
  5. String to date to convert M/d/yyyy

I have to stress…there is only one date format in these files: M/d/yyyy. KNIME is incorrectly identifying ISO dates where they don’t exist. That’s where I’m getting hung up – there’s no real reason it should fail.

Hi @Kit, Could you perhaps upload acsv file containing all the dates as they have been read in by the CSV Reader.

i.e. CSV Reader ==> Column Filter (selecting just the particular column) ==> CSV Writer (Write this single column out to a file)

then upload here the csv file that is written.

btw, when you use the String to Date&Time node, have you tested what happens if you turn off the “Fail on error” option at the bottom of the node’s config?

Hi @takbb! Yes, I can absolutely do that. I’ve already touched all the brand new files for this week so that the workflow doesn’t throw any errors but I can write you a CSV when I receive the new batch next Monday. What you will see in the output is ISO dates for the most recent 7 days. (Once I open each new file individually in Excel and re-save, KNIME correctly identifies the M/d/yyyy format.) Will report back.

1 Like