SAS dataset reader incorrect date format

Hello,

I have problem with using SAS7BDAT Reader node, because when I am trying to read SAS dataset, the date columns are converted to number. For example, date 02.01.2020 is reader as number 21916. I know that SAS does it sometimes because it starts counting the date from 1.1.1960 or so. However, when I open the dataset in SAS EG, I see date columns properly and with date format. When configuring the node, there is option of changing the encoding, so I guess that this may be it, but I was not able to find the encoding solving this issue. Did anybody have same issue and managed to solve it? Thank you for any advice.

Hi there @petmro and welcome to the forum.

As a long time user of SAS myself, I wanted to see if I could come up with something quickly for you. It turns out this is a bit more complicated than I thought it would be at first, but I think I got there.

As you noticed, SAS7BDAT files store dates as days since Jan 1 1960 (they also store times as seconds since that date). SAS Enterprise Guide does this conversion for you automatically, but the SAS7BDAT Reader does not.

In the workflow below Iā€™ve included a couple of ways to approach this - one using a little bit of R code, and the other using native KNIME nodes. They both return the same result, although the KNIME approach involves several nodes and is a bit of an exercise in understanding how Date&Time conversion works in KNIME, among other things. (Also, it could probably be improved, I just threw it together quickly.)

At any rate, try this out and see what you think. The advantage of using a component here is that once you have a working solution, you can just share the component in your local workspace, and use it again in the future when the issue of SAS Date conversion comes up.

2021-08-17 10_13_08-Window

SAS_DateTime_Conversion.knwf (25.3 KB)

2 Likes

Hi @ScottF!

First of all, thank you very much for your effort. I have managed to import and run part of the workflow (it was bit of challenge to run it on company PC). However, I am not able to execute the R Snippet, even though, that I have installed R Snippet extension. I suppose, that R language has to be installed prior execution, right? Also, I am using knime 4.3, is that a problem, because as far as I know, the current version is 4.4.

Regarding the ā€˜Knime-based conversionā€™. This works for me, and converts to date format correctly, however it cant be configured and I dont know how to create, or use this node in future for such conversions. Sorry for maybe stupid questions, but I am new with knime and its still bit confusing for me.

Thanks.

Hello @petmro,

Correct. R needs to be installed locally in order to integrate it with KNIME. Here is link for R Integration but if you are not familiar with it and if itā€™s not used within your company donā€™t think there is much sense to use it only for this task. However if you have time and want to learn new skill R is a good choice :wink:

Looking at attached workflow (and the one I provided at the bottom) you should be just fine with 4.3 version.

Scott created a Component which in general can be configurable but also doesnā€™t need to as in this case. It depends on (Configuration) nodes inside it if I understood you right. Here is another link to guide to learn more about working with Components (both links apply to latest KNIME version but not much is changed between 4.3 and 4.4. in them):
https://docs.knime.com/latest/analytics_platform_components_guide/index.html#introduction

One way to re-use Component is to simply copy paste it. Another (better) is to share it but you can learn more about Component sharing in above link.

I attached here modified workflow with Component which is configurable on double click for multiple SAS date columns.
SAS_DateTime_Conversion_ipazin.knwf (36.5 KB)

Also here is link to a shared Component on KNIME Hub. Just Drag&Drop it into workflow and you can use it.

And last but not least there is ticket to enhance SAS Reader to read dates properly.

Lot is said. Now is your turn to try things out! And if any issues/questions feel free to ask. Welcome to Community!

Br,
Ivan

3 Likes

Hi Ivan,

Thank you very much for an enhancing answer! I am really surprised by the willingness and activity of local forum members.
I have tried to play with the component you shared above, however it does not work for me as it should. When I import the workflow, I have following error message:

{B525933C-94AE-490A-8909-1DF2B366EADF}

Then, when I try to execute the component, the workflow stops on ā€œcolumn renameā€ node, with Log message:
WARN Column Rename 3:14:0:30 Duplicate column name ā€˜Start Dateā€™ at index ā€˜0ā€™ and ā€˜2ā€™

And notation says following:

{7FB9B161-74AA-4137-A6D4-3C25ED225E09}

Do you have any idea how could I solve this?

BR,
Peter

1 Like

Hello @petmro,

seems configuration from Table Manipulator regarding included (filtered) columns is not restored when opening workflow with 4.3. version. That causes duplicate column names. To fix it you only need to check (include) SAS Date Start(shifted) column.

Additionally I have opened, configured and exported workflow with 4.3.3 which you can give a try.
SAS_DateTime_Conversion_ipazin_4.3.3.knwf (35.5 KB)

Br,
Ivan