I have problem with using SAS7BDAT Reader node, because when I am trying to read SAS dataset, the date columns are converted to number. For example, date 02.01.2020 is reader as number 21916. I know that SAS does it sometimes because it starts counting the date from 1.1.1960 or so. However, when I open the dataset in SAS EG, I see date columns properly and with date format. When configuring the node, there is option of changing the encoding, so I guess that this may be it, but I was not able to find the encoding solving this issue. Did anybody have same issue and managed to solve it? Thank you for any advice.
As a long time user of SAS myself, I wanted to see if I could come up with something quickly for you. It turns out this is a bit more complicated than I thought it would be at first, but I think I got there.
As you noticed, SAS7BDAT files store dates as days since Jan 1 1960 (they also store times as seconds since that date). SAS Enterprise Guide does this conversion for you automatically, but the SAS7BDAT Reader does not.
In the workflow below I’ve included a couple of ways to approach this - one using a little bit of R code, and the other using native KNIME nodes. They both return the same result, although the KNIME approach involves several nodes and is a bit of an exercise in understanding how Date&Time conversion works in KNIME, among other things. (Also, it could probably be improved, I just threw it together quickly.)
At any rate, try this out and see what you think. The advantage of using a component here is that once you have a working solution, you can just share the component in your local workspace, and use it again in the future when the issue of SAS Date conversion comes up.
First of all, thank you very much for your effort. I have managed to import and run part of the workflow (it was bit of challenge to run it on company PC). However, I am not able to execute the R Snippet, even though, that I have installed R Snippet extension. I suppose, that R language has to be installed prior execution, right? Also, I am using knime 4.3, is that a problem, because as far as I know, the current version is 4.4.
Regarding the ‘Knime-based conversion’. This works for me, and converts to date format correctly, however it cant be configured and I dont know how to create, or use this node in future for such conversions. Sorry for maybe stupid questions, but I am new with knime and its still bit confusing for me.
Correct. R needs to be installed locally in order to integrate it with KNIME. Here is link for R Integration but if you are not familiar with it and if it’s not used within your company don’t think there is much sense to use it only for this task. However if you have time and want to learn new skill R is a good choice
Looking at attached workflow (and the one I provided at the bottom) you should be just fine with 4.3 version.
Thank you very much for an enhancing answer! I am really surprised by the willingness and activity of local forum members.
I have tried to play with the component you shared above, however it does not work for me as it should. When I import the workflow, I have following error message:
Then, when I try to execute the component, the workflow stops on “column rename” node, with Log message:
WARN Column Rename 3:14:0:30 Duplicate column name ‘Start Date’ at index ‘0’ and ‘2’
seems configuration from Table Manipulator regarding included (filtered) columns is not restored when opening workflow with 4.3. version. That causes duplicate column names. To fix it you only need to check (include) SAS Date Start(shifted) column.