While working on an approach to inject column types efficiently and quickly, regardless of the size of the data set, I converted sample data to strings.
While doing so I happened to notice that the Date&Time to String node seems to struggle to convert the column type “Date and Time” to a string, as the column is not recognized as a valid input type.
Not quite. The Date&Time to String node does not accept an input schema such as a date-time stamp with a time zone offset. It merely converts any given Date/Time format into a string.
It’s always been a bit challenging to understand what exactly is troubling you and where the bug you are talking about is.
The Date and Time column you are referring to and are unable to convert is the legacy Date&Time column type (as opposed to the Local Date, Local Time, Local Date Time, and Zoned Date Time column types), and to work with it you should use the legacy Date&Time nodes. In this case the Time to String (legacy) node can convert it to a string column, or you can use the Legacy Date&Time to Date&Time node to make sure you don’t have any legacy types. (In general, unless there is a strong reason for it, you should avoid using legacy Date&Time column types.)
In addition, the Column Expressions node fails because it’s configured with a flow variable that is not available to it. (The error message clearly states that: Unknown variable “currentColumnName”.) That is probably because you copied and pasted the Column Expressions node above, which is part of a Column List Loop, where the node was configured to use that flow variable.
And one more thing: I don’t think you should start topics with [Bug], as configuring something wrong does not mean there is a bug in the software.
Don’t think so
Knowing how to recognize column types avoids the confusion, and adding one node solves your issue. Not to mention that there are many other nodes that are used more often (I personally have never used this node) and are missing these kinds of “nice options”.
Ivan
Apologies for my late reply. My family got sick and some urgent business topics had to be addressed. Is there a way to identify legacy data types?
I can imagine that when a deprecated node that still generates a legacy data format is present, and it has been stated that legacy nodes shall not cause any issues, it might cause confusion because the error is not clearly explained.
A log message like “Incompatible / legacy data types in Col XYZ” or something like that might be much more explanatory.
You can identify legacy types either by opening the Table view and checking the icons next to the column names, or by going to the Spec tab and checking the column types. In general I agree that there should be a better (more intelligent) way of notifying the user what is or might be the issue.
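As a rough illustration of the kind of check and message discussed above, here is a minimal Python sketch that works outside of KNIME. It assumes you have copied the column names and type labels from the Spec tab into a dictionary by hand. The function names are hypothetical, and the label “Date and Time” is taken from this thread as the legacy one; the exact labels in your KNIME version may differ.

```python
# Type labels treated as legacy. Assumption: the Spec tab shows the legacy
# Date&Time type with exactly this label, as quoted earlier in this thread.
LEGACY_TYPE_LABELS = {"Date and Time"}


def find_legacy_columns(column_types):
    """Return the names of columns whose type label is considered legacy."""
    return [
        name
        for name, label in column_types.items()
        if label in LEGACY_TYPE_LABELS
    ]


def legacy_warnings(column_types):
    """Build one explanatory message per legacy column."""
    return [
        f"Incompatible / legacy data type in column '{name}'"
        for name in find_legacy_columns(column_types)
    ]


# Hypothetical table spec, typed in by hand for illustration only.
example_spec = {
    "created": "Date and Time",
    "updated": "Local Date Time",
    "label": "String",
}

for message in legacy_warnings(example_spec):
    print(message)
```

This only mirrors what you would otherwise read off the Spec tab by eye; it does not call any KNIME API.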
Thanks Ivan, we are all doing better now. And also thanks for the information which gave me some ideas to think about. Though, for that to be viable a node must actually be executed which is where the cat bites its tail.
The overarching idea, which I am progressively implementing, was to build up test workflows to:
Show different approaches with the same results
Benchmark these against each other with varying data set sizes
Run regression tests
I fell in love with KNIME many years ago and call it the Swiss army knife of ETL. Despite the awesome example workflows provided by KNIME, I feel these lack the level of diversity KNIME offers as the Swiss army knife it is. Conversely, the Hub is awesome by itself but lacks the level of organization / structure that the example workflows offer.
With the solution diversity and level of automation available, “data-drag-races” of a single solution run via default vs. parallelization vs. streaming could provide optimal feedback on which solution is best. It’s almost like the data challenges, but with the goal of progressively creating a solution repository with the aforementioned benefits.
To realize that idea I used the Test Data Generator, as I thought the KNIME core team uses it for regression testing. Anyway, thanks for your time and feedback.
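To make the drag-race idea a bit more concrete, here is a minimal sketch in plain Python rather than KNIME. The two "approaches" are stand-in functions that convert values to strings, and all names (`drag_race`, `approach_loop`, `approach_comprehension`) are hypothetical. It runs each approach on data sets of varying size, checks that all approaches return the same result (the regression part), and records the wall-clock time.

```python
import time


def approach_loop(values):
    """Stand-in approach 1: convert values to strings with an explicit loop."""
    result = []
    for value in values:
        result.append(str(value))
    return result


def approach_comprehension(values):
    """Stand-in approach 2: same conversion with a list comprehension."""
    return [str(value) for value in values]


def drag_race(approaches, sizes):
    """Time every approach on every data set size and verify equal results."""
    timings = {}
    for size in sizes:
        data = list(range(size))  # synthetic test data of the given size
        reference = None
        for name, func in approaches.items():
            start = time.perf_counter()
            output = func(data)
            elapsed = time.perf_counter() - start
            if reference is None:
                reference = output  # first approach defines the expected result
            elif output != reference:
                raise AssertionError(f"{name} disagrees at size {size}")
            timings[(name, size)] = elapsed
    return timings


approaches = {"loop": approach_loop, "comprehension": approach_comprehension}
timings = drag_race(approaches, sizes=[1_000, 10_000])
for (name, size), seconds in timings.items():
    print(f"{name:>14} n={size:>6}: {seconds:.6f}s")
```

A single timed run per combination is noisy; repeating each run several times and taking the median would give more reliable numbers.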