For a certain column, the first 2 files in the list contain only “0”, so KNIME automatically assigns the type Integer to the column,
but other files can contain either 0 or floats (0.1, 0.425, …).
So I get an error:
Execute failed: The column 'Dist min EdgeA-EdgeB' can't be converted to the configured data type 'Number (integer)'. Change the target type or uncheck the enforce types option to map it to a compatible type.
So I tried to enforce the use of “Standard Double” (or Full Precision),
which leads to the “opposite” error:
Execute failed: The column 'Dist min EdgeA-EdgeB' can't be converted to the configured data type 'Number (double)'. Change the target type or uncheck the enforce types option to map it to a compatible type.
So I changed the configuration to remove “Enforce types”, and got:
Execute failed: Input table's structure differs from reference (first iteration) table: Column 2 [Dist min EdgeA-EdgeB (Number (double))] vs. [Dist min EdgeA-EdgeB (Number (integer))]
Finally, I tried renaming the third file (the one with floats in the column of interest) so that it is processed first, thus forcing the column to be created as a Double, and I get the error:
Execute failed: The column 'Dist min EdgeA-EdgeB' can't be converted to the configured data type 'Number (double)'. Change the target type or uncheck the enforce types option to map it to a compatible type.
As @elsamuel said, allowing for changing schemas should work with the sample data that you have supplied.
A couple of extra notes that may be of use. Firstly, there is no need to put the CSV Reader in a loop to perform this operation: you can now tell it to scan a folder for all the required CSV files, which reduces the number of nodes used.
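If it helps to see the pattern outside KNIME, the same folder-scan idea looks roughly like this as a Python/pandas sketch (the folder path and file pattern are assumptions for illustration):

```python
import glob
import pandas as pd

# Read every CSV in the folder in one pass instead of looping file-by-file.
files = sorted(glob.glob("input_folder/*.csv"))  # hypothetical folder
table = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)
```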
Secondly, you can hit problems with “changing schema specifications” if a file is sufficiently large and does not contain the same data types on all rows (e.g. some rows are clearly doubles, while others contain just integers). The reason is that the specification check scans only so many characters of the file, so if the file is large enough, the type-checker can mis-detect the column type.
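Here is a toy pandas sketch of that failure mode, for anyone who wants to see it concretely; the 100-row scan limit and the data are made up, but the mechanism is the same idea:

```python
import io
import pandas as pd

# A file whose first 1000 rows look like integers, with a float near the end.
csv_data = "Dist min EdgeA-EdgeB\n" + "0\n" * 1000 + "0.425\n"

# Infer the column type from only the first 100 rows (a limited scan).
preview = pd.read_csv(io.StringIO(csv_data), nrows=100)
inferred = preview.dtypes.to_dict()  # -> int64, based on the preview alone

# Enforcing that inferred type on the whole file then fails on "0.425".
try:
    pd.read_csv(io.StringIO(csv_data), dtype=inferred)
except ValueError as err:
    print("Type enforcement failed:", err)
```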
For this reason, it can sometimes be desirable to set all of the problematic column transformations (within the Transformation tab of the CSV Reader) to String, and then define the column types afterwards. You could use the “Column Auto Type Cast” node after you have read the files, but this also presents a potential issue: if the data types are inconsistent across different executions of your workflow (i.e. the files change between executions), then the type casting may set the column types differently on one run than on the next. This might (or might not!) cause problems with your downstream flow, depending on what you subsequently do with the data.
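The read-everything-as-String-then-cast pattern, again sketched in pandas for illustration (the column name is taken from the error messages above; the folder path is an assumption):

```python
import glob
import pandas as pd

# First read with the problematic column forced to String...
frames = [
    pd.read_csv(path, dtype={"Dist min EdgeA-EdgeB": str})
    for path in sorted(glob.glob("input_folder/*.csv"))  # hypothetical folder
]
table = pd.concat(frames, ignore_index=True)

# ...then define the type explicitly, in one place, after reading.
table["Dist min EdgeA-EdgeB"] = pd.to_numeric(table["Dist min EdgeA-EdgeB"])
```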
To avoid that potential for inconsistency between executions, I assembled a component, “Redefine Table Column Types”, which is available on the hub: Redefine Table Column Types – KNIME Hub.
It requires a little extra effort in the form of supplying a data table with the required data types as a sample row, but might be useful if this is likely to be an issue.
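In spirit, the component does something like the following pandas sketch: a one-row template table carries the required types, and the data is cast to match them, so the result is the same on every run (the names and values here are illustrative, not the component's actual internals):

```python
import pandas as pd

# A one-row template whose values carry the *required* column types.
template = pd.DataFrame({"Dist min EdgeA-EdgeB": [0.0]})  # double, by example

def redefine_types(data: pd.DataFrame, template: pd.DataFrame) -> pd.DataFrame:
    """Cast data's columns to the dtypes carried by the template row."""
    return data.astype({c: template[c].dtype for c in template.columns if c in data})

# Whatever types the files happened to produce, the output type is now fixed.
raw = pd.DataFrame({"Dist min EdgeA-EdgeB": ["0", "0.425"]})
print(redefine_types(raw, template).dtypes)  # float64, on every run
```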
I have attached a workflow as an example of the above.
Hi, there is a tick box on Advanced Settings, “Append path column”, where you give it a column name (the default being “Path”).
There are other ways of doing it. For example, if you are using the loop construct, you could use “Path to String (Variable)” to copy the loop “path” variable to a String variable, and then use a “Variable To Table Column” node to create a column from that String variable on each iteration.
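For comparison, here is what appending the source path looks like as a pandas sketch (the “Path” column name matches the KNIME default; the folder is an assumption):

```python
import glob
import pandas as pd

# Tag each row with the file it came from, like "Append path column".
frames = []
for path in sorted(glob.glob("input_folder/*.csv")):  # hypothetical folder
    df = pd.read_csv(path)
    df["Path"] = path
    frames.append(df)
table = pd.concat(frames, ignore_index=True)
```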