I want to preprocess some CSV files, and I am using the Iterate List of Files node to do this.
My problems are the following:
- Some of the CSV files contain missing values, denoted by '?'. On such files the Variable Based File Reader works well, but a newly added CSV Reader produces an error: "Didn't get any value for column(s) with index......". Is this because the CSV Reader determines each column's type from its values? How do I read CSV files with missing values using the CSV Reader node?
- If I try to process some ARFF files in the same way (i.e., iteratively), the Variable Based File Reader produces the following error: "Execute failed: Too few data elements......". This seems to be because the Variable Based File Reader cannot parse the ARFF format. However, the Variable Based File Reader cannot be deleted, so the same error still appears even when I use an ARFF Reader node instead. Could anybody tell me how to process ARFF files iteratively, and how to delete the Variable Based File Reader node?
- I use the Missing Value node to handle the missing values. However, I found that it cannot handle fully unknown columns (i.e., columns in which every value is '?'), and it produces a warning like this: "Column(s) [......] contain(s) no valid cells", even if I select the remove row option (removing the column would seem to be the preferable option here). Could anybody tell me how to handle such unknown columns?
- I use the Auto-Binner node to discretize numeric data, and I want to output the bins as new columns and delete the original columns. The option "Replace target column(s)" seems to be the right choice, but it behaves in a confusing way. Has anybody used this option successfully? I have also tried the One2Many node, but it leaves the original columns in place. Any suggestions?
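
To make the intended result concrete, here is a rough sketch of the same steps outside KNIME, in Python with pandas. It is only an illustration of the goal, not of how the KNIME nodes work internally. The function name `preprocess`, the sample data, and the bin count of 3 are all assumptions made up for this example.

```python
import io

import pandas as pd


def preprocess(csv_source, n_bins=3):
    """Hypothetical helper: read one CSV and apply the steps from the list above."""
    # Treat '?' as a missing value so that numeric columns keep a numeric type.
    df = pd.read_csv(csv_source, na_values="?")
    # Drop columns that contain no valid cells at all.
    df = df.dropna(axis=1, how="all")
    # Drop the rows that still contain a missing value.
    df = df.dropna(axis=0, how="any")
    # Replace each numeric column with its equal-width bin index,
    # so the original numeric values are not kept.
    for col in df.select_dtypes(include="number").columns:
        df[col] = pd.cut(df[col], bins=n_bins, labels=False)
    return df


# Made-up sample: column 'b' is entirely unknown, column 'a' has one '?'.
sample = io.StringIO("a,b,c\n1.0,?,x\n2.0,?,y\n?,?,z\n9.0,?,x\n")
result = preprocess(sample)
print(result)
```

To process several files, the same function could be called in a loop over the file paths. If one indicator column per bin is wanted (similar in spirit to One2Many), `pd.get_dummies` with its `columns` argument replaces the listed columns with indicator columns.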