I need to handle a CSV file in KNIME and I would appreciate help with which nodes I should use in the following situations:
1- Rearrange the order of the columns.
2- Rename the header of the columns.
3- Add new columns (these new ones will be “blank”, but need to be there in order to provide the right structure/sequence of columns in the expected output CSV).
4- Split the table into two or more new tables based on the content of the rows of the column “category” (e.g. finance, process, law). So, each new table is basically a filtered version of the big table generated in step 3 (above).
5- Export the tables to new CSV files to be handled outside KNIME.
Thank you very much,
1 - I believe you are looking for the Column Resorter.
2 - Column Rename
3 - Many nodes can do this; the easiest would be the Rule Engine (with an empty default value) or one of the Java Snippet nodes (return null or an empty string).
4 - I would use a loop, most probably the Group Loop Start, and also do 5 within the loop using the CSV Writer. However, you have not specified the naming convention for the file names, so it might be a bad fit for your use case.
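For anyone who wants to check the expected result of steps 1 to 3 outside KNIME, here is a minimal Python sketch of the same idea: reorder, rename, and pad with blank columns in one pass. It is not what the KNIME nodes do internally, and the column names (`id`, `amount`, `Reference`, and so on) are made up for illustration, since the thread only mentions “category”.

```python
import csv
import io

# Hypothetical input; only the "category" column is named in the thread.
raw = "id,category,amount\n1,finance,10\n2,law,20\n"

# Hypothetical target layout, in output order:
# (output header, source column) where None means a new blank column.
layout = [
    ("Category", "category"),
    ("Reference", None),  # new column, intentionally left blank
    ("Amount", "amount"),
    ("ID", "id"),
]


def restructure(rows, layout):
    """Reorder, rename, and pad each row according to the layout."""
    return [
        {new: (row[old] if old is not None else "") for new, old in layout}
        for row in rows
    ]


rows = list(csv.DictReader(io.StringIO(raw)))
result = restructure(rows, layout)

# Write to an in-memory buffer so the sketch does not touch the file system.
out = io.StringIO()
writer = csv.DictWriter(
    out, fieldnames=[new for new, _ in layout], lineterminator="\n"
)
writer.writeheader()
writer.writerows(result)
print(out.getvalue())
```

The single `layout` list plays the role of Column Resorter, Column Rename, and the blank-column step together, which makes it easy to compare against the CSV that the workflow produces.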
Hope this helps, gabor
Use a CSV Reader or File Reader node.
1. Column Resorter.
2. Column Rename.
3. Rule Engine node. Set the default label to an empty string in the node configuration.
4. For a one-off situation, the easiest way for beginners is a number of Row Filter nodes, all connected to the Rule Engine node, one per category: put each category string into the column value matcher box. If you want to automate this part for the future, you can use the Group Loop Start and Loop End nodes with the grouping column that contains your classes. In that case point 5 needs to be inside the loop, and you will need to use flow variables to set up automated output names. As I say, this bit is not for beginners.
5. CSV Writer node.
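To make the logic of points 4 and 5 concrete, here is a rough Python sketch of what the loop variant amounts to: group the rows by category, then write one CSV per group with a file name derived from the category value (the job the flow variable does in KNIME). The sample rows and the `<category>.csv` naming are assumptions, since no naming convention was given, and category values containing characters that are invalid in file names would need cleaning first.

```python
import csv
import tempfile
from collections import defaultdict
from pathlib import Path

# Hypothetical rows; only the "category" column comes from the thread.
rows = [
    {"id": "1", "category": "finance"},
    {"id": "2", "category": "law"},
    {"id": "3", "category": "finance"},
]


def split_by_category(rows, key="category"):
    """Group rows by the value in the key column (one group per category)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


def write_groups(groups, out_dir, fieldnames):
    """Write one CSV per group; the file name is built from the group name."""
    paths = {}
    for name, group in groups.items():
        path = Path(out_dir) / f"{name}.csv"  # assumed naming convention
        with open(path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.DictWriter(fh, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(group)
        paths[name] = path
    return paths


out_dir = tempfile.mkdtemp()  # temporary folder, so nothing is overwritten
paths = write_groups(split_by_category(rows), out_dir, ["id", "category"])
print(sorted(p.name for p in paths.values()))
```

Because the file name is computed from the data, this handles any number of categories without adding a filter per category, which is the same reason the loop scales better than a set of Row Filter nodes.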
I hope this helps,
Thanks a lot for the replies. Based on that I could create the workflow, obtaining the outputs I was expecting.
Can a single node like the R snippet give multiple outputs to corresponding CSV Writer nodes?
For example, output 1 from the R snippet goes to CSV Writer 1, output 2 (from the same snippet) goes to CSV Writer 2, etc.
Please find a screenshot of my workflow and let me know how to use these nodes. I would like to achieve the same output as above: splitting a table into several tables and saving them to different files based on the different values in my outcome column. I see that the CSV Writer needs to be put within the loop, but I am not sure how, since the CSV Writer doesn't have an output port.
I have 20 categories in a column to be separated into 20 files from a single table, so Row Filter doesn't seem like an ideal way to do this. Using the Group Loop Start and Loop End nodes it looks achievable, but I am not sure how.
I am not sure how to insert the CSV Writer node within this loop. Please find a screenshot of my workflow and help me out with this task. Thanks.