How to avoid duplicate values when appending in CSV Writer

mehrdad_bgh · September 26, 2020, 5:03pm

Hi all,
I have Excel data with 205 rows and split it into 2 parts:
1. 201 rows
2. 5 rows
Now when I write them separately with the CSV Writer, row number 201 gets duplicated.
By the way, I used the append option in the CSV Writer to combine them, and I don’t want to use the Concatenate or Duplicate Row Filter nodes.

knimediger · September 26, 2020, 7:19pm

@mehrdad_bgh

Welcome to KNIME.
The Duplicate Row Filter node (https://nodepit.com/node/org.knime.base.node.preproc.duplicates.DuplicateRowFilterNodeFactory) might help you solve your issue.

mehrdad_bgh · September 27, 2020, 2:19am

Thanks for answering, but as I said, I don’t want to use the Concatenate or Duplicate Row Filter nodes.
Is there any setting in the CSV Writer to do this task?

mlauber71 · September 27, 2020, 6:44am

The CSV Writer just appends whatever data you send it. If you want to avoid duplicates, you have to remove them before writing.

If you want to refer to an ID you would have to read this back from the existing CSV and increase the number so it would start at the next possible value. Or you could use a database with primary keys that would reject any new entry violating the rule.
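Outside KNIME, the "read the IDs back and only append what is new" idea could look roughly like the Python sketch below. It is only an illustration: the function name `append_new_rows` and the assumption that the first column holds a unique ID are made up for this example and are not part of the CSV Writer node.

```python
import csv
import os


def append_new_rows(path, header, rows, key_index=0):
    """Append rows to a CSV file, skipping rows whose key already exists.

    Assumption: the column at key_index holds a unique ID.
    Returns the number of rows actually written.
    """
    file_exists = os.path.exists(path) and os.path.getsize(path) > 0

    # Read the IDs that are already in the file (if there is one).
    seen = set()
    if file_exists:
        with open(path, newline="") as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header line
            seen = {row[key_index] for row in reader if row}

    written = 0
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        if not file_exists:
            writer.writerow(header)  # write the header only once
        for row in rows:
            key = str(row[key_index])
            if key in seen:
                continue  # duplicate ID: do not append it again
            writer.writerow(row)
            seen.add(key)
            written += 1
    return written
```

Calling it first with rows 1 to 201 and then with rows 201 to 205 would append only rows 202 to 205 on the second call, because ID 201 is already in the file.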

Maybe you could give us an example of what you want to do.

From my understanding this would be 206 rows, so there would already be one duplicate.

mehrdad_bgh · September 27, 2020, 8:13am

Thank you @mlauber71.
First: I want to read a 205-row XLS file with the File Reader, limit it to the first 201 rows, and then convert it with the CSV Writer.
Second: read the exact same XLS file with the File Reader, limit it to the last 5 rows, convert it with the CSV Writer, and save it under the same name as the first one.
The row counts of the XLS file and the CSV file should be equal.
But I ran into the duplicate issue and don’t want to use any additional nodes, if possible.

mlauber71 · September 27, 2020, 10:01am

I am not exactly sure what this is good for or why you want to do it, but this is also possible. You would have to specify the range of the data you want to import in the Excel Reader node correctly.

kn_forum_227656_avoid_duplicate_csv_excel.knwf (53.0 KB)

mehrdad_bgh · September 27, 2020, 10:18am

@mlauber71
I did use that for the split.

And this task is for homework :)
Thank you again.

Daniel_Weikert · September 27, 2020, 10:41am

You read row 202 twice, don’t you?

mehrdad_bgh · September 27, 2020, 1:04pm

Yes, according to the task I have to do this.

armingrudd · September 27, 2020, 3:22pm

Hey Mehrdad,

As explained and demonstrated by @mlauber71:
When you are asked to read the first 201 rows of an Excel file that has column names in row 1, you enter 201 as the last row. Since the first row contains the headers, the imported table contains 200 data rows plus the headers. When you then read the remaining rows, i.e. the next 5 rows (202 to 206), and still need the headers, you enter those row numbers and keep reading the headers from row 1, so this time the imported table contains 5 rows, as expected.
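To make the row arithmetic concrete, here is a small Python sketch that mimics the range settings on a plain list instead of a real Excel file. The helper `read_range` and the toy `sheet` are hypothetical and only model the idea of "rows first..last, header in row 1"; they are not the Excel Reader node itself.

```python
def read_range(sheet, first_row, last_row, header_row=1):
    """Return (header, data_rows) for 1-based, inclusive sheet rows.

    The header is always taken from header_row and is never
    counted as a data row.
    """
    header = sheet[header_row - 1]
    data = [
        sheet[r - 1]
        for r in range(first_row, last_row + 1)
        if r != header_row
    ]
    return header, data


# Toy sheet: row 1 is the header, rows 2..206 hold 205 data rows.
sheet = [["id"]] + [[i] for i in range(1, 206)]

_, part1 = read_range(sheet, 1, 201)      # header + 200 data rows
_, part2 = read_range(sheet, 202, 206)    # the remaining 5 data rows
_, overlap = read_range(sheet, 201, 206)  # starts one row too early

print(len(part1), len(part2), len(overlap))
```

With non-overlapping ranges the two parts add up to the 205 data rows exactly. Starting the second range at sheet row 201 re-reads the last row of the first part, which is the kind of overlap that produces a duplicate after appending.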
