I hope I can accurately explain what I want to do.
1- There is a fixed Excel file, for example a.xls. New data is added to it every day, so the number of records keeps growing (nothing is deleted).
- For example, today there is newly arrived data that should be added to a.xls.
- Rule 1. The new data is compared with the existing data on a reference column to check for repeated records. No duplicate entries are allowed.
- Rule 2. New data that is not a duplicate is written to a.xls (without deleting the data already in the file).
2- The new data added to a.xls is identified and written to b.xls, which serves as the daily report (this file is overwritten every day; old data is not kept).
An example of this would be great for me. I hope I was able to explain it correctly.
Many thanks for any opinions and help.
Note: A sample workflow would be very educational for me.
Try to do something with https://nodepit.com/node/org.knime.base.node.preproc.duplicates.DuplicateRowFilterNodeFactory
This node will tell you which rows are unique (b.xls, new data), chosen (a.xls, old data), and duplicate (rows to delete).
I made a draft workflow, but all I need is to compare the new records with the archive file,
then add only the new records to the archive file and to the daily report file.
How can I export just the new records?
The solution thus reads an archive file, to which data is transferred every day. New incoming data is parsed and checked before being transferred to the archive file and the daily report file. Finally, only the new unique records are transferred to both the archive file and the report file.
Concatenate the archive and the new data, then use the Duplicate Row Filter, which will find the new data (the node adds a classification column).
Select “keep duplicate rows” and then filter on unique.
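For anyone who wants to see the logic outside of KNIME, here is a minimal Python sketch of the same idea: concatenate, classify each row as unique / chosen / duplicate, then filter. It is not the node's implementation. The function name `classify_rows`, the `url` key column and the `source` marker column are assumptions made up for the illustration. Note that a row from the archive that never reappears is also classified as unique, so the sketch uses a marker column to keep only the unique rows that came from the new data.

```python
from collections import Counter

def classify_rows(rows, key):
    """Label each row by how often its key value occurs (hypothetical helper).

    'unique'    -> the key value occurs exactly once
    'chosen'    -> first occurrence of a key value that occurs more than once
    'duplicate' -> any later occurrence of that key value
    """
    counts = Counter(row[key] for row in rows)
    seen = set()
    labeled = []
    for row in rows:
        value = row[key]
        if counts[value] == 1:
            status = "unique"
        elif value not in seen:
            status = "chosen"
        else:
            status = "duplicate"
        seen.add(value)
        labeled.append({**row, "status": status})
    return labeled

# Archive rows first, then today's rows (the concatenation step).
archive = [{"url": "a.com", "source": "archive"},
           {"url": "b.com", "source": "archive"}]
incoming = [{"url": "b.com", "source": "new"},
            {"url": "c.com", "source": "new"}]

labeled = classify_rows(archive + incoming, key="url")

# Keep only unique rows that came from the new data.
new_records = [r for r in labeled
               if r["status"] == "unique" and r["source"] == "new"]
```

Here only the `c.com` row ends up in `new_records`; `b.com` is already in the archive, so its new copy is marked as duplicate.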
Update: I am sharing the latest version of the workflow for those who are looking for a solution to a similar problem.
As a result:
1- The archive file is read. If there are records to be added, the archive file is updated with them, so the new records are added to the Excel file. (It does not simply append, i.e. archive + new records = new archive; it combines the data, compares it, and rewrites the file.)
2- New records that are not in the archive file are identified by comparing the URL address, and a report is generated for those new records.
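The two steps above can be sketched roughly like this in Python, for readers who want to compare against a scripted version. This is only a sketch under assumptions, not the shared workflow: it uses CSV files (a.csv, b.csv) in place of a.xls and b.xls so that only the standard library is needed, and the names `read_rows`, `write_rows` and `daily_update` as well as the `url` key column are made up for the example.

```python
import csv
import tempfile
from pathlib import Path

def read_rows(path):
    """Read all rows of a CSV file; a missing file counts as an empty archive."""
    if not Path(path).exists():
        return []
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

def write_rows(path, rows, fieldnames):
    """Overwrite the file with a header line and the given rows."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)

def daily_update(archive_path, report_path, incoming, key="url"):
    """Add unseen records to the archive and overwrite the daily report."""
    archive = read_rows(archive_path)
    known = {row[key] for row in archive}
    new_rows = []
    for row in incoming:
        if row[key] not in known:      # Rule 1: no repeated records
            known.add(row[key])
            new_rows.append(row)
    if archive:
        fieldnames = list(archive[0].keys())
    elif incoming:
        fieldnames = list(incoming[0].keys())
    else:
        fieldnames = [key]
    # Rule 2: the archive keeps its old rows and gains the new ones.
    write_rows(archive_path, archive + new_rows, fieldnames)
    # The report holds only today's new rows; yesterday's content is dropped.
    write_rows(report_path, new_rows, fieldnames)
    return new_rows

# Two simulated days, run in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    archive_path = Path(tmp) / "a.csv"
    report_path = Path(tmp) / "b.csv"
    day1 = daily_update(archive_path, report_path,
                        [{"url": "a.com", "title": "A"},
                         {"url": "b.com", "title": "B"}])
    day2 = daily_update(archive_path, report_path,
                        [{"url": "b.com", "title": "B"},
                         {"url": "c.com", "title": "C"}])
    archive_urls = [r["url"] for r in read_rows(archive_path)]
    report_urls = [r["url"] for r in read_rows(report_path)]
```

After the second day the archive contains a.com, b.com and c.com, while the report contains only c.com, which matches the behaviour described in points 1 and 2.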