QA Checking workflow for duplicates

QA Check Automation.knwf (120.2 KB)

I’m trying to set up a workflow that includes some automated QA checks: basically, to write out a file if certain criteria are true.

I’m starting with a duplicate check, but I’m afraid I may be way overcomplicating things and might need some help thinking this through.

I’m reading in an Excel file, and I need to be sure that two specific columns contain no duplicates. If either column contains a duplicate, I need to write the duplicates out to a file for review; if there are no duplicates, the data should continue through the workflow.

So far, my thinking is to have an Excel Reader feed two Duplicate Row Filter nodes, each followed by a Row Filter set to exclude rows labeled “unique”, leaving only the rows labeled “chosen” or “duplicate”. I would then compare the row count of the results with the row count of the Excel file. If the row counts differ, I need the file written; if not, the workflow should go to the next step with the full dataset from Excel.

I’m attempting to use an If Switch to determine whether one, both, or neither of the Excel files gets written, and then continue with the workflow. However, the output of the End If node is a blank table, since all the QA steps reduce the dataset in order to check things.

So I have a couple issues:

  1. How do I pass the original dataset to the If output? Or am I just totally on the wrong track?
  2. Running the “End If” node doesn’t trigger the active nodes in the “End If” nodes

To my mind, a pivot table in Excel looks like a more robust tool for the task, unless the input table is huge.

I appreciate the response. Since I have a lot more downstream nodes, I want to do this and several other QA pieces within a workflow. But also, yes, the table gets very large, and Excel often chokes. I only included a tiny piece of the data in the example.
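
In case it helps to think the logic through outside of the node layout, here is a rough sketch of the duplicate check described above in plain Python. It is not KNIME code and not taken from the attached workflow; the column names (order_id, serial), the function names, and the report file name are all made up for illustration. The point it tries to show is that the QA branch only produces a side report, while the original table is passed on untouched, so nothing downstream ever sees a reduced or empty dataset.

```python
import csv
import os
import tempfile
from collections import Counter


def find_duplicates(rows, column):
    """Return every row whose value in `column` occurs more than once."""
    counts = Counter(row[column] for row in rows)
    return [row for row in rows if counts[row[column]] > 1]


def qa_duplicate_check(rows, columns, report_path):
    """Write a review file only if duplicates exist; always return the full input."""
    flagged = []
    for column in columns:
        for row in find_duplicates(rows, column):
            # Record which column triggered the flag alongside the row itself.
            flagged.append({"duplicate_in": column, **row})
    if flagged:
        with open(report_path, "w", newline="") as handle:
            writer = csv.DictWriter(handle, fieldnames=list(flagged[0].keys()))
            writer.writeheader()
            writer.writerows(flagged)
    # The original rows are returned unchanged, whatever the QA result was.
    return rows, flagged


# Hypothetical sample data: order_id "A1" appears twice, serial has no repeats.
rows = [
    {"order_id": "A1", "serial": "S1"},
    {"order_id": "A2", "serial": "S2"},
    {"order_id": "A1", "serial": "S3"},
]
report = os.path.join(tempfile.gettempdir(), "duplicates_for_review.csv")
full, flagged = qa_duplicate_check(rows, ["order_id", "serial"], report)
```

With this sample, only the two rows sharing order_id "A1" end up in the review file, and full is still the complete three-row table. The same shape should carry over to other QA checks: each check writes its own report when it finds something, and the main data path never depends on the filtered result.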