Data Files validation based on the Date range.

Hello Team,

I am looking for a solution to address the following scenario and would appreciate your kind support.

  • Page 1 contains rows with unique IDs and their corresponding Start Date and End Date (each unique ID appears only once).

  • Page 2 contains the same unique IDs, but each ID is split into multiple rows based on different Date values.

Example:

  • On Page 1, a unique ID (say ID = 1) has the following range:

    • Start Date: 01/08/2025

    • End Date: 01/10/2025

  • On Page 2, the same unique ID appears as:

    • Row 1: 01/08/2025

    • Row 2: 20/09/2025

    • Row 3: 01/10/2025
      Regards

    • Madhusudhan j

You’ve described the format of the data. Could you explain what you want to do?

2 Likes

i want to compare both sheets based on unique id matching date range , when page 2 dates of multiple rows is falling between start date and end of page 1 , then i should get result True else false.

So in your example data the start and end dates would be false and the middle date(s) would be true?

greater or equal to start date and less than equal to end date should be True , else false.

This should work. You’ll need to have the dates configured as actual dates not strings.

Date Range Flag.knwf (69.5 KB)

1 Like

Thank you very much for your kind support so far. However, I’m facing one additional challenge and would like your guidance.

Here’s the scenario:

  • Page 1 contains one row per unique ID, where the unique ID is a combination of Employee Number and Type_Wage.

    • Example:

      • Unique ID = Employee Number + Type_Wage

      • Start Date = 10-Aug-2025

      • End Date = 15-Oct-2025

  • Page 2 contains multiple rows for the same unique ID (e.g., 4 rows), with different individual dates under a single date column:

    • Example:

      • Unique ID = same (Employee Number + Type_Wage)

      • Dates = 11-Aug-2025, 02-Sep-2025, 01-Aug-2025, and 11-Oct-2025

Validation logic required:

  1. Match the unique IDs between Page 1 and Page 2.

  2. Check whether each date from Page 2 falls within the Start Date and End Date range from Page 1 (i.e., Start Date ≤ Page 2 Date ≤ End Date).

  3. If it falls within the range, mark as “Match”; otherwise, mark as “Mismatch.”

In the above example, three dates match, but 01-Aug-2025 is a mismatch.

Please note that the data will be read directly from the Excel Reader, not from a manually created table.
For knime while comparing multiple rows of page 2 with empty rows of page 1 gives u lot of mismatch even though page 2 rows are correct as per Dates logics.

Kindly assist n how this validation can be implemented.
Regards

Madhusudhan J

there are a plethora of ways how to tackle your very simple problem. some are more verbose and others are more efficient.

instead of just asking for a solution, why don’t you provide an upload of your progress and explain what your issue is in that step?

(alternatively, if you are using LLMs to generate texts, you may be better of just asking it for a python solution straight away and use that (either skipping knime or dumping it into a knime python node))

Validation Sample Data.xlsx (11.3 KB)
Here is the sample date validation file , i need similar way of knime solution.
First check date range in page 2


Then check and loop Page1 as similar like below , match ID first then check if the Dates on Date column is between the Dates range of page 2 Start and End dates (Date column dates >= Start date of page and <= to End date of Page2 date) . The give result match for dates matching rows else mismatch as similar as below, i have thousands of rows similar pattern.


Please assist.

Your sample data was incomplete. I added the second ID.

Date Range Flag rev 1.knwf (93.6 KB)

1 Like

Thank you once again.
Just a small clarification — the ID has been added in both sheets (Column D in Page 1 and Column E in Page 2). Also, since Page 2 contains only one row, when you join the two sheets using the Joiner node, you would ideally get empty rows for the Page 2 entries during comparison. As a result, mismatches are expected when the data is compared against those empty rows.
So ideally i should get match result like below , even though comparing column i mean page 2 empty but Page 1 row value is matching and falling in between the start and end dates like below.


Example attached below,

I’m thoroughly confused by your latest output table. The first row in effect joins two different IDs. Do you only want to check the date ranges and ignore the IDs?

Data Format is like that , we have only one row item in page 2 for each unique ids which consists Start and End date.

Where as page 1 consists multiple row items for each unique ids of page 2 which consists only one date column.

Basically it is an absence report, we need to check first unique ids of page 1 is matching with page 2 unique ids , and then check date column of page one is >= Start date <= end date of matching unique ids in page 2 , then i should get result as match in page 1 for all Matching row else mismatch. As I said challenge is empty rows comparing after joining two files. Due to one unique row items for each unique ids

Screenshots don’t help. Could you provide a data file with an example of page 1, page 2 and the expected output?

1 Like

You can just copy paste the page two date next to page 1 table and compare row wise. I have attached excel file as well for your kind reference

There’s no attachment. If you want help on the Forum its best to provide data rather than making potential helpers do extra work with cutting and pasting. Remember we’re volunteering our personal time.

2 Likes

Validation Sample Data.xlsx (13.7 KB)
please find the attached sample report

Page 1 in your last upload appears to be the desired output. I need the original input.

Validation Sample Data.xlsx (11.3 KB), here is the sample report.

I’m still confused. What are you joining on to get this output with two different employee IDs in the same row?