filter rows based on date - week

Hi Team,

I have dates in two columns. 1st column can have multiple month & years. 2nd column has only previous month dates. I need to filter rows based on the below criteria.

if 2nd column date is on Sunday, then i need either Sunday, Saturday or Friday dates from 1st column (if all Sunday, Saturday and Friday are available, then i need all the 3)
For example: 2nd column date 2019-03-03 (Sunday) then i need either 2019-03-03 or 2019-03-02 or 2019-03-01 or all of the above if available.
else
2nd column date = 1st column date.

To achieve this, I extracted year and month from both the columns using 2 separate string manipulation nodes and then rule based row filter (i could have used date & time extractor, but it creates extra 2 columns) to filter previous month dates from 1st column.

Then i extracted day of the week(number) from both the columns and used rule-based row filter, but getting the below error. I think it is because of data type issue.

i tried with rule engine, column expression, but could not find the required output.

Attached is the dates excel file for your reference.

Could someone assist on this please.

Date Filters.xlsx (37.5 KB)

Rule-base filter does not allow arithmetic operations, only logical one.

1 Like

Hi there!

don’t quite understand what are you trying to accomplish - only filtering or filtering + changing values? What do you do in case when 2nd column date is not Sunday?

Maybe easiest would be if you can provide small input data and expected output.

Br,
Ivan

Hi Team,

Finally i was able to get the desired output with some struggle (the taste of success after struggle is sweet…:slight_smile: ) very happy for that. I have done this using VBA(macros), but wanted to do it in KNIME.

Here are the steps i followed.

  1. I filtered previous month dates from 1st column by using substr function in String Manipulation node for both the columns.
  2. Find the no of days between the 2 columns using date&time difference (days).
  3. Filtered row only that are greater than or equal to -2 using rule based row filter.
  4. Extracted day from 2nd column which is sunday.
  5. Excluded rows day not equal to Sunday and no of days not equal to zero.

I tried all these conditions/criteria’s both within loop and out side loop, i found the time is identical for both (2 mins).

I would like to know that, is there any optimized way of doing the same.

Once again a big thanks to KNIME for providing the opportunity to learn ETL and Analytics.

Regards,
Pavan.

3 Likes

Hi Ivan,

I have attached the sample data in my first post.

I feel you are true leader.

Regards,
Pavan.

1 Like

I would recommend you to look at Column Expression node to combine multiple column manipulation in one node.

Hi @izaychik63,

I did tried with Column Expression node, but some how couldn’t get it right.

This is what i am trying to achieve.

if column 2 is sunday, then i should get array of column 2 value or column 2 value -1 or column 2 value -2,
else
column 1 value.

it would be great if you could assist on this.

Regards,
Pavan.

Column Expression can replace String Manipulation and Math Formula nodes not Row Filter’s nodes.

Then i might need to use multiple nodes to get the required output…little lengthy.

Regards,
Pavan.

Hi Pavan!

I have seen your input file but still missing the big picture. If you can provide smaller input sample (10-20 rows) and desired output with logic explained if not obvious I think your workflow can be optimized.

Br,
Ivan

Hi Ivan,

I need to find the total number of queries worked by the team for the previous month. So for this i have the below columns.

Last Query Raised Date
Today(Reporting Date)

Users work from Monday to Friday and the completed work will be reported the next day stating that they have completed “x” numbers yesterday(mentioning the date).

For example: if Today(Reporting Date) is equal to Last Query Raised Date(the date when the action has been taken), then filter the date. This is a straight forward one.

Today(Reporting Date) Last Query Raised Date
Monday Monday
Tuesday Tuesday
Wednesday Wednesday
Thursday Thursday

The worked completed on Friday will be reported as Sunday, so for that it should be like below

Today(Reporting Date) Last Query Raised Date
Sunday Friday
Saturday
Sunday

This is in brief on the required output and the conditions/logic to be applied.

Regards,
Pavan.

Hi Pavan,

Now I get it! Sometimes I’m a bit slow :smiley: Also I did notice now you have an output column in your Excel which made thing a bit easier :open_mouth:

Here is optimized workflow:

I used GroupBy node to get rid of duplicates and pretty much all other logic is in Rule-base Row Filter node. Data is included in the workflow attached. Take a look and if any questions feel free to ask.
2019_05_08_Special_Row_Filtering.knwf (53.8 KB)

Br,
Ivan

2 Likes

Hi Ivan,

Seems like i need to update KNIME as i see a pop of version conflict while accessing the workflow shared by you. Let me update and check the optimized it.

Regards,
Pavan.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.