Find rows between two date & time

Hi Knimers,

I have two tables with data.

Table 1 contains:

  • Login time (2023-12-11T00:31:21)
  • Logout time (2023-12-11T00:39:23)
  • IP Address (8.8.8.8)

Table 2 contains:

  • Transaction details (many fields like from account, to account, channel etc)
  • Transaction date & time until minutes (2023-12-11T00:33)

I want to match these two tables and get the IP Address from the Table 1 to Table 2 to identify the Transaction IP address.

Any suggestions would be highly appreciated.

Hello @kamtaot ,

Can you provide sample data ?

1 Like

@tqAkshay95, thank you for responding. Actually, I gave the dummy data within brackets (one record each). Is that not enough? If you want it in a tabular format, I have to do lot of sanitizations before providing the dummy data. Let me know if you still expect to see more than one record.

@kamtaot above data is not capable of mapping these two table, So need clear columns name and data.

1 Like

Hi @kamtot,

As @tqAkshay95 says, having some actual data is necessary if somebody is going to provide an actual workflow demo to assist with this. Proving a workflow involving matches works for a variety of scenarios requires more than a single row of data in both tables!

It doesn’t have to be real data that has been sanitized - you can just make some up. Otherwise somebody else who wants to assist by providing an actual workflow will have to write some from scratch, and being forced to use Table Creator is one of the worst aspects of answering any forum question :wink:

That all said, there isn’t a built in joiner node that can do “match between”. Typical solutions to this problem would involve using cross join to join everything in table 1 to table 2 and then use Rule Based Row Filter to remove all the rows where the transaction time is not between the login and logout times.

There are other approaches. For example, you could like at my “join between” component, which may work depending on your data set:

If your data set is quite large, Join Between may not be sufficiently optimised. In that case you could try the Join Custom Condition - indexed component, which is better performing but requires you write the required join in SQL syntax, but it wouldn’t be that difficult.

2 Likes

@takbb In your scenario " using cross join to join everything in table 1 to table 2 and then use Rule-Based Row Filter to remove all the rows where the transaction time is not between the login and logout times "

but many users can log in and logout at the same time or in between times so the possibility is users can transactions in between so, It is difficult to map the correct transaction to an IP.

1 Like

Hi @tqAkshay95, I guess that’s down to the data.

I have to assume that somewhere in the data there is sufficient information to make the required join and filters to match a Table 2 entry with a Table 1 entry, but if there isn’t then as you say there is no way to map the transaction to the IP address.

@kamtaot - this is an example of why it is necessary to have sample data. What @tqAkshay95 has said is quite right: from the information you have provided, there is no solution unless only a single user connects at any one time.

2 Likes

Apologies @tqAkshay95, due to some emergency, I had to take off and could not respond to you immediately. I will post dummy data in some time.

1 Like

logins.csv (16.9 KB)
transactions.csv (6.6 KB)

I am sharing some dummy data here. Please note that in the transactions.csv the Ref_No is the unique value for each transaction.

Thank you @takbb for your expanded solution. I will try and see if it suits my requirement.

Thank you @tqAkshay95 and @takbb. Yes, the data is related to only one user. In any case, I have provided some dummy data. Please look into it. I am also searching forums if there is any existing solution which I can use.

1 Like

Hello @kamtaot ,

I have achieved the desired result using a Java Snippet node in KNIME.

I’ve attached images for your reference.

Screenshot 2024-01-22 123134

output:

Screenshot 2024-01-22 123233

Hi @tqAkshay95, thank you for the prompt response. Beyond writing “Hello, World!” I do not know how to write Java code. Do you mind sharing the workflow with me which you prepared for this activity? I will try to learn/understand and modify it according to my necessity.

Hello @kamtaot,

Please find the attached workflow for your reference.

Q27.knwf (21.2 KB)

Hi @tqAkshay95 ,

As java snippet itself can only handle a single table, it is reliant on the rows already having been joined in a suitable manner, so that it can perform the required comparisons. In your workflow, whilst the joiner may work under very specific data conditions, I believe it would not work under most circumstances because it can only handle exact joins which we don’t have here. I can’t test your workflow as I don’t have access to your input xlsx files, but I think your java snippet would work, if you used a Cross Joiner instead of a Joiner.

This is what I meant earlier about “using cross join to join everything in table 1 to table 2”. It is unfortunately a necessary step.

@kamtaot , thank you for supplying demo data. The attached workflow demonstrates the suggestions from my previous comment, using your uploaded data.

The first two branches demo the mentioned components. The third branch uses standard nodes.

Join Between Component

Join Custom Condition - indexed Component

Of those two “component” methods, the second requires basic knowledge of sql syntax but with that “power” it gives more options as the conditions can be made more complex. It is also more performant on larger data sets because the component creates indexes on the columns used in the condition. I may at some point get round to trying to add indexes within the other component too, but that’s not there yet!

From your data, it looks like sometimes a row in one table matches to more than one row in the other table, so you may need to include additional filters, for example if there are any other “column matching conditions”, you could add a Rule Engine at the end to include those.

In the “Join Custom Condition” component, you could simply include additional columns in the condition clause.

At the end, you can add column filters (or configure the components) to keep the columns that you want, such as IP Address.

join where one date on table 1 between two dates on table 2.knwf (201.5 KB)

3 Likes

@tqAkshay95 , Like @takbb mentioned, I could not join those two tables as there is no common field between these two tables. However, I sincerely thank you for the efforts.

2 Likes

Hi @takbb , I used the Join Between and it is amazing to see the result. I sincerely appreciate your efforts in pointing me in the right direction.

3 Likes

You’re welcome @kamtaot. Glad to hear it works for you, and thank you for marking the solution.

Thanks @takbb for the clarification.

1 Like

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.