Reusing Index Query results

I have a workflow that is using a loop based on 'Table Row To Variable Loop Start'. This drives an Index Query followed by some Empty Table Switch logic that either adds the results of the query or an empty row to the output table.

I'm processing over 500,000 rows and the indexed table has a bit over 3,500,000 entries.

The variable that is driving the loop is not unique in the table so if I sort the input table on the variable I will end up with runs of rows where the value of the variable is the same. Under these circumstances is there any way I can reuse the results of the first query for a particular value if it is the same as the previous value instead of doing the Index Query for each and every row?

Doing this will reduce the number of index lookups to 1/5 the current number and could significantly improve performance (but until I try it I wont know for sure).

Any suggestions?

cheers,
Andrew

I am not sure whether I understand your workflow, but what about a Group Loop Start node before your loop?

Best,
Marc

Thanks Marc,

I'll have a look at that, it certainly handles the grouping nicely.

My workflow is enriching firewall log data with geolocation and ASN information for public IP addresses. So potentially large input tables and also very large lookup tables. I'm trying to minimise lookups by reusing the results from a previous lookup if the IP address is the same as the previous one.

This allows me to tag each record in the log which is great for flexibility with future analysis but bad from a performance perspective. The alternative is to leave the enrichment to the last moment for any particular piece of analysis so that I'm working on reduced and summarised data.

New to KNIME so just exploring options.

Throwing it all into a database and doing a nice big JOIN might actually be the best way to do it. I can also do the geolocation in R which is reasonably quick, it's the ASN enrichment that is the real issue at present.

cheers,
Andrew