Database 'column' query

Dear All,

I often find myself wanting to take a column of IDs and essentially loop-over these to bring back data from various databases.

This can be done by first connecting to the database, then using a 'TableRow to variable loop start' node to feed the IDs to a 'Database Query' node:

 

SELECT * FROM #table# t1
 
WHERE
 
t1."ID" = '$${SID}$$'

 

However, for large lists it is extremely inefficient to run multiple queries in this fashion, and what is much better is to pass a list of IDs in chunks of up to 1000.  The query would then be changed to:

 

SELECT * FROM #table# t1
 
WHERE
 
t1."ID" IN ($${SList}$$)
 
Where 'List' is a string flow variable in the form 'ID1', 'ID2', 'ID3'.  I currently construct this in the following way:
  1. GroupBy node - group on the column of interest, to make sure we are not being inefficient and searching multiple times for the same ID
  2. Chunk Loop Start node - to split the input into 1000-row chunks.  For Oracle at least, this avoids the ORA-01795 error for lists > 1000 members
  3. JPython Script node - this loops over the list of IDs and generates a single-quoted, comma separated string
  4. TableRow to Variable node - pass the list into the Database Query node as a variable

What would be really nice is if the Database Query node essentially handled this automatically - or perhaps better to have a new Database Column Query node that takes a table as input as well as the database connection (and then allows selection of the query column and handles the chunking and list generation behind the scenes)?

Kind regards
 
James 

Very good solution to overcome the problem of looping over multiple values using the Database nodes. I know that the Database Looper is not a solution for you since you want to loop int values instead of strings. I guess providing this functionality within the Database Looper would also solve your problem and you won't need to loop construct anymore.

Passing over any scripting node of course, you can also do step 3 by using the group by node and choosing to aggregate by concatenate. This will give the data in a cell, comma separated!
Simon.