I often find myself wanting to take a column of IDs and essentially loop over them to bring back data from various databases.
This can be done by first connecting to the database, then using a 'TableRow To Variable Loop Start' node to feed the IDs one at a time to a 'Database Query' node.
However, for large lists it is extremely inefficient to run one query per ID in this fashion; it is much better to pass the IDs in chunks of up to 1000 via a single IN clause. The workflow would then be changed to:
- GroupBy node - group on the column of interest, so that we are not inefficiently searching multiple times for the same ID
- Chunk Loop Start node - split the input into 1000-row chunks. For Oracle at least, this avoids the ORA-01795 error for IN lists with more than 1000 members
- JPython Script node - loop over the list of IDs and generate a single-quoted, comma-separated string
- TableRow to Variable node - pass the list into the Database Query node as a flow variable
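The de-duplicate, chunk, and quote steps above can be sketched in plain Python as follows (the table name `my_table` and column name `id` are assumptions for illustration):

```python
def quoted_in_list(ids):
    """Build the single-quoted, comma-separated string for a SQL IN clause."""
    return ", ".join("'{}'".format(i) for i in ids)

def chunks(seq, size=1000):
    """Yield successive chunks of at most `size` items (mirrors Chunk Loop Start)."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]

# GroupBy equivalent: de-duplicate the IDs before chunking
ids = sorted(set(["A1", "B2", "A1", "C3"]))

for chunk in chunks(ids):
    # This string would be passed to the Database Query node as a variable,
    # e.g. spliced into: SELECT * FROM my_table WHERE id IN (...)
    query = "SELECT * FROM my_table WHERE id IN ({})".format(quoted_in_list(chunk))
```

Note this assumes the IDs contain no single quotes; real values would need escaping before being spliced into SQL.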
What would be really nice is if the Database Query node handled this automatically. Or, perhaps better, a new Database Column Query node that takes a table as input as well as the database connection, allows selection of the query column, and handles the chunking and list generation behind the scenes?