Filter (exclude) rows that are already in database

I do some stuff and then I want for each row to query the database and then exclude all rows from the workflow that are already in the database.

Notes:

  • I can not use Database looping as the actual query is a chemical structure query and not an IN() query

I created a table row to variable loop start and use the variable for a database reader and then convert that value to a variable used in Row Filter node. However this leads to an empty table error when no record is found in database. Using the empty table switch I get a different issue, namley in the loop end:

Variable Loop End Execute failed: Input table’s structure differs from reference (first iteration) table: different column counts 7 vs. 8

So how can I sanely do this? I mean this seem to me a very basic thing to do yet knime seems to make these overly complex…

Hi,

the loop end node has problems because with now row in the empty branch it does not know how to collect the data. Your solution with the empty table switch works if you insert in the empty branch an Add Empty Rows node followed by a Column Renamer Node. The column names need to be identical in both branches. Then the loop end node can collect the results correctly. Finally you have to filter out the empty table rows.

Best regards

Jerry 

thanks. My solution was to change the SQL statement so that it always returns a row:

SELECT (
SELECT
my_column
FROM
my_table
WHERE
id = $$id$$
) AS my_column FROM dual

this is for oracle. for other databases the from dual can be omitted.

I then use the missing value node to generate an empty string which in my case will do nothing in the subsequent workflow.