DB Writer obviously ignores Remove Existing Table option


I’ve just experienced the DB Writer node apparently ignoring the “Remove existing table” option, which was checked, in a workflow.

The workflow loops over a list of tables to be copied from a PostgreSQL database to an MS SQL database. In the first loop iteration, the DB Writer failed with a NullPointerException; there was an “Execute failed: (“NullPointerException”): data” entry in my log. I noticed the table had nevertheless been created and filled with the data in the MS SQL database (the number of rows was as expected, too). As an attempt to solve the issue, I unchecked the “Append write status column” option (I learnt subsequently that this didn’t help) and re-executed the node. This time I got the error message “Execute failed: There is already an object named ‘sat_tekstavu’ in the database.”, despite having the “Remove existing table” option checked.

Attaching a console log output. DB Writer obviously ignores Remove Existing Table option.log (197.4 KB)

This is what my workflow looks like:

And here I’m attaching an export of my workflow: DWH to MS SQL.knwf (515.9 KB)



The reason might have been that the DB user wasn’t a member of the db_datareader role. I figured this out after replacing the DB nodes with the legacy ones. After that, I could see a failed attempt to check for the table’s presence by invoking select * from xxx where 1 = 0. The SELECT ended with a permission error. KNIME likely interpreted the error as a sign that the table doesn’t exist and attempted to create the table again, which naturally failed.
I guess KNIME could pay more attention to the status with which the SELECT it uses to determine whether the table exists actually ended.
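Just to illustrate my suggestion (this is not what KNIME actually does): instead of probing the table with a SELECT, the check could query the catalog, so that a denied SELECT permission on the table itself cannot be mistaken for the table’s absence. On MS SQL such a check might look like this (schema and table names taken from my log; note the user still needs permission to view the table’s metadata):

```sql
-- Hypothetical catalog-based existence check, not KNIME's actual query.
-- Returns a row if the table is visible in the metadata; a missing SELECT
-- permission on the table itself would not make this come back empty.
SELECT 1
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'dwh_fis'
  AND TABLE_NAME   = 'sat_year_month';
```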
The relevant part of the log KNIME produced after I had introduced the legacy DB nodes was as follows:

2020-01-15 15:38:56,711 DEBUG Database Writer (legacy) 0:72       Execute query: SELECT * FROM (SELECT 1 as tmpcol FROM dwh_fis.sat_year_month) tempTable_994273208467322907 WHERE (1 = 0)
2020-01-15 15:38:56,723 DEBUG Database Writer (legacy) 0:72       Got exception while checking for existence of table 'dwh_fis.sat_year_month': The SELECT permission was denied on the object 'sat_year_month', database 'ads_dwh', schema 'dwh_fis'.
2020-01-15 15:38:56,724 INFO  Database Writer (legacy) 0:72       Table "dwh_fis.sat_year_month" does not exist in database, will create new table.
2020-01-15 15:38:56,725 DEBUG Database Writer (legacy) 0:72       Executing SQL statement as execute: CREATE TABLE dwh_fis.sat_year_month (hashkey varchar(255), year integer, month integer)
2020-01-15 15:38:56,736 ERROR Database Writer (legacy) 0:72       Execute failed: There is already an object named 'sat_year_month' in the database.
2020-01-15 15:38:56,737 DEBUG Database Writer (legacy) 0:72       Database Writer (legacy) 0:72 doBeforePostExecution
2020-01-15 15:38:56,737 DEBUG Database Writer (legacy) 0:72       Database Writer (legacy) 0:72 has new state: POSTEXECUTE
2020-01-15 15:38:56,738 DEBUG Database Writer (legacy) 0:72       Database Writer (legacy) 0:72 doAfterExecute - failure
2020-01-15 15:38:56,738 DEBUG Database Writer (legacy) 0:72       reset


I also sometimes had problems with the automatic removal of existing tables, especially on big data systems. I tend to issue a separate command in an SQL Executor node that drops the table, which admittedly is not very elegant:

DROP TABLE IF EXISTS my_schema.my_table;



Hello jan_lender,
thanks for reporting the problem. I have opened a ticket to make the table-exists check more robust.
I had a look at the log file but couldn’t find any reason for the NullPointerException in the first iteration. Could you please check the KNIME log file for any entries containing “NullPointerException” and send me that part, ideally via private message if it contains sensitive information. I will send you a private message so that you only need to reply there.
Regarding mlauber71’s suggestion: if you are not fond of writing SQL, you can also use the DB Table Remover node prior to the DB Writer node to execute the DROP TABLE IF EXISTS command.



Hi, @tobias.koetter,
thank you for your reply.

I’m pretty sure the NPE was thrown because of a missing read privilege in the DB. It happened at the end of the node’s execution, after the data had been written to the database (I checked in the DB that it was really there).

I’m going to look for the log. If I can’t find it, I will reproduce the issue.




Making the delete command more robust is certainly a good idea. I have experienced problems with DB nodes and the removal of tables on Cloudera connections (Hive/Impala), where the KNIME instance was not able to reliably delete an existing table, seemingly due to missing statistics. I solved this problem by:

  • sending a DROP TABLE IF EXISTS command (just in case)
  • deliberately creating a dummy table of the same name
  • creating statistics for this table (so KNIME or the instance knows that the table is there)
  • DROPping this now safely existing table again
  • creating my new table

And yes, I know how that sounds, but it was the safest method, especially if you want to reliably run such workflows in production on KNIME Server.
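Sketched as SQL, the sequence above might look like this (schema, table, and column names are placeholders; the statistics statement below uses Hive syntax, on Impala you would use COMPUTE STATS instead):

```sql
DROP TABLE IF EXISTS my_schema.my_table;              -- just in case
CREATE TABLE my_schema.my_table (dummy INT);          -- dummy table of the same name
ANALYZE TABLE my_schema.my_table COMPUTE STATISTICS;  -- Hive; Impala: COMPUTE STATS my_schema.my_table
DROP TABLE my_schema.my_table;                        -- drop the now safely existing table
CREATE TABLE my_schema.my_table (                     -- finally create the real table
  col1 STRING,                                        -- placeholder columns
  col2 INT
);
```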