I am experiencing an issue with the “PostgreSQL Connector” and “Generic S3 Connector” nodes, which I use to connect to PostgreSQL and MINIO: they disconnect automatically after the connection has been open for a long time.
In my workflow, I spend a lot of time modeling with a PySpark Script node and then export the output to PostgreSQL and MINIO. Because the analysis takes so long, the connections to PostgreSQL and MINIO are terminated and the workflow fails. Is there any way to resolve this?
The error is as follows:
There are messages for workflow “1011_spark_churn_server_test 2023-10-13 20.03.26”
PostgreSQL Connector 14:2610:0:2601 - WARNING: DB Connection no longer available. Go to advanced settings to enable connection restoration.
PostgreSQL Connector 14:2610:0:2602 - WARNING: DB Connection no longer available. Go to advanced settings to enable connection restoration.
Container Input (Variable) 14:2057 - WARNING: Default variables are output
Container Input (Variable) 14:2085 - WARNING: Default variables are output
Generic S3 Connector 14:2610:0:2600 - WARNING: S3 connection no longer available. Please re-execute the node.
Hi,
So far I have only seen this error when the workflow was closed and then reopened, not during a single run of a workflow. I assume for you this happens without closing the workflow in-between? Have you considered establishing the database connection (or an additional one) after the analysis, so it is not sitting idle for so long?
Kind regards,
Alexander
The workflow is running on the server, so we never close and reopen it in the middle of a run.
The workflow connects to the database when it reads the input data before the analysis, and the connection stays open throughout the long analysis. I don’t know how many hours the analysis will take, so I don’t think I can set a timeout to a specific duration.
Is there a setting in the options dialog of the PostgreSQL Connector or Generic S3 Connector node to keep the connection open for a long time?
Hi,
Have you maybe uploaded the workflow to the server with the database or S3 connectors being green, i.e. not reset? Because this is when you usually get that error message. You run the workflow locally, save it in executed state, upload it to the server, connector is green, but of course the DB connection you had locally is lost now.
Kind regards,
Alexander
In my local environment, all nodes are yellow when I save the workflow and upload it to the server. We keep the database connection open during the long analysis because we export the analysis output through that open connection.
So it is difficult to disconnect and reconnect this node in the middle.
Hi,
I am not aware of any timeout on our side, but the fact that two very different connections, namely DB and S3, exhibit this behaviour probably means that it’s not the backend that is causing the issues. Maybe our DB dev @tobias.koetter has an idea.
Kind regards,
Alexander
Hello @JaeHwanChoi ,
as Alexander already mentioned, we do not set any timeouts. Do you have a KNIME log file that contains more information? For example, the DB framework checks that the session is valid prior to executing a statement. If it is not valid, this should be visible in the KNIME log. For the Postgres problem you could enable automatic reconnecting by setting the “Automatically reconnect to database” flag in the Advanced tab of the PostgreSQL Connector node.
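To illustrate the validate-then-reconnect idea in general terms, here is a minimal Python sketch. It is not KNIME's implementation: it uses the standard-library `sqlite3` module as a stand-in for a real database, and the helper name `ensure_connection` and the `connect` factory are hypothetical.

```python
import sqlite3
from typing import Callable


def ensure_connection(
    conn: sqlite3.Connection,
    connect: Callable[[], sqlite3.Connection],
) -> sqlite3.Connection:
    """Return `conn` if it still answers a trivial query, else a new connection.

    `connect` is a hypothetical factory that knows how to open a fresh
    connection (URL, credentials, and so on).
    """
    try:
        # Cheap validation query, run right before the real statement.
        conn.execute("SELECT 1").fetchone()
        return conn
    except sqlite3.Error:
        # The session is gone (closed, timed out, ...): open a new one.
        return connect()


def connect() -> sqlite3.Connection:
    # Stand-in for the real database; an in-memory SQLite database.
    return sqlite3.connect(":memory:")


conn = connect()
conn.close()  # simulate a session that was dropped while idle
conn = ensure_connection(conn, connect)  # transparently replaced
print(conn.execute("SELECT 1").fetchone())
```

Note that a reconnect gives you a new session, so anything tied to the old one (open transactions, temporary tables) would be lost.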
In general, the Postgres driver has several timeout parameters that can be set via the JDBC Parameters tab, but I don’t think that any of them applies to a session that idles for a long time. There are also settings in Postgres that cause the database to close idle sessions after a certain time, as described here.
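As a sketch of where such settings live, the snippet below builds a PostgreSQL JDBC URL with the driver's `tcpKeepAlive` parameter and lists statements for inspecting server-side idle timeouts. The host, port, database name, and the helper `build_jdbc_url` are placeholders. Whether any of these settings is responsible here is an assumption that needs to be checked against your environment.

```python
from urllib.parse import urlencode


def build_jdbc_url(host: str, port: int, database: str, **params: str) -> str:
    """Build a PostgreSQL JDBC URL, appending driver parameters if given."""
    base = f"jdbc:postgresql://{host}:{port}/{database}"
    return f"{base}?{urlencode(params)}" if params else base


# tcpKeepAlive is a PostgreSQL JDBC driver parameter; it may help if a
# firewall or NAT device silently drops idle TCP connections.
# Host and database name below are placeholders.
url = build_jdbc_url("db.example.com", 5432, "churn", tcpKeepAlive="true")
print(url)

# Statements to run against the server (e.g. in psql) to see whether it
# closes idle sessions on its own. A value of 0 means "disabled".
SERVER_TIMEOUT_CHECKS = [
    "SHOW idle_in_transaction_session_timeout;",
    "SHOW idle_session_timeout;",  # available in PostgreSQL 14 and later
    "SHOW tcp_keepalives_idle;",
]
for statement in SERVER_TIMEOUT_CHECKS:
    print(statement)
```

In KNIME the same key/value pair (`tcpKeepAlive` = `true`) would go into the JDBC Parameters tab of the connector node rather than into a hand-built URL.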
I’m not too familiar with MINIO, but maybe it also has idle session timeout parameters.
Bye
Tobias