database reader does not read all rows (without an error)

Dear all,

I created a database table that has ~1.3M rows:

SELECT count(*) FROM IDN_RA_ASSOC
returns: 1345777

When I read all rows with a Database Reader node, I get

SELECT * FROM IDN_RA_ASSOC
returns 199540 rows without an error.

Unfortunately this went undetected for some time…

Table details (Oracle DDL):
CREATE TABLE "KNIMEENT"."IDN_RA_ASSOC"
( "association ID" NUMBER(*,0),
"substance identifiers (first)" VARCHAR2(255 BYTE),
"substance identifiers (second)" VARCHAR2(255 BYTE),
"association type" VARCHAR2(255 BYTE),
"association details" CLOB
) SEGMENT CREATION IMMEDIATE
PCTFREE 10 PCTUSED 40 INITRANS 1 MAXTRANS 255
NOCOMPRESS LOGGING
STORAGE(INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "DATA"
LOB ("association details") STORE AS BASICFILE (
TABLESPACE "DATA" ENABLE STORAGE IN ROW CHUNK 8192 RETENTION
NOCACHE LOGGING
STORAGE(INITIAL 1048576 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1
BUFFER_POOL DEFAULT FLASH_CACHE DEFAULT CELL_FLASH_CACHE DEFAULT)) ;

Knime environment details
KNIME Analytics Platform 3.6.0.v201807100940 org.knime.product.desktop null

===== cat /etc/*-release =====
NAME="Red Hat Enterprise Linux Server"
VERSION="7.5 (Maipo)"

Regards,

Panos

Hi there!

When you run SELECT count(*) FROM IDN_RA_ASSOC in a KNIME Database Reader node, what do you get?

Br,
Ivan

We get 199540 rows.

So KNIME reads everything it can see :slight_smile:
You do not use a schema name in the query. Is IDN_RA_ASSOC a synonym, or something else?
Are you connecting to the right database? I know this is a bit of a stupid question, but just maybe…
Maybe try checking those basic assumptions :slight_smile:
Br,
Ivan

Hi Ivan,

I misread your question. The KNIME reader node gives the following:

  • select count(*) FROM IDN_RA_ASSOC returns the number 1345777
  • select * FROM IDN_RA_ASSOC returns 199540 rows

Executing the same queries in SQL Developer gives the correct results.

In my view the KNIME reader node has a bug. The worst part is that it does not fail: it appears to work, yet it delivers incomplete data.
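Incidentally, a silent short read like this can be guarded against outside of KNIME. Below is a minimal JDBC sketch (the table name is from this thread; the `Connection` is a placeholder supplied by the caller) that compares the server-side COUNT(*) with the number of rows a full scan actually returns, and fails loudly on a mismatch:

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class RowCountCheck {

    // Fail loudly instead of silently accepting a short read.
    static void assertComplete(long expected, long fetched) {
        if (fetched != expected) {
            throw new IllegalStateException(
                    "Incomplete read: expected " + expected + " rows but fetched " + fetched);
        }
    }

    // Compare the server-side count with the rows a full scan actually returns.
    // The Connection is a placeholder; obtain it however your environment does.
    static void checkTable(Connection con, String table) throws SQLException {
        long expected;
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM " + table)) {
            rs.next();
            expected = rs.getLong(1);
        }
        long fetched = 0;
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery("SELECT * FROM " + table)) {
            while (rs.next()) {
                fetched++;
            }
        }
        assertComplete(expected, fetched);
    }
}
```

With the numbers from this thread, `assertComplete(1345777, 199540)` would have thrown immediately instead of letting the truncated result flow downstream.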

Regards,

Panos

1 Like

p.karamertzanis, you most likely populated the table between the two reads. To get a correct result you need to restart KNIME. After December 6 it will be fixed.

Thanks Izaychik63,

KNIME has been restarted several times since the table was last updated, and the problem still persists.

Is this a known KNIME bug?

Unfortunately this has already led to wrong business decisions, so it is good to know where we stand.

Many thanks,

Panos

You can try installing KNIME Database (Preview) and using the Database modules under KNIME Labs. Also, I would try updating the statistics on the Oracle table.

I managed to use the KNIME Labs database module.

At around row ~199000 it throws an exception, so something is going wrong that I will try to investigate. The question remains, though, why the Database Reader node completes successfully. This needs to be looked at, in my view.


ERROR DB Query Reader 7:5766 Exception reading data row: java.sql.SQLException: Protocol violation: [ 14, 34, ]
DEBUG DB Query Reader 7:5766 Exception reading data row: java.sql.SQLException: Protocol violation: [ 14, 34, ]
org.knime.core.data.convert.map.MappingException: java.sql.SQLException: Protocol violation: [ 14, 34, ]
at org.knime.database.datatype.mapping.DBCellValueProducerFactory.lambda$0(DBCellValueProducerFactory.java:79)
at org.knime.core.data.convert.map.MappingFramework.map(MappingFramework.java:214)
at org.knime.database.agent.reader.impl.DefaultDBRowOutput.next(DefaultDBRowOutput.java:136)
at org.knime.database.agent.reader.impl.DefaultDBRowOutput.next(DefaultDBRowOutput.java:1)
at org.knime.database.node.io.reader.query.DBQueryReaderNodeModel.lambda$0(DBQueryReaderNodeModel.java:119)
at org.knime.database.agent.reader.impl.DefaultDBReader.readTable(DefaultDBReader.java:80)
at org.knime.database.node.io.reader.query.DBQueryReaderNodeModel.execute(DBQueryReaderNodeModel.java:129)
at org.knime.core.node.NodeModel.executeModel(NodeModel.java:567)
at org.knime.core.node.Node.invokeFullyNodeModelExecute(Node.java:1173)
at org.knime.core.node.Node.execute(Node.java:960)
at org.knime.core.node.workflow.NativeNodeContainer.performExecuteNode(NativeNodeContainer.java:561)
at org.knime.core.node.exec.LocalNodeExecutionJob.mainExecute(LocalNodeExecutionJob.java:95)
at org.knime.core.node.workflow.NodeExecutionJob.internalRun(NodeExecutionJob.java:179)
at org.knime.core.node.workflow.NodeExecutionJob.run(NodeExecutionJob.java:110)
at org.knime.core.util.ThreadUtils$RunnableWithContextImpl.runWithContext(ThreadUtils.java:328)
at org.knime.core.util.ThreadUtils$RunnableWithContext.run(ThreadUtils.java:204)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.knime.core.util.ThreadPool$MyFuture.run(ThreadPool.java:123)
at org.knime.core.util.ThreadPool$Worker.run(ThreadPool.java:246)
Caused by: java.sql.SQLException: Protocol violation: [ 14, 34, ]
at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:669)
at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:249)
at oracle.jdbc.driver.T4C8TTIClob.read(T4C8TTIClob.java:245)
at oracle.jdbc.driver.T4CConnection.getChars(T4CConnection.java:3901)
at oracle.sql.CLOB.getChars(CLOB.java:517)
at oracle.sql.CLOB.getSubString(CLOB.java:354)
at org.knime.database.DatabasePlugin.lambda$18(DatabasePlugin.java:216)

It is likely that the CLOB field in one row contains "invalid" characters. The Database Reader is probably not handling all exceptions correctly.
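To pin down the offending row, one option is to walk the table row by row and fully materialise each CLOB, reporting the first key whose read fails. This is a sketch, not a tested tool: the column names are the quoted identifiers from the DDL above, and the `Connection` is a placeholder.

```java
import java.io.IOException;
import java.io.Reader;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class ClobScan {

    // Drain a character stream completely, returning the number of chars read.
    // With an Oracle CLOB, a protocol violation would surface during this read.
    static long drain(Reader r) throws IOException {
        char[] buf = new char[8192];
        long total = 0;
        int n;
        while ((n = r.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    // Return the key of the first row whose CLOB cannot be read,
    // or null if every row's CLOB is readable.
    static Long findFirstBadClob(Connection con) throws SQLException {
        String sql = "SELECT \"association ID\", \"association details\" FROM IDN_RA_ASSOC";
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                long id = rs.getLong(1);
                try (Reader r = rs.getCharacterStream(2)) {
                    if (r != null) {
                        drain(r);
                    }
                } catch (Exception e) {
                    return id; // first row whose CLOB read failed
                }
            }
        }
        return null;
    }
}
```

Whether the failing read happens at the same row each time would also tell you whether the problem is data-dependent or a transport/driver issue.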

It could also be a JDBC issue.

It could be indeed.

Cool troubleshooting :slight_smile:
Did you come up with anything? Did you identify the row or the character?
Ivan

I managed to fix the issue by regenerating the JSON fields (stored as CLOBs) with GSON (gson.toJson()). The one that failed looked OK, so it may have contained an invalid (invisible?) character, but I had no chance to investigate further.

The worrying part is that the KNIME Database Reader node may execute successfully (turn green) even though not all rows have been read. This can cause serious issues if it goes undetected.
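If an invisible control character in the CLOB really was the culprit, a defensive sanitiser along these lines could be applied before the value is written. This is a sketch of one possible precaution, not the fix actually used in the thread, and the offending character was never identified:

```java
public class ClobSanitizer {

    // Remove ISO control characters (except tab, LF, CR) from a string
    // before it is written to the CLOB column. Purely defensive.
    static String stripControlChars(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!Character.isISOControl(c) || c == '\t' || c == '\n' || c == '\r') {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

For example, `stripControlChars("a\u0007b")` drops the bell character and returns `"ab"`, while ordinary whitespace and line breaks pass through unchanged.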

1 Like

Seems like a problem, but the new database integration nodes did report an error :wink:

Br,
Ivan

1 Like