DB version 4 nodes not working with capitalised column names in database

Hi all,

I’m having a strange issue when I select a table and try to use in-database manipulation nodes on it (primarily group by and column select). It shows the columns in the configuration for the node but when executing throws a table.columnName does not exist error. The error message has the table name in lowercase, whereas its actually uppercase in the database, so I suspect this may be the cause of the issue. Does anyone know of a workaround or solution to this? Thanks in advance.

Hi MPersin,
this is strange. What is the database you are connecting to? In general the DB Table Selector retrieves the proper column names (including their case) from the database and all subsequent nodes use these name and quote them to ensure that the case is also applied.
You can change the quoting behavior of the node via the Advanced Tab of the connector node e.g., via the delimit only identifier with spaces or the Identifier delimiter (closing)/(opening). By default KNIME is quoting all identifier.
Bye
Tobias

1 Like

Hi @MPersin , I do a lot of Knime to DB, and I have never gotten this kind of issue. Knime is very case sensitive, and a lot of DB system is too, and from what you are saying, your DB system seems to be case sensitive.

I know Amazon Snowflake uses uppercase naming convention, and if your tables have uppercase names, you need to make sure your queries, or your Knime table have the same names same cases.

Like I said, never encountered such issues writing to Snowflake with uppercase, and lowercase with Amazon Postgres, and with different cases with other systems such as MSSQL and Mysql, as long as I am using the same names same cases as the system I am working with.

1 Like

Hi Tobias,

I’m connecting to a Postgres database, I’ll attach a screenshot of the nodes of the error message/workflow segment so I can better display the issue. I also did some more experimentation, it seems columns not in block capitals (but sometimes with the first letter capitalised) work fine, its only ones in which the entire name is capitalised that cause issues. I’ve tried instructing it only to delmit only identifiers with spaces, it seems not to have had any effect.

Thanks again,

Matt

Hi Bruno,

Thanks for your response - I do suspect that the database is case sensitive, the issue is that I’m not writing the query, its being made by the DB column filter and DB group by nodes. I’m only guessing that its making the column names lower case by the case given in the error message it throws, I’ve not actually seen the query its generated.

Hi @MPersin , I’ve never used the DB column filter nor the DB Group by nodes.

As an alternative, could you run these as queries via the DB Query node instead? The DB Query will run the queries as you provide them as is, so it should respect the case sensitivity you provide.

2 Likes

Which version of Postgres are you using, and on what platform? I’m not sure it will make any difference, but I might be able to install a copy of the same version of Postgres on my Windows server and have a play to see if I can break it. :slight_smile:

This is mostly just some background but maybe it might flag something…

Most (if not all) relational databases are actually case-sensitive behind the scenes, but masquerade as case-insensitive. To achieve this apparent contradiction they are case-sensitive when identifiers are supplied in double-quotes, but pseudo-case-insensitive when identifiers are not quoted. The way they handle those two positions is to interpret unquoted-identifiers as meaning either all uppercase or all lowercase (and the upper or lower choice varies between vendors). For example Oracle’s default position is that table and column names are upper case. So it doesn’t matter whether I type lower or upper case identifiers, they get converted to uppercase and then compared with only upper-case names in the schema. If I enter identifiers in double-quotes they will be passed through as-typed and compared only with the names exactly matching the case of my identifiers.

As I understand it, Postgres is lower case by default, so an *un-*quoted identifier will always be treated as lower case no matter the case in which it is typed. So it makes sense that the default position is that it should be comparing identifiers in lower-case but what doesn’t make sense is that it should be doing that if it has identifier delimiters set to double-quotes in the Connector’s advanced settings, unless somewhere something is doing some kind of “too-early” conversion to the lower case default.

That is rather odd. Given the Postgres default, it is somewhat unfortunate if your database table names and column names have been created in upper case. I can only assume the table creation scripts had double-quotes round the identifiers. If they hadn’t been, then at least they would be in the (postgres) default lowercase and you probably wouldn’t be seeing this issue. As it is, you are rather being forced to work “against the grain” (even outside of Knime), as it means that any sql query you type will have to contain double-quotes around every identifier.

Just as in Oracle I would never dream of creating lower case table names for that very reason, in Postgres, I would imagine it would make life more difficult to have upper case names in the database.

In some databases (and I just looked at a MySQL installation), a switch at the database level turns off any kind of case-sensitivity, but it does that by totally preventing the creation of identifiers in anything but the default case, to avoid this issue.

This of course doesn’t help much with how to resolve your current issue, for which we will hopefully still find a resolution, but perhaps it might give some background about the possible case sensitivity issue.

Picking up on @bruno29a’s very good suggestion of attempting a query using the DB Query node, what this does mean is that for the query to work, and for it to respect the case of what you type, I would think that you will have to double-quote your table and column names within the query.

3 Likes

Hi Matt,
do you use the Postgres Connector node or the DB Connector node to connect to your Postgres db? If you use the Postgres Connector node and don’t change any advanced settings it should run out of the box without any problems. Did you use the “Custom Query” option in the DB Table Selector node or do you simply select a table?
You can have a look at the generated SQL query via the DB Data port view. Also do you use a custom aggregation method in the DB GroupBy node?
You can also enable the JDBC logger to see which statement is send by the GroupBy node to the database via the Advanced settings tab of the Postgres Connector node. This will write out all statements that are send to the database as debug messages into the KNIME log.
Bye
Tobias

2 Likes

Hi everyone,

Thanks for all your help, the problem is resolved. It turns out changing the PostgreSQL connector to deselect “Delimit only identifier with spaces”, with the delimters being double-quotes, resolved it presumably for the reasons @takbb mentioned. What’s a little odd is that this seems to be the default, but the first time I opened the advance tab was in my initial attempts to fix this same issue, so I don’t know how or when it got changed/I changed it in the first place. Of course this isn’t a problem since its working now. Thanks again everyone!

3 Likes

Hello MPersin,
did you maybe use the workflow migration tool to migrate from the legacy database nodes to the new db nodes? In this case the “Delimit only identifier with spaces” option is selected automatically since this was the old behavior. If you use a new Postgres Connector node the default is to delimit all identifiers.
Bye
Tobias

3 Likes

Yes, that would be it, I had migrated from the old nodes.

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.