Databricks DB Group by Manual Aggregation Error

benpope · January 22, 2024, 5:37pm

I’m using Knime v5.1.2 with the latest DatabricksJDBC42-2.6.36.1062 driver. I’m finding that the DB Group by node is throwing an error when trying to use any of the aggregation functions on the Manual Aggregation tab. Possible bug with interacting with Databricks? Work as expected with other DBs. Does not appear to generate the correct SQL when using Databricks looking at the console log.

Creates SELECT * FROM (SELECT COUNT() AS COUNT() FROM (SELECT * FROM catalog.schema.table)

Count function works as expected when generated via manual SQL.
SELECT count(*) as count_rows FROM catalog.schema.table

Anyone else finding this error?

sascha.wolke · January 23, 2024, 12:28pm

Hi @benpope,

Thanks, reporting this. Can you post a screenshot of your group by configuration dialog?

Cheers,
Sascha

benpope · January 23, 2024, 1:57pm

Thanks for getting back to me Sascha. The Group By node works as expected if I only add fields to the Group tab but adding any type of manual aggregation causes the node to throw an error with Databricks. I don’t even get an option to run the node, I get this message>
WARN DB GroupBy 4:1456 java.sql.SQLException: [Databricks]JDBCDriver ERROR processing query/statement. Error Code: 0, SQL state: 42601, Query: SELECT * F***, Error message from Server: org.apache.hive.service.cli.HiveSQLException: Error running query: [PARSE_SYNTAX_ERROR] org.apache.spark.sql.catalyst.parser.ParseException:
[PARSE_SYNTAX_ERROR] Syntax error at or near ‘(’.(line 1, pos 45)

benpope · January 25, 2024, 6:39pm

Also seeing a similar issue with the earlier Knime 4.6.2 version.

system · April 24, 2024, 8:05pm

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.