Hello, I’m noticing a strange Redshift error when undertaking database grouping using either the ‘Database GroupBy (legacy)’ node or the Database Query (legacy) node. This specifically relates to using a custom Listagg
aggregation function. This has worked for me okay in the past but is now (as of some time in January 2020) giving the following error:
[Amazon](500310) Invalid operation: One or more of the used functions must be applied on at least one user created tables. Examples of user table only functions are LISTAGG, MEDIAN, PERCENTILE_CONT
To test this, I’ve created the following table on my Redshift cluster:
category item
a car
a ball
a bike
b red
b green
I’m seeking to produce a list aggregation grouping on this table to give an output like
a car|ball|bike
b red|green
I can achieve this without problems by running the following query from an SQL editor:
SELECT category,
listagg(distinct item, '|')
within group (order by item)
over (partition by category)
AS aggragadabra
FROM schema_name.listagg_test;
However, if I run the same expression from a Database Query (legacy) node (of course using #table#
instead of the table name) I get the above error. I also get the same error when using the Database GroupBy (legacy) node (with a custom aggregation).
It seems to me that something about the way KNIME is sending the query to Redshift is making Redshift want to run the query on a leader node, rather than only a compute node. It then issues the above error, as flagged in:
https://docs.aws.amazon.com/redshift/latest/dg/c_SQL_functions_compute_node_only.html
Much appreciate any thoughts or suggestions.
Andrew.