Parameterised queries but not in WHERE clause

Hi,

I’m new to Knime and have been looking everywhere for a solution. I’m trying to parameterize a query and run it for a given set of variables. I know I can use Parameterized Database Query node to do this but this seems to only apply to parameters in the WHERE clause.

I’m trying to parameterized the SELECT part too. For example:

SELECT COLUMN1, CASE WHEN MONTH = <MONTH_PARAMETER> THEN 1 ELSE 0 END
FROM MY_TABLE WHERE MONTH = <MONTH_PARAMETER>

I am trying to loop through multiple years and have to run this one month at a time and need to also update the parameter in the SELECT part of the statement.

Any guidance appreciated!

1 Like

Hi @Tty, welcome to the forum!

I’m slightly confused because as it stands a sql statement in the form you’ve described could equally be written as

SELECT COLUMN1, 1
FROM MY_TABLE WHERE MONTH = <MONTH_PARAMETER>

since it would only ever return a row when MONTH=<MONTH_PARAMETER> and therefore your case condition could only ever return 1 for the rows returned by the query.

Am I misunderstanding? :wink:

I’m presuming this was just because it was a simple example of your requirement, rather than the actual requirement…

So I was wondering if you have a more concrete example of what you are trying to do, as it might alter the way this could be approached. As it stands, it sounds like you are trying to write “dynamic sql” rather than just add parameter values.

3 Likes

Just thinking out loud… I wonder if the DB Query Extractor and/or DB Query Injector might be of some use here, combined with some manual string manipulation on the query itself?

3 Likes

Hi @ScottF , I think you’re right. There’s no reason why not and that’s partly why I was interested to know a better example of the requirement, and also whether the facilities of the Parameterized DB Query Reader are being utilised/needed.

That said, I’ve uploaded a demo workflow to the hub here using an H2 database so should work for everybody. It gives some ideas/examples of how the Parameterized DB Query Reader and the regular DB Query Reader can be altered dynamically using flow variables embedded in different parts of the queries.

As you say though, with other DB nodes, the entire query could adjusted or supplied using flow variables, so it really depends on the requirement as to which is the most suitable option.

3 Likes

Thanks @takbb !

The SQL example I gave probably made no sense! :smiley:

You are right that the actual SQL can be re-written in different ways and joined together to achieve the same results, which requires multiple passes on the table - which is ok too.

I was hoping to do this in a single pass by building the parameters into the WHERE and SELECT clause.

The table is a monthly snapshot table with multiple date columns. The more accurate SQL is like below:

SELECT MONTH_NUMBER, COUNT(CUSTOMER_ID) AS ACTIVE_CUST_THIS_MONTH, SUM(CASE WHEN CUST_CLOSED_DATE = <MONTH_PARAMETER> THEN 1 ELSE 0 END) AS CLOSURE_THIS_MONTH
FROM CUSTOMER_TABLE
WHERE MONTH_NUMBER = <MONTH_PARAMETER>
AND CUSTOMER_STATUS = ‘ACTIVE’
GROUP BY MONTH_NUMBER

This gives me a running count of active customers and actual closures for each month and if I’m looking to loop this over multiple years, parameterizing the SELECT clause will be helpful.

2 Likes

Thanks @ScottF

I don’t see it in the version of Knime I’m using (v3.7). Probably a bit behind as I’m working in a fairly restricted environment and getting newer versions deployed can take a while!

Hi @Tty ,

Yes your sql example does make more sense now :wink:, although this could still be written differently to avoid a parameter within the main body of the sql if that causes problems.

Since in the WHERE clause, MONTH_NUMBER will always equal <MONTH_PARAMETER>, you can compare CUST_CLOSED_DATE with MONTH_NUMBER instead of the parameter value, e.g.

SELECT 
   MONTH_NUMBER, 
   COUNT(CUSTOMER_ID) AS ACTIVE_CUST_THIS_MONTH, 
   SUM(CASE 
     WHEN CUST_CLOSED_DATE = MONTH_NUMBER THEN 1
      ELSE 0 
     END) AS CLOSURE_THIS_MONTH
FROM CUSTOMER_TABLE
WHERE 
    MONTH_NUMBER = <MONTH_PARAMETER>
    AND CUSTOMER_STATUS = 'ACTIVE'
GROUP BY MONTH_NUMBER

I don’t know what limitations there are in the KNIME 3.7 nodes, and unfortunately I cannot install it side-by-side with my 4.3 version, but I’m assuming you’ve got what would now be the “legacy” versions of the nodes.

What occurs to me though, is if you are able to bring back your customers into KNIME, you could bring back the summarised data without parameters (or maybe restricted to the year-span that you are interested in),and then let KNIME do its stuff with nodes to derive the remaining information you require.

I assume your table must have other columns on it such as year; and it looks like “cust_closed_date” is also returning a month number, or else some translation is going to be required; but ignoring that, your query could be generalised as follows (without month parameter being supplied)

SELECT 
   MONTH_NUMBER, 
   COUNT(CUSTOMER_ID) AS ACTIVE_CUST_THIS_MONTH, 
   SUM(CASE 
     WHEN CUST_CLOSED_DATE = MONTH_NUMBER THEN 1
      ELSE 0 
     END) AS CLOSURE_THIS_MONTH
FROM CUSTOMER_TABLE
WHERE 
    CUSTOMER_STATUS = 'ACTIVE'
GROUP BY MONTH_NUMBER

This would then bring you back the summarised table across all months. You could then process that returned summary table within KNIME, in a loop if necessary, to separate out the information for each month and do something with it maybe? That might be simpler than parameterising the query. That way you’ve also done a single round-trip to the database instead of repeated calls. Of course that might not fit your use case, but just a thought.

Depending on data volumes and other considerations, you could also just bring back the monthly snapshot table in (or at least a portion of it) and let KNIME do the summarising for you.

As a database developer myself, I have a tendency to want to do as much as possible “in the database”, and my mantra was always “cut down on network traffic: move as little data out of the database as possible” but times change, networks are faster, memory is more plentiful, machines are more powerful, and sometimes it is appropriate to move more of the the logic into the KNIME nodes, especially as it allows you to more easily prototype the manipulation of the result set.

So possibly some form of database reader node … I am using Knime 4.3 but both the Database Reader (legacy) and the DB Query Reader can do the same thing:

My (probably nonsensical) sample data:

… summarised result set that you could then act on with further nodes:

image

4 Likes

You are absolutely right @takbb !

Sometimes it takes talking to others to realize my own mistakes and the solution is staring at me in my face. :sweat_smile: Realised that after typing out the code myself :smiley:

I just need to transform the closure date (it is a timestamp) to a Month_Year format which will work too.

Thanks for your help!

2 Likes

This topic was automatically closed 182 days after the last reply. New replies are no longer allowed.