Derived fields frim DB nodes

Hi!
Is there any possibility to create\adjust fields while workin ‘in-database’ in KNIME?
For example, I want to fill all nulls in my table that i select, then create several other fields, and only after that group by and go to local enviroment.
If there any possile ways to do it not writing ‘case when’ \ ‘coalesce’ in db query nodes?

Any help is greatly appreciated!

Hi @s_borisevich , when you say “while working in-database” I presume that you mean working with the in-database db nodes in the workflow.

Although those nodes are called “in-database” they ultimately are working together to construct a single query that is executed only at the point where you attempt to read the data (e.g. in the DB reader), so your work is not really “in-database” it is simply that KNIME constructs all of the required SQL (such as joins) across a series of nodes rather than having the data returned separately to KNIME and joined locally, and so the joins are executed “in database” in a single query, rather than locally.

Therefore there is no ability to perform other operations within the operation of a single series of “in-database” nodes.

You can certainly perform individual operations using other database nodes but are you actually talking about making physical updates to the database tables themselves, or are you wanting to dynamically adjust only the data that is returned?

Are you able to give a more detailed example about what you are wanting to do?

Yes, I’m talking about adjusting the data that is returned, not physical updates to the tables. I assume such updates can be done by db writer\ db insert.

I will try to explain myself.
Let’s say, I’m working with some table that is too large to process it efficiently in local memory before some preprocessing on db engine.
So, before going ‘local’ what we need to:

  • some fields have nulls, so I want to replace theese nulls with some relevant values
  • also, I need to transform some continuous features into categorical (i think here db binner node already exists)
  • at last, I want to replace some substrings in string field, or, let’s say, map some values into another

All theese manipulations have to be done sometimes even before joining other tables in database, others before grouping and selecting relevant rows.
So, my question is there any way to do it in KNIME without going to db query node and writing actual sql, using case when\coalesce\replace and other functions?

Such transformations can be done when writing sql code. I am not aware of DB Nodes in KNIME which cover all what you require. Best would be you check the available DB Nodes in the node repository yourself and see what you can cover with these (DB Groupby, Column Filter Row Filter,…) and what you need to write SQL for.
@DB experts please correct or add
br

1 Like

Hi @s_borisevich , yes @Daniel_Weikert is right. The primary manipulation nodes in KNIME work on local data tables rather than at the database query level.

There are nodes such as DB GroupBy and the DB Row Filter whose job it is to enable use of the sql “group by” and “where” clauses respectively to avoid sql, but there isn’t a lot in the form of “db transformation” nodes which could represent case statements and other sql clauses within the query.

I have put together this db demo using the H2 database with a trivial example of a couple of things that can be done, but these don’t remove the need for writing sql queries.

In database queries.knwf (41.4 KB)

My main reason for uploading this was to discuss how the in-database nodes operate. To see this more clearly you really need to switch the the KNIME Classic UI if not already using that, as you can then view the DB Queries that the in-database nodes build.

If you open the view port on each of the in-database nodes, in Classic UI, there is a DB Query tab. You can see that as you proceed downstream, the queries from each get stitched together so that at the point where you execute the DB Reader, there is a single SQL query containing each of the upstream elements.

This means that it is therefore possible to inject your own SQL elements in the form of flow variables as I have done with the DB Query towards the centre of the image.

This receives a flow variable created by the upper branch that has turned a local lookup table into a series of Case clauses:

and this is applied by injecting into the sql:

Meanwhile in the DB Table Selector for the “Users” table, instead of simply selecting the table, a SQL Query can be entered to provide a view on that table, which handles null values, e.g.

but again it is writing sql although it potentially still reduces some complexity over writing the whole thing as a single SQL query.

So there are some tricks that you can use which may make your life easier (or more complicated!).

hope that helps.

2 Likes

Thank you very much!
That demo is really helpful.

2 Likes

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.