How can we process the data on the DB engine side and just receive the result? In other words, we want to process the data in the DB engine to save HDD space.
Hi @ahmed_gomaa -
In general, the DB nodes push down whatever processing they can to the database server, and only retrieve data locally when you tell them to cache some result. Take the following workflow from the Hub as an example:
On the bottom branch, all of the aggregation and filtering functions are taking place on the DB side, as indicated by the square DB ports. Only at the very last node, the DB Reader, are any results brought back to KNIME table form (black triangle port).
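To make the idea concrete outside of KNIME, here is a minimal Python sketch (using sqlite3 purely as a stand-in; the database file, table, and column names are made up) of the same push-down principle: the filtering and aggregation run inside the database engine, and only the small result set is fetched locally, analogous to what the DB nodes do before the DB Reader.

```python
# Sketch only: illustrates push-down vs. local processing, not KNIME internals.
import sqlite3

conn = sqlite3.connect("sales.db")  # hypothetical example database

# Push-down: the WHERE filter and GROUP BY aggregation run inside the DB engine.
query = """
    SELECT region, SUM(amount) AS total_amount
    FROM sales
    WHERE year = 2020
    GROUP BY region
"""
rows = conn.execute(query).fetchall()  # only the aggregated rows come back

# Anti-pattern: pulling every row locally first would be the equivalent of
# placing the DB Reader node too early in the workflow.
# all_rows = conn.execute("SELECT * FROM sales").fetchall()

conn.close()
print(rows)
```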
First of all, thank you for your support.
I took a look at the nodes in the DB category of the node repository. Those nodes cover the traditional, well-known DB processing. Are there other nodes for text analysis (clustering/detection, classification), such as Strings To Document, that run directly on the DB side?
Hi @ahmed_gomaa -
You can push some of the modeling to the server side using our Spark extension, assuming you have a Spark cluster available. But unfortunately we don’t support push-down for the text processing extension at the moment - sorry about that.
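For a rough idea of what server-side modeling looks like (this is a plain PySpark sketch, not the KNIME Spark nodes themselves, and the input path and column names are hypothetical): the data stays distributed in the cluster during training, and only the fitted model and small summaries come back to the client.

```python
# Minimal PySpark sketch of cluster-side modeling; paths/columns are made up.
from pyspark.sql import SparkSession
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.clustering import KMeans

spark = SparkSession.builder.appName("pushdown-demo").getOrCreate()

# Read directly on the cluster; nothing is collected to the driver yet.
df = spark.read.parquet("hdfs:///data/documents_features")  # hypothetical path

features = VectorAssembler(
    inputCols=["f1", "f2", "f3"],  # hypothetical feature columns
    outputCol="features",
).transform(df)

# The clustering itself runs distributed across the Spark executors.
model = KMeans(k=5, featuresCol="features").fit(features)
print(model.clusterCenters())  # only small summaries return to the client

spark.stop()
```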
Hi @ScottF ,
I tried to open the linked workflow, but I couldn't find it at the given link.
Could you please direct me if it is still available at any other link?
Thanks,
Deniz
Hi @denizkonak ,
You can find the workflow mentioned by @ScottF below:
10. Database - solution.knwf (91.3 KB)
Best,
The link in @ScottF's post has been updated.
Ivan