Streaming with database access

Hi all,

I have a use case where product data is pulled from the database, and for every product one chart is produced and written out as an image. Currently I pull all the data and iterate over it with a group loop, but that is rather slow because the dataset is huge.

I was wondering if there are better ways to do this, e.g. pulling one product at a time within a loop and producing its image while the next product is being pulled from the db. I am just not sure whether this can work with a streaming approach.

Any hints or suggestions are welcome.

Thanks & regards,
Denis

Hi @d3n1s,

I have a few questions:

  1. What kind of charts are you producing in each loop iteration?

  2. How many unique products do you have in your dataset?

A streaming approach may help decrease execution time for some of the nodes in your workflow. I would check how many streaming-enabled nodes you have in your workflow and combine those into a component for streaming. If you’re interested, here’s more information on streaming with KNIME ("Streaming data in KNIME | KNIME") and an example workflow ("Streaming - Text Processing – KNIME Hub").

Some more tips for optimizing your workflow can be found here: Optimizing KNIME workflows for performance | KNIME

I hope some of the resources I’ve provided help!

Cheers,
Dash

Hi,

thanks for the reply.

  1. I am producing a histogram for each product using custom Python code within a Python View node.
  2. The number of unique products is under 100.

The Python Views are relatively slow, so my thinking was that while the Python View for one product is executing, the next product could already be pulled from the db.
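To make the idea concrete, here is a minimal sketch of that overlap in plain Python, outside of KNIME. It is not a KNIME feature and not how the Python View node works internally. The table name `measurements`, its columns `product` and `value`, the SQLite demo database, and the helper names are all assumptions for illustration, and `render_histogram` is only a stand-in for the real (slow) plotting code. A producer thread pulls one product at a time into a small bounded queue while the main thread renders the previous one.

```python
import os
import queue
import sqlite3
import tempfile
import threading
from collections import Counter

_DONE = object()  # sentinel that marks the end of the stream


def fetch_products(db_path, out_queue):
    """Producer: pull one product at a time and hand it to the consumer."""
    conn = sqlite3.connect(db_path)  # connection is created in this thread
    try:
        names = [row[0] for row in conn.execute(
            "SELECT DISTINCT product FROM measurements ORDER BY product")]
        for name in names:
            rows = conn.execute(
                "SELECT value FROM measurements WHERE product = ?", (name,)
            ).fetchall()
            # put() blocks while the queue is full, so only a few products
            # are ever held in memory ahead of the consumer.
            out_queue.put((name, [value for (value,) in rows]))
    finally:
        conn.close()
        out_queue.put(_DONE)


def render_histogram(values, bin_width=10):
    """Stand-in for the slow plotting step: maps bin start -> count."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


def run_pipeline(db_path, prefetch=2):
    """Consumer: render each product while the producer fetches the next."""
    q = queue.Queue(maxsize=prefetch)
    producer = threading.Thread(
        target=fetch_products, args=(db_path, q), daemon=True)
    producer.start()
    results = {}
    while True:
        item = q.get()
        if item is _DONE:
            break
        name, values = item
        results[name] = render_histogram(values)
    producer.join()
    return results


def build_demo_db():
    """Create a tiny throwaway SQLite file with the assumed schema."""
    fd, path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE measurements (product TEXT, value INTEGER)")
    conn.executemany(
        "INSERT INTO measurements VALUES (?, ?)",
        [("A", 1), ("A", 12), ("A", 15), ("B", 3), ("B", 7)])
    conn.commit()
    conn.close()
    return path


if __name__ == "__main__":
    demo_path = build_demo_db()
    try:
        print(run_pipeline(demo_path))
    finally:
        os.remove(demo_path)
```

The overlap only pays off if the database fetch is mostly waiting on I/O. If both steps are CPU-bound Python code, a thread will not give much speed-up and a separate process would probably be the better fit.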

Regards,
Denis

Hi @d3n1s,

Based on your comments, I believe that integrating streaming into your workflow would be a good way to speed it up. I would also recommend trying it in conjunction with the Columnar Table Backend.

What version of KNIME are you using?

Cheers,
Dash

Hi Dash,

I am using KNIME 4.5.2.

Thanks,
Denis

Hi @d3n1s,

Thank you for the info! Have you had any success so far with integrating streaming into your workflow?

Cheers,
Dash

Hi,

I have not reached that point yet. I will certainly post here if it works.

Regards

Hi @d3n1s,

Please do! And if you have questions about how to integrate the Columnar Table Backend into your workflow, here’s a good guide: "Inside KNIME Labs: A New Table Backend for Improved Performance | KNIME"

Cheers,
Dash
