DB Reader Performance

Hello everyone,

I need to process a very large number of files (~3 billion) and split them into hot and cold. The data comes from a database and has to be written to one of two downstream nodes, depending on whether each entry is classified as hot or cold. The bottleneck is the DB reader: it has to read every line before passing it on to one of the two following nodes.

Do you have any ideas on how this could be built differently so that it runs faster?

Thanks in advance! :slight_smile:

To start, look at this example

Streaming setup is here
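
For readers who cannot open the linked setup, here is a rough sketch of the general idea behind streaming: pull rows from the database in batches and hand each batch straight to the hot or cold writer, instead of reading the whole table first. This is only an illustration in plain Python, not the workflow from the link. The `events` table, the `age_days` column, the "younger than 3 days is hot" rule, and the list-based writers are all assumptions made up for the example, and `sqlite3` stands in for whatever database is actually used.

```python
import sqlite3
from typing import Callable, Iterator, List, Sequence, Tuple

Row = Tuple  # one database row


def stream_batches(conn: sqlite3.Connection, query: str,
                   batch_size: int = 10_000) -> Iterator[List[Row]]:
    """Yield query results in batches so the full result set is never held in memory."""
    cur = conn.cursor()
    cur.execute(query)
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield rows


def route(conn: sqlite3.Connection, query: str,
          is_hot: Callable[[Row], bool],
          write_hot: Callable[[Sequence[Row]], None],
          write_cold: Callable[[Sequence[Row]], None],
          batch_size: int = 10_000) -> Tuple[int, int]:
    """Split each batch into hot and cold rows and pass them to the matching writer."""
    hot_total = cold_total = 0
    for batch in stream_batches(conn, query, batch_size):
        hot: List[Row] = []
        cold: List[Row] = []
        for row in batch:
            (hot if is_hot(row) else cold).append(row)
        if hot:
            write_hot(hot)      # one bulk write per batch, not one per row
        if cold:
            write_cold(cold)
        hot_total += len(hot)
        cold_total += len(cold)
    return hot_total, cold_total


def demo():
    # Hypothetical data: ten rows whose age_days runs from 0 to 9.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, age_days INTEGER)")
    conn.executemany("INSERT INTO events (id, age_days) VALUES (?, ?)",
                     [(i, i) for i in range(10)])
    hot_sink: List[Row] = []
    cold_sink: List[Row] = []
    counts = route(conn, "SELECT id, age_days FROM events",
                   is_hot=lambda row: row[1] < 3,   # assumed rule: < 3 days is hot
                   write_hot=hot_sink.extend,
                   write_cold=cold_sink.extend,
                   batch_size=4)
    conn.close()
    return counts, hot_sink, cold_sink


if __name__ == "__main__":
    print(demo()[0])
```

If the hot/cold rule can be expressed in SQL, it may also be worth running two filtered queries (one per class) so the database does the split and each target is fed by its own reader; whether that helps depends on the indexes available.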


Thank you very much for your recommendation and the workflow example!
