Cross Joiner

For my case I found the node almost non-functional. With the in-memory cache option, all memory is used up quickly and the node appears stuck and does not react to F9. With the write-to-disk option, the estimated time to completion is 70 min, and the node does react to F9. My version is 4.1.3 on Win 10.

Can you provide some additional information on the data that you are trying to join? Maybe a table spec and the number of rows.

I am joining about 7K records, each with a text field of about 800 characters.

You can use the Cross Joiner on a table this size, but you need to keep in mind that it creates the cross join of both tables. Hence inputting 7k*7k comes to 49 million rows. Creating such a big data set takes time.
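
To put rough numbers on that, here is a small Python sketch. It assumes both inputs have about 7K rows (as the 7k*7k figure does), that each output row carries the ~800-character text field from both sides, and 1 byte per character. These are assumptions for illustration only; the actual memory footprint inside KNIME will differ.

```python
# Back-of-the-envelope size of the cross join (all inputs are assumptions).
rows_per_table = 7_000        # ~7K records on each input port
chars_per_text_field = 800    # ~800 characters in the text field
bytes_per_char = 1            # assumption: 1 byte per character

# A cross join pairs every row of one table with every row of the other.
cross_join_rows = rows_per_table * rows_per_table

# Each output row holds the text field from both sides.
bytes_per_output_row = 2 * chars_per_text_field * bytes_per_char
total_bytes = cross_join_rows * bytes_per_output_row

print(f"rows: {cross_join_rows:,}")
print(f"raw text size: {total_bytes / 1e9:.1f} GB")
```

Under these assumptions the text alone is far larger than typical RAM, which would explain why the in-memory option fills up so quickly.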

However, if you want to reduce the rows again afterwards, e.g. with a specific filter such as the Rule-based Row Filter, you can stream it.
The first port of the Cross Joiner can be streamed. With streaming activated for this task, the full table is never created, and it will be a lot faster.

I made a quick test. I used the Data Generator and cross joined its output with itself. Afterwards I filtered for rows where the first feature is equal on both sides, using the Rule-based Row Filter.

This took 9 s when streamed, and around 4 minutes unstreamed.
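
The difference between the two runs can be illustrated outside KNIME with a plain Python analogy. This is only a sketch of the general idea, not how the Cross Joiner or KNIME streaming is implemented. The toy table and the names `cross_join` and `first_feature_equal` are made up for the example, and "first feature equal" is assumed to mean that the first column matches on both sides.

```python
from itertools import product

def cross_join(left, right):
    """Yield every (left_row, right_row) pair lazily, one at a time."""
    # product() copies its inputs (small here) but yields the pairs lazily.
    return product(left, right)

def first_feature_equal(pair):
    """Hypothetical filter: keep pairs whose first column matches."""
    left_row, right_row = pair
    return left_row[0] == right_row[0]

# Toy stand-in for the generated data: 200 rows, first feature in 0..4.
table = [(i % 5, f"row{i}") for i in range(200)]

# Unstreamed: build the full cross join first, then filter it.
full = list(cross_join(table, table))          # 200 * 200 pairs held in memory
matches_unstreamed = [p for p in full if first_feature_equal(p)]

# Streamed: filter each pair as it is produced; the full join is never stored.
matches_streamed = [p for p in cross_join(table, table) if first_feature_equal(p)]

print(len(full), len(matches_streamed))
```

Both variants give the same result, but the streamed one only ever keeps the matching pairs, which is why it needs far less memory when the filter discards most rows.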


Thank you, @Iris. Streaming with the Rule-based Row Filter is very fast and not resource hungry. Now it takes just a couple of minutes.

