Takes so much time to open the node dialog

helfortuny · January 27, 2023, 12:13pm

Hi!

My dataset has 183,000 columns and 600 rows. The data is imported from 15 different Excel files through the Excel Reader node. I have increased the memory available to KNIME to 14000 MB, which is enough to hold all the data. Now I would like to build different models and do further processing on the data. The problem is that it takes a very long time to open a node's configuration dialog, from any node. How can I solve this issue?

Thank you so much in advance!!!

helfortuny · January 27, 2023, 12:48pm

Correction: it's only the Random Forest Learner node dialog that is slow (at least among the nodes I currently have in my workflow).

ScottF · January 31, 2023, 4:41pm

Hi @helfortuny -

It sounds like the problem is connected to how wide your dataset is. 183,000 columns is quite a lot, especially relative to your 600 rows. The Random Forest Learner dialog has to load all 183K columns for you to make selections from, which is why it is so slow to open.

I would suggest that some dimensionality reduction is in order here. What type of dataset are you dealing with? Is this perhaps a Document Vector matrix for text analysis, or something else?
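To make the idea of dimensionality reduction concrete, here is a minimal sketch of one simple approach: dropping columns whose variance is at or below a threshold. KNIME has a Low Variance Filter node for this, so the Python below only illustrates the principle and is not what that node runs internally. The function names, the threshold, and the tiny example table are all made up for illustration.

```python
from statistics import pvariance


def low_variance_filter(rows, threshold=0.0):
    """Return indices of columns whose population variance exceeds threshold.

    rows: list of equal-length lists of numbers (one inner list per table row).
    Hypothetical helper for illustration only.
    """
    if not rows:
        return []
    columns = zip(*rows)  # transpose so we can iterate column by column
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def select_columns(rows, keep):
    """Keep only the columns at the given indices, preserving row order."""
    return [[row[i] for i in keep] for row in rows]


# Tiny made-up table: column 1 is constant, column 2 barely varies.
table = [
    [1.0, 5.0, 0.10],
    [2.0, 5.0, 0.11],
    [3.0, 5.0, 0.10],
    [4.0, 5.0, 0.11],
]
keep = low_variance_filter(table, threshold=0.01)  # assumed threshold
reduced = select_columns(table, keep)
print(keep, reduced)
```

With the assumed threshold of 0.01, only the first column survives. On real data the threshold would need tuning, and columns on different scales should probably be normalized first so that the variances are comparable.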

helfortuny · January 31, 2023, 5:00pm

Most of the data are audio signals.

ScottF · January 31, 2023, 5:03pm

Ah, OK, thanks. The fundamental question is: do you really need all 183K features? Not only is it going to affect the configuration of the individual node, but also the overall performance of the model, as well as its interpretability.
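As a rough illustration of what "fewer features" could look like for audio, the sketch below condenses one signal into a handful of summary descriptors. It assumes that each row holds the raw samples of a single signal, which has not been confirmed in this thread. The function name and the choice of descriptors (mean, RMS, peak, zero-crossing rate) are hypothetical; whether they suit your data depends on the task.

```python
import math


def summarize_signal(samples):
    """Condense one signal (a list of numeric samples) into a few descriptors.

    Hypothetical helper: the chosen descriptors are illustrative assumptions,
    not a recommendation for any particular dataset.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("signal must contain at least one sample")
    mean = sum(samples) / n
    rms = math.sqrt(sum(x * x for x in samples) / n)  # root mean square
    peak = max(abs(x) for x in samples)               # largest absolute sample
    # Count sign changes between consecutive samples.
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    zcr = crossings / (n - 1) if n > 1 else 0.0       # zero-crossing rate
    return {"mean": mean, "rms": rms, "peak": peak, "zero_crossing_rate": zcr}


# Made-up signal that alternates between +1 and -1.
features = summarize_signal([1.0, -1.0, 1.0, -1.0])
print(features)
```

Applied to every row, something like this would replace thousands of sample columns with a few feature columns, which also makes the resulting model easier to interpret.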
