I’m currently working on a project where I have to cluster ~25k entries, and I’m using OPTICS clustering. The problem is that this takes too long. Is there any way to reduce the time needed? Is streaming or processing the data in chunks possible?
I assume that you run the workflow on your local machine, correct? Before changing anything in the workflow, have you made sure that KNIME can access most of the available memory? You can check that via the Xmx value in your knime.ini file, which is usually located in
If not enough memory is allocated, you could increase the Xmx value, which might speed up the execution of the workflow.
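If you want to check the setting without opening the file by hand, here is a minimal Python sketch that pulls the Xmx value out of knime.ini-style text. The helper name `find_xmx` and the sample text are made up for illustration; a real knime.ini contains more lines, and you would read it from wherever it lives on your machine.

```python
import re
from typing import Optional


def find_xmx(ini_text: str) -> Optional[str]:
    """Return the value of the last -Xmx line in the given text, or None.

    Hypothetical helper for illustration: it only recognizes lines of
    the form "-Xmx<number><optional unit>", e.g. "-Xmx4g" or "-Xmx2048m".
    """
    value = None
    for line in ini_text.splitlines():
        match = re.fullmatch(r"-Xmx(\d+[kKmMgG]?)", line.strip())
        if match:
            value = match.group(1)
    return value


# Made-up sample; not a complete knime.ini.
sample = "-vmargs\n-Xmx4g\n"
print(find_xmx(sample))  # prints 4g
```

If this returns `None`, no line of that form was found in the text you passed in.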
Thank you for your reply! I forgot to mention that I have already changed -Xmx to 4g. I have also tried executing the workflow in a virtual machine, but that doesn’t really help.
Thanks for the information!
Could you share your workflow with some sample data? Then I can have a closer look.
Also, do you use the “OPTICS Cluster Compute” node to create the model and the “OPTICS Cluster Assign” node to apply it? That way, the initial training might take a long time, but applying the model with the “OPTICS Cluster Assign” node is very fast.