Better solution for clustering?

Hello everyone,

I’m currently working on a project where I have to cluster ~25k entries, and I’m using OPTICS clustering. The problem is that this takes too long. Is there any way to reduce the time needed? Is streaming or processing the data in chunks possible?

Thanks!
Gabriella

Hello @gmad23,

I assume that you run the workflow on your local machine, correct? Before changing anything in the workflow, have you made sure that KNIME can access most of the available memory? You can check that via the -Xmx value in your knime.ini file, which is usually located in \Program Files\KNIME.
If not enough memory is allocated, you could increase the -Xmx value, which might speed up the execution of the workflow.
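
If you would rather check the setting from a script than open the file by hand, here is a minimal Python sketch. It assumes the value sits on its own line in the usual `-Xmx<number><unit>` form; the helper name `parse_xmx` and the sample file contents are made up for illustration and are not taken from a real installation.

```python
import re


def parse_xmx(ini_text):
    """Return the -Xmx setting (e.g. '4g') found in knime.ini text, or None."""
    for line in ini_text.splitlines():
        # Match lines such as "-Xmx2048m" or "-Xmx4g" (unit suffix is optional).
        match = re.fullmatch(r"-Xmx(\d+)([kKmMgG]?)", line.strip())
        if match:
            return match.group(1) + match.group(2).lower()
    return None


# Illustrative contents only (assumption: JVM options follow a -vmargs line).
sample_ini = "-vmargs\n-Xmx2048m\n"
print(parse_xmx(sample_ini))
```

To use it on a real file, read the file's text first and pass it in, for example `parse_xmx(open(path_to_knime_ini).read())`, where `path_to_knime_ini` is whatever path applies on your machine.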

Best regards
Jörg

Hello @JoergWas

Thank you for your reply! I forgot to mention that I have already changed -Xmx to 4g. I have also tried executing the workflow in a virtual machine, but that doesn’t really help.

Cheers,
Gabriella

Thanks for the information!
Could you share your workflow with some sample data? Then I can take a closer look.
Also, do you use the “OPTICS Cluster Compute” node to create the model and the “OPTICS Cluster Assign” node to apply it? That way, the initial training might take a long time, but applying the model with the “OPTICS Cluster Assign” node is very fast. It could look like this:
image
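
To illustrate the general idea of splitting "compute" from "assign" outside of KNIME, here is a rough Python sketch. It is not how the KNIME nodes work internally: it swaps in a plain nearest-neighbour rule, assuming you already have a labeled sample from an earlier (slow) clustering run and only want to label further points cheaply. The function name `assign_to_clusters` and the `max_distance` cutoff are hypothetical.

```python
import math


def assign_to_clusters(labeled_sample, new_points, max_distance):
    """Give each new point the label of its nearest labeled sample point.

    labeled_sample: list of (point, label) pairs from an earlier clustering run.
    Points farther than max_distance from every sample point get label -1 (noise).
    """
    assigned = []
    for point in new_points:
        best_label, best_dist = -1, max_distance
        for sample_point, label in labeled_sample:
            dist = math.dist(point, sample_point)  # Euclidean distance
            if dist <= best_dist:
                best_label, best_dist = label, dist
        assigned.append(best_label)
    return assigned


# Made-up example data: two small clusters and one far-away point.
sample = [((0.0, 0.0), 0), ((0.1, 0.0), 0), ((5.0, 5.0), 1)]
new = [(0.05, 0.1), (5.2, 4.9), (20.0, 20.0)]
print(assign_to_clusters(sample, new, 1.0))
```

The expensive part (clustering the sample) happens once; labeling additional rows is then a simple distance lookup, which is why the assignment step tends to be much faster than recomputing the clustering.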

Best regards,
Jörg
