I’m using Ilastik for pixel classification on bio images and I’m experiencing some serious bottlenecking at this node. I noticed that the node execution can’t be chunked or parallelized in any way, so I’m hoping the correct configuration will speed things up.
The machine I’m using (an AWS instance) has 120 GB of RAM available, yet usage peaks at only 7% while Ilastik is executing.
So, I would like to use the absolute maximum values allowed for thread count and max memory, so Ilastik can make full use of the resources available.
Current settings on the Ilastik node (defaults after installation):
More info about my machine:
Sorry for the late reply. I would recommend simply experimenting with these values, but it is quite possible that your task's execution time is bound not by the computation done in Ilastik but by the I/O required to transfer the images to and from it. In that case, more memory will not help you. What you can try is starting several Ilastik instances at once, e.g. via a parallel chunk loop.
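As a point of reference outside of KNIME, ilastik also ships a headless mode that can be scripted directly, which makes it easy to run several instances side by side. Below is a minimal sketch of that idea; the binary name, project file, image names, and the exact export flags are assumptions here, so check them against the headless documentation for your installed ilastik version:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def build_headless_cmd(ilastik_bin, project_ilp, image_path, out_path):
    """Build a headless ilastik pixel-classification command.

    All paths are placeholders. --headless, --project, --export_source,
    and --output_filename_format are documented ilastik CLI options,
    but verify them against the version you have installed.
    """
    return [
        ilastik_bin,
        "--headless",
        f"--project={project_ilp}",
        "--export_source=Probabilities",
        f"--output_filename_format={out_path}",
        image_path,
    ]

def run_batch(ilastik_bin, project_ilp, images, n_workers=4):
    """Run one ilastik process per image, at most n_workers at a time."""
    def run_one(img):
        cmd = build_headless_cmd(ilastik_bin, project_ilp,
                                 img, img + "_probs.h5")
        return subprocess.run(cmd, check=True)

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(run_one, images))

if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    run_batch("run_ilastik.sh", "MyProject.ilp",
              ["img_01.tif", "img_02.tif", "img_03.tif", "img_04.tif"])
```

Whether this is faster than a single instance still depends on where the I/O bottleneck sits, of course.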
Thanks for the response.
I’ve been tinkering with the CPU thread count configuration, but without success.
Parallel chunking doesn’t work with Ilastik out of the box: it seems a single ILP file can’t be shared between the parallel instances.
I split the data with a Row Splitter and tested 2 physical Ilastik nodes in parallel. Both failed when linked to the same ILP file. So I duplicated and renamed the ILP file for the 2nd node, and it executed successfully.
I repeated the process with 4 Ilastik nodes in parallel (linked to 4 renamed ILP files, identical in content), and that also executed successfully. However, 4 Ilastik instances in parallel were no faster than 2, or even 1.
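For anyone repeating this, the duplicate-and-rename workaround above can be scripted so each parallel branch gets its own byte-identical copy of the project file. A minimal sketch (the function name and the `_1`, `_2`, … naming scheme are just my assumptions, not anything Ilastik requires):

```python
import shutil
from pathlib import Path

def duplicate_project(ilp_path, n):
    """Make n byte-identical copies of an Ilastik project file
    (e.g. MyProject_1.ilp ... MyProject_n.ilp), so that no two
    parallel Ilastik instances open the same file."""
    src = Path(ilp_path)
    copies = []
    for i in range(1, n + 1):
        dst = src.with_name(f"{src.stem}_{i}{src.suffix}")
        shutil.copyfile(src, dst)
        copies.append(dst)
    return copies
```

Each parallel Ilastik node is then pointed at its own copy, exactly as in the manual test above.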
All in all, it seems you are correct. My data is being choked somewhere on its way to/from Ilastik, and it isn’t Ilastik’s fault.