I am wondering how KNIME takes care of the increasing need for more disk space. The more extensions are installed, the more disk space is used (of course). However, I was quite “shocked” that my KNIME installation takes more than 19 GB on my disk.
I’d like to understand the installation and update process in a bit more detail:
When a new version of an existing extension is installed, are the old files deleted?
Does every extension use its very own (core) libraries? I understand that there are dependencies, but I don’t fully understand the concept behind them in KNIME (Python libs etc. are clear, but what about the extensions?).
Is there some kind of management tool to take care of KNIME disk usage?
Is disk space “a thing” being considered when programming nodes / extensions?
Good question, and yes, especially with many Python extensions installed the KNIME installation can be quite big.
Each KNIME extension can bring the Java libraries it uses, and each Python-based extension ships a full Python environment. Uninstalling extensions will remove the Python environment, but Java libraries are only removed when the packaging mechanism that KNIME uses (actually Eclipse’s packaging) decides that it’s time to clean up. That’s unfortunately a bit of a mystery.
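If you want to see whether old Java libraries are still lying around, a small script can list bundles that are present in more than one version. This is only a rough sketch: it assumes the usual Eclipse layout, where the entries in a plugins folder are named <symbolic name>_<version> (optionally ending in .jar). The function name find_multi_version_bundles is made up for this example, and the script only reports, it does not delete anything.

```python
import re
from collections import defaultdict

# Assumption: entries follow the Eclipse convention "<symbolic name>_<major.minor.micro[.qualifier]>".
BUNDLE_PATTERN = re.compile(r"^(?P<name>.+)_(?P<version>\d+\.\d+\.\d+(?:\..+)?)$")


def find_multi_version_bundles(entry_names):
    """Return {symbolic name: sorted versions} for bundles found in more than one version."""
    versions = defaultdict(list)
    for entry in entry_names:
        # Bundles are either plain folders or .jar files; drop the suffix if present.
        stem = entry[:-4] if entry.endswith(".jar") else entry
        match = BUNDLE_PATTERN.match(stem)
        if match:
            versions[match["name"]].append(match["version"])
    # Keep only the bundles that occur with at least two versions.
    return {name: sorted(found) for name, found in versions.items() if len(found) > 1}
```

To try it on a real installation you could pass it the output of os.listdir() for the plugins folder inside your KNIME installation (the folder name is an assumption here, so check your own install). Note that several versions of one bundle can be installed on purpose, so a hit is not automatically safe to delete.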
Regarding Python-based extensions, we are working on reducing the disk space they need. Starting with 5.4.1 we force-clean a cache of the Python packaging mechanism that we use (you can also do this in prior KNIME versions by deleting the folder <KNIME INSTALL>/bundling/root). And we want to make extensions share parts of their Python environments to reduce the disk size.
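Before deleting anything, it may help to check how much space the individual folders of the installation actually take. The following is a minimal sketch, not an official KNIME tool: dir_size and subdir_sizes are hypothetical helper names, and you have to supply the path to your own installation.

```python
import os
from pathlib import Path


def dir_size(path):
    """Sum the sizes (in bytes) of all regular files below path, skipping symlinks."""
    total = 0
    for current, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(current, name)
            if os.path.islink(file_path):
                continue
            try:
                total += os.path.getsize(file_path)
            except OSError:
                # The file vanished or is unreadable; ignore it for the estimate.
                pass
    return total


def subdir_sizes(root):
    """Return (folder name, size in bytes) for each direct subfolder of root, largest first."""
    root = Path(root)
    sizes = {child.name: dir_size(child) for child in root.iterdir() if child.is_dir()}
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)
```

Calling subdir_sizes() on the KNIME installation folder shows which top-level folders are the biggest, and dir_size() on the bundling/root folder mentioned above tells you how much the cache would free up.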
thank you for the insights.
Sounds great, please keep those points in mind when releasing new versions (of extensions). Disk space usage and performance in general (as mentioned in other current topics) are always important things to consider for future releases.
The more effectively the tool works, the happier the user is.