Awaiting permission for release is a set of nodes to replace (or at least complement) the gzip nodes (glad you like those, by the way).
Here's a preview showing the internal version:
The "Archive" nodes handle archives, with or without compression (e.g. Zip - with compression, tar - no compression), and the "Compression" nodes handle non-archive compression formats (gzip, bzip2 etc.)
These are the sort of options you get (this is for the Expand Binary Objects Archives node - the others are similar):
You will see that there are also some handy efficiency features in there, such as being able to filter the paths of the files in the archive which get expanded. (Writing BLOB cells in KNIME is really quite IO-heavy, so creating only the ones you need in the first place is much more efficient than expanding the whole thing and then row filtering afterwards.)
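To illustrate why filtering before expanding is cheaper, here is a minimal Python sketch of the general idea. It is not the nodes' implementation (they are KNIME nodes, and their internals are not shown in this thread); the function names `expand_matching` and `build_demo_zip` and the glob-style pattern are assumptions made purely for illustration.

```python
import fnmatch
import io
import zipfile


def expand_matching(archive_bytes, pattern):
    """Return {path: content} for zip entries whose path matches a glob pattern."""
    extracted = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue
            # Test the path first, so unwanted entries are never decompressed.
            if fnmatch.fnmatch(info.filename, pattern):
                extracted[info.filename] = archive.read(info)
    return extracted


def build_demo_zip():
    """Build a small in-memory zip (hypothetical demo data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("data/a.csv", "x,y\n1,2\n")
        archive.writestr("data/b.txt", "notes")
        archive.writestr("img/c.png", b"\x89PNG")
    return buffer.getvalue()


csv_only = expand_matching(build_demo_zip(), "*.csv")
```

Only the entries that pass the path test are ever read, which is the same saving as not writing the unwanted BLOB cells in the first place.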
As well as specifying the archive or compression format, there is also a "guess" option - which tries to do exactly that, on a row-by-row basis (obviously, you don't get the full control over all the format-specific settings that way!)
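How the "guess" option works internally is not described here. One common way to guess a format per row is to look at the leading "magic" bytes of each BLOB, and the Python sketch below assumes that approach. The names `guess_format` and `guess_formats` are hypothetical, and only four formats are covered.

```python
import bz2
import gzip

# Well-known leading signatures for a few formats.
MAGIC_PREFIXES = [
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"PK\x03\x04", "zip"),
]


def guess_format(blob):
    """Guess the format of one BLOB from its magic bytes, or return None."""
    for prefix, name in MAGIC_PREFIXES:
        if blob.startswith(prefix):
            return name
    # tar has no leading signature; "ustar" sits at offset 257 of the first header.
    if blob[257:262] == b"ustar":
        return "tar"
    return None


def guess_formats(rows):
    """Apply the guess independently to each row, as a row-by-row option would."""
    return [guess_format(blob) for blob in rows]


demo_rows = [gzip.compress(b"hello"), bz2.compress(b"hello"), b"plain text"]
guessed = guess_formats(demo_rows)
```

Because each row is inspected on its own, a single column can hold a mix of formats, at the cost of not being able to tune the format-specific settings.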
I need to add a couple of additional features (security settings to prevent "decompression bombs"; see the Wikipedia article on zip bombs) and then find the opportunity to get it out to the public release. That's not going to be today(!), but I would hope we can roll it out in the not too distant future.
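For anyone unfamiliar with the problem, a typical guard against decompression bombs is to decompress in chunks and stop once the output passes a limit. The Python sketch below shows that general technique for gzip only. It is an assumption about what such a setting might look like, not a description of the nodes' security options; `safe_gunzip`, `DecompressionLimitExceeded` and `max_output_bytes` are hypothetical names.

```python
import gzip
import io


class DecompressionLimitExceeded(Exception):
    """Raised when decompressed output would exceed the configured limit."""


def safe_gunzip(blob, max_output_bytes, chunk_size=64 * 1024):
    """Decompress gzip data in chunks, aborting once the output limit is passed."""
    output = io.BytesIO()
    with gzip.GzipFile(fileobj=io.BytesIO(blob)) as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            # Check before writing, so memory use stays close to the limit.
            if output.tell() + len(chunk) > max_output_bytes:
                raise DecompressionLimitExceeded(
                    f"output exceeds {max_output_bytes} bytes"
                )
            output.write(chunk)
    return output.getvalue()
```

A small input such as `gzip.compress(b"hello")` passes straight through, while a megabyte of zeros compressed to about a kilobyte is rejected as soon as the limit is crossed rather than after it has been fully expanded.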
Regarding compound BLOBs (e.g. .tar.gz): you can put a Decompress followed by an expand archive and, with the right settings, it "just works"!
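The same two-stage idea can be sketched outside KNIME in Python: first strip the gzip layer, then expand the plain tar that remains. This is only an analogy for chaining the two nodes, and `expand_tar_gz` and `build_demo_tar_gz` are hypothetical helper names.

```python
import gzip
import io
import tarfile


def expand_tar_gz(blob):
    """Decompress a .tar.gz BLOB, then expand the inner tar into {path: content}."""
    tar_bytes = gzip.decompress(blob)  # stage 1: the "decompress" step
    files = {}
    # Stage 2: the "expand archive" step; mode "r:" reads an uncompressed tar only.
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as archive:
        for member in archive.getmembers():
            if member.isfile():
                files[member.name] = archive.extractfile(member).read()
    return files


def build_demo_tar_gz():
    """Build a small in-memory .tar.gz (hypothetical demo data)."""
    raw = io.BytesIO()
    with tarfile.open(fileobj=raw, mode="w") as archive:
        payload = b"hello"
        info = tarfile.TarInfo(name="docs/readme.txt")
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return gzip.compress(raw.getvalue())


expanded = expand_tar_gz(build_demo_tar_gz())
```

Keeping the two stages separate means each one only has to understand a single format, which is why the compression and archive steps can be mixed and matched.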
More anon - I will pencil this in as our 1.38.0 release
Good news is I have managed to add the security options mentioned to the internal version, so I think I can press on with releasing in the next week or so
Unfortunate, but maybe we can meet at another KNIME event in the future. I would love to shake hands and show my gratitude for your awesome contribution in person.
Indeed - you too (and if you find yourself in or near Cambridge, UK, then do give us a shout!)
We nearly made it, but unfortunately it was very difficult to justify 3+ days out of office with very little cheminformatics / life sciences on the agenda (not to mention leaving home at 4.30am and arriving back at around 1am due to the rubbish flight times).