Knime & GitHub - Agile ETL-Collaboration, Solid Backups, and Effortless Workspace Synchronisation

mwiegand · March 25, 2025, 9:10am

You are most welcome. I use Git LFS for several reasons:

Some data, like results saved in binary format, I’d rather not push to Git
Performance reasons
(Data) Safety
Not requiring any sort of versioning of non-volatile data
Using LFS just as a temporary storage with the ability to discard it / start anew
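
For anyone wanting to try this, here is a minimal sketch of how binary results could be routed through Git LFS. It only builds the lines that `git lfs track "<pattern>"` would add to `.gitattributes`. The patterns (`*.bin`, `*.parquet`) and the helper name `lfs_attribute_lines` are assumptions for illustration, not my actual setup.

```python
def lfs_attribute_lines(patterns):
    """Build .gitattributes lines that hand the given file patterns to Git LFS."""
    # Same attribute set that `git lfs track "<pattern>"` writes for each pattern.
    # Patterns containing spaces would need escaping; that is not handled here.
    return [f"{pattern} filter=lfs diff=lfs merge=lfs -text" for pattern in patterns]


# Hypothetical patterns for binary result files; adjust to your own workspace.
patterns = ["*.bin", "*.parquet"]
lines = lfs_attribute_lines(patterns)

# Print the lines so they can be reviewed before being added to .gitattributes.
print("\n".join(lines))
```

The output would go into a `.gitattributes` file at the root of the repository. Anything not matching a pattern stays in regular Git.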

My current setup looks as follows:

Workspace on separate exFAT disk
Non-volatile data, like downloaded log files, saved in separate “Knime-Data” folder
Large non-volatile data, like shapefiles of several GB in size, saved on a separate drive managed by Mountain Duck, which syncs them to AWS S3 and keeps the data in sync across different workstations

My motivation behind that approach is to:

User scenario example:

Local machine with plenty of resources
Mobile laptop with limited resources
Team members with whom I can easily share separate workspaces, w/ or w/o data
Everyone can pick up work at any given point in time and commit, so that someone else can continue from there
Backup, restore, documentation etc., e.g. to mitigate update issues where even Windows can corrupt your data or, as I have experienced myself, a Knime update can cause workflow corruption

Happy kniming

Best
Mike