Knime & GitHub - Agile ETL-Collaboration, Solid Backups, and Effortless Workspace Synchronisation

You are most welcome. I use Git LFS for several reasons:

  1. Some data, like results saved in binary format, I'd rather not push to plain Git
  2. Performance reasons
  3. (Data) Safety
  4. Not requiring any sort of versioning of non-volatile data
  5. Using LFS just as a temporary storage with the ability to discard it / start anew
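For anyone who wants to try this, here is a minimal sketch of the tracking rules involved. Git LFS decides what to handle via lines in .gitattributes, in the form that `git lfs track` writes. The patterns below are hypothetical placeholders, not the ones from my workspace, so adjust them to whatever binary results you produce.

```python
def lfs_attribute_lines(patterns):
    """Build .gitattributes lines in the form that `git lfs track` writes."""
    return [f"{pattern} filter=lfs diff=lfs merge=lfs -text" for pattern in patterns]


# Hypothetical patterns for illustration only; pick the ones that match your data.
example_patterns = ["*.bin", "*.zip"]

# Print the lines so they can be reviewed before being added to .gitattributes.
print("\n".join(lfs_attribute_lines(example_patterns)))
```

Running `git lfs track "*.bin"` in the repository should give the same kind of line, so the script is mostly useful when you want to set up several patterns at once.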

My current setup looks as follows:

  • Workspace on separate exFAT disk
  • Non-volatile data, like downloaded log files, saved in separate “Knime-Data” folder
  • Large non-volatile data, like shapefiles of several GB in size, saved on a separate drive managed by Mountain Duck, which syncs them to AWS S3 and keeps the data in sync across different workstations

My motivation behind that approach is to:

  1. Keep the Workspace size as small as possible
  2. Sync temporarily generated data via LFS
  3. Be able to start, stop, sync and resume while on the go
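To check the first point, a small script along these lines can list the largest files in a workspace, which are the candidates to move out or track via LFS. This is only a sketch: the function name and the size threshold are my own assumptions, and it only skips the .git folder.

```python
from pathlib import Path


def large_files(root, min_bytes):
    """Return (size, relative path) pairs for files of at least min_bytes, largest first."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        # Skip Git's own object store; it is not workspace content.
        if ".git" in relative.parts or not path.is_file():
            continue
        size = path.stat().st_size
        if size >= min_bytes:
            hits.append((size, relative))
    return sorted(hits, reverse=True)
```

For example, `large_files("path/to/workspace", 50 * 1024 * 1024)` would list everything of 50 MB and above; both the path and the threshold are placeholders.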

User scenario example:

  1. Local machine with plenty of resources
  2. Mobile laptop with limited resources
  3. Team members with whom I can easily share separate workspaces, with or without data
  4. Everyone can pick up work at any given point in time: one person commits and another can continue from there
  5. Backup, restore, document, etc., for example to mitigate update issues where even Windows can corrupt your data or, as I have experienced myself, a Knime update can cause workflow corruption

Happy kniming :wink:

Best
Mike
