Using git (or another revision control system) for KNIME workflows

Does anybody have experience using a revision control system (like SVN, CVS, Git, ...) for KNIME workflows?

I've had a quick look at the file structure, and it looks like you could do this by just keeping a few files under revision control. You definitely don't want the complete structure, as this also contains a cache of the output data for each node. Is there a detailed explanation of all the files in a KNIME workflow somewhere?

Regards,

Tim


 


I've installed the Subclipse plugin before and tried to use SVN to version workflows.

There was a recurring issue: once a workflow had been checked in and then edited in KNIME, errors occurred when trying to commit the updates.

Would you always want to discard the data? You could instead reset the workflow before committing whenever you don't wish to keep it, which leaves the option open either way.

Hi Tim,

This just came to mind: can you exclude the data via the .gitignore file?

Christian

 

Tim,

I have used Mercurial with my flows in the past by turning the folder in the workspace into a repository, just like I would with a Java project. The data could not be restored when switching revisions, so I got a bunch of error messages, and the repo may have become overly bloated, but everything else worked as it should.

There seems to be an "official" solution, but only in the most advanced server version, so I don't think you will find a list of the files that easily. On smaller scales, something simple like this should work, but remember to reset your flow before committing, and there are no guarantees.
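As a reminder to reset before committing, here is a minimal Python sketch that lists workflow folders still carrying a `.savedWithData` file. It assumes that the presence of this marker means the workflow was saved with executed node data; that reading is inferred from the file name and is not confirmed anywhere in this thread. The function name and the workspace path are made up for illustration.

```python
from pathlib import Path


def workflows_saved_with_data(workspace):
    """Return folders under `workspace` that contain a .savedWithData marker.

    ASSUMPTION: the marker file means the workflow was saved with node data.
    """
    # rglob searches recursively and also matches dot-files.
    markers = Path(workspace).rglob(".savedWithData")
    return sorted(marker.parent for marker in markers)
```

Calling `workflows_saved_with_data("path/to/knime-workspace")` before a commit and resetting any workflow it reports would be one way to apply the advice above.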


 

For those interested:

I'm currently using a .gitignore file (see below) and command-line git. I tried re-opening workflows that were restored using just the files in the git repository, and KNIME never complained, which seems to indicate that all the necessary information is backed up to git.

Here are the contents of my .gitignore:

## KNIME files
.knimeLock
.project
.savedWithData
workflowset.meta

## KNIME directories containing node data
internal/
port_*/
.metadata/
internalTables/

# Packages #
*.7z
*.dmg
*.gz
*.iso
*.jar
*.rar
*.tar
*.zip
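To preview what the KNIME-specific patterns above would exclude, here is a rough Python sketch. It only approximates gitignore matching (no negation, no anchoring, package patterns left out), so `git status` remains the real check. The example paths in the usage note are illustrative, not taken from an actual workflow.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the .gitignore above (package patterns omitted).
IGNORED_FILES = [".knimeLock", ".project", ".savedWithData", "workflowset.meta"]
IGNORED_DIRS = ["internal", "port_*", ".metadata", "internalTables"]


def is_ignored(path):
    """Approximate whether a repo-relative file path would be ignored."""
    parts = PurePosixPath(path).parts
    if not parts:
        return False
    # A directory pattern such as "internal/" hides everything beneath it.
    for directory in parts[:-1]:
        if any(fnmatchcase(directory, pat) for pat in IGNORED_DIRS):
            return True
    # Otherwise only the file name itself is compared.
    return any(fnmatchcase(parts[-1], pat) for pat in IGNORED_FILES)
```

With these patterns, `is_ignored("MyFlow/Node (#1)/port_1/data.xml")` is true because of the `port_*/` rule, while `is_ignored("MyFlow/Node (#1)/settings.xml")` is false, so the settings file stays under version control.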

 

 

 

 


If you’re still around, how did you find using git for workflow source control overall?

Every time anyone opens a workflow, all the settings.xml files change in small ways, such as “last update” or “last view” values.
If you are not the only person working in your group, this behavior generates a lot of conflicts. Does anyone have a fix for this problem?
What is the best solution?
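One possible direction, untested against KNIME, is to strip the volatile entries before they reach the repository. The Python sketch below drops lines holding such entries. The key names `last_update` and `last_view` are placeholders standing in for the fields mentioned above, and the sketch assumes one `<entry key="..." .../>` element per line; both would need checking against real settings.xml files, as would whether KNIME still opens a workflow with those entries removed.

```python
import re
import sys

# HYPOTHETICAL key names; replace with the keys that actually change.
VOLATILE_KEYS = ("last_update", "last_view")


def strip_volatile(xml_text, keys=VOLATILE_KEYS):
    """Remove every line that holds an <entry> element for a volatile key."""
    alternatives = "|".join(re.escape(key) for key in keys)
    pattern = re.compile(r'<entry\s+key="(?:%s)"' % alternatives)
    kept = [
        line
        for line in xml_text.splitlines(keepends=True)
        if not pattern.search(line)
    ]
    return "".join(kept)


def main():
    """Filter standard input to standard output, as a git clean filter does."""
    sys.stdout.write(strip_volatile(sys.stdin.read()))
```

A script that calls `main()` could be registered as a git clean filter through `git config filter.<name>.clean` plus a matching `filter=<name>` line for settings.xml in `.gitattributes`, so that everyone in the group commits the same normalized content.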
