I would like to ask if anyone has any best practice guidelines for using KNIME Server, particularly regarding folder structure, workflow management (naming conventions, documentation, deployment), and data management.
We are a small team that will start using KNIME Server more intensively. Before migrating all our local workflows to the server, I am curious whether anyone has materials or resources that explain how to structure folders and workflows logically on the server. If you don’t have materials but can share examples that work well in your organization, that would also be great.
So far, I am considering the following folder structure:
KNIME Server
  1. Development
    1.1 User1
    1.2 User2
    …
  2. Staging
    2.1 Workflows
  3. Production
    3.1 Workflows
    3.2 DataApps
    3.3 WebServices
  4. SharedResources
    4.1 Data
    4.2 Components
    4.3 Templates
  5. Documentation
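To try this layout out locally before touching the server, a minimal Python sketch like the one below could create the same tree on disk. It only makes plain directories and does not talk to KNIME Server at all; the `LAYOUT` dictionary and the `create_skeleton` function are hypothetical names, and `User1`/`User2` are placeholders from the list above.

```python
from pathlib import Path

# Folder layout copied from the proposal above; user names are placeholders.
LAYOUT = {
    "Development": ["User1", "User2"],
    "Staging": ["Workflows"],
    "Production": ["Workflows", "DataApps", "WebServices"],
    "SharedResources": ["Data", "Components", "Templates"],
    "Documentation": [],
}


def create_skeleton(root: Path) -> list[Path]:
    """Create the folder tree under root and return all folders, sorted."""
    created = []
    for top, children in LAYOUT.items():
        top_dir = root / top
        top_dir.mkdir(parents=True, exist_ok=True)  # safe to re-run
        created.append(top_dir)
        for child in children:
            child_dir = top_dir / child
            child_dir.mkdir(exist_ok=True)
            created.append(child_dir)
    return sorted(created)
```

Calling `create_skeleton(Path("knime_layout_test"))` would build the tree in a scratch folder, which can then be inspected or rearranged before anything is created on the server.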
I am very interested to hear about the structures you use or any guidelines you follow.
I can add a few points, though much depends on the nature of your projects: will they be long-running, or will your use cases shift constantly?
Two organisational patterns I like:
Year folders: I like to group work by year. Old material ages out to a certain degree, and the folder name tells you when something was done.
Project folders whose names start with a letter, followed by some sort of ID number. Maybe you have a Jira ticket or project number, to which you then add a description. That way you can search for the number. Often you will remember it anyway and simply refer to the project by it.
Maybe use two very short top layers and then longer names for the workflow groups. That way you drill down fast, and the longer names will not fill up your screen.
SALES /
  2022 /
    a_REP_45677_prepare_reporting/ (folder)
      data (folder)
      script (folder)
      result (folder)
    m_TEAM_B_2022_675654_sales_machine_learning (workflow)
    m_TEAM_B_2022_675654_sales_machine_learning_preparation
Since I have been around for some time, I like to stick to no special characters and to use only underscores as separators in folder and workflow names.
Inside folders I like the workflow names to start with a letter again, followed by a number that sets the order, counted in steps of 5 or 10.
a_REP_64646_machine_learning_project (top folder)
  data (folder)
  script (folder)
  result (folder)
  graphics (folder)
  b_001_load_data (workflow)
  b_005_preprocessing
  b_010_machine_learning
  b_050_evaluation
This way you have some order, but you can still insert additional workflows where needed without renumbering everything. It may also help to use relative paths from the workflows to your data folders (…/data/), so you can move the whole top-level structure without breaking anything.
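As a rough illustration of the numbering idea, the Python sketch below checks a workflow name against the convention and proposes a name for a new workflow that sorts between two existing ones. The exact pattern (one lowercase letter, a three-digit number, then a lowercase description) is an assumption read off the examples above, and `parse_name` and `name_between` are hypothetical helpers, not anything KNIME provides.

```python
import re

# Assumed pattern: letter, underscore, 3-digit step, underscore, description
# made of lowercase letters/digits separated by single underscores.
WORKFLOW_NAME = re.compile(r"^([a-z])_(\d{3})_([a-z0-9]+(?:_[a-z0-9]+)*)$")


def parse_name(name: str):
    """Return (prefix, step, description), or None if the name does not match."""
    m = WORKFLOW_NAME.match(name)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), m.group(3)


def name_between(before: str, after: str, description: str) -> str:
    """Build a name whose step number falls between two existing workflows."""
    first, second = parse_name(before), parse_name(after)
    if first is None or second is None:
        raise ValueError("both names must follow the naming convention")
    prefix, low, _ = first
    _, high, _ = second
    if high - low < 2:
        raise ValueError("no free step number between the two workflows")
    step = (low + high) // 2  # midpoint leaves room on both sides
    return f"{prefix}_{step:03d}_{description}"
```

For example, `name_between("b_010_machine_learning", "b_050_evaluation", "tuning")` gives `b_030_tuning`. Because the numbers are zero-padded, plain alphabetical sorting in the explorer keeps the workflows in step order.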
You can also export the whole top folder (maybe after resetting the workflows) and leave out the /data/ folder. Then you have just your logic and structure, without the large datasets.
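If the project folder also lives on a normal file system, the same idea can be approximated outside KNIME. The sketch below is not KNIME's export function; it just writes a plain zip archive of a top folder while skipping its `data` sub-folder. The function name `archive_without_data` and its parameters are hypothetical, and the archive should be written somewhere outside the top folder.

```python
import zipfile
from pathlib import Path


def archive_without_data(top_folder: Path, archive: Path, skip: str = "data") -> list[str]:
    """Zip all files under top_folder except those in the sub-folder `skip`
    directly below it. Returns the archived paths relative to top_folder."""
    written = []
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(top_folder.rglob("*")):
            rel = path.relative_to(top_folder)
            # Skip the data folder and directory entries; only files are stored.
            if rel.parts[0] == skip or path.is_dir():
                continue
            # Keep the top folder name as the root inside the archive.
            zf.write(path, arcname=str(Path(top_folder.name) / rel))
            written.append(rel.as_posix())
    return written
```

A call such as `archive_without_data(Path("a_REP_64646_machine_learning_project"), Path("project_logic.zip"))` would keep the script, result and graphics folders and the workflows, and leave the datasets behind.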
I also like to develop workflows in a local KNIME installation and then upload them to the server with a description of the changes.