Here are the open questions that @Antonina_P has just answered. Thanks, Antonina!
Q: How do you build, iterate on and improve one big workflow in a team? How did you manage collaboration and versioning for this one big workflow? Did you assign parts, e.g., data acquisition, processing, to different members and then assemble these parts?
A: It depends on the use case. If two people are working on the same workflow, it is easier to organize a meeting, sit down and do it together, splitting the tasks as you go. Another best practice is to use annotations at the beginning to create a “sketch” of the future workflow and then split the responsibilities (for example, by using different colors on the annotation boxes). Assembling the workflow afterwards is easy with KNIME: you copy and paste the relevant nodes back together.
Versioning is also quite important. We usually keep an archive of older versions of the workflow for a while after releasing a new version, just to be sure that the new one is stable enough.
I think a lot will change for us in the future when we switch to KNIME Business Hub, where collaboration will become even more comfortable.
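The habit of keeping older versions around for a while can be sketched in a few lines of Python. This is only an illustration and not how the team actually does it: the function names `archive_release` and `version_key`, the file name `demand_planning.knwf`, the `_v<version>` naming scheme and the `keep` parameter are all assumptions made for the example.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Tuple


def version_key(path: Path) -> Tuple[int, ...]:
    """Turn 'name_v1.10.0.knwf' into (1, 10, 0) so versions sort numerically."""
    version = path.stem.rsplit("_v", 1)[1]
    return tuple(int(part) for part in version.split("."))


def archive_release(workflow_file: Path, archive_dir: Path,
                    version: str, keep: int = 3) -> List[Path]:
    """Copy the released workflow into the archive and prune the oldest copies."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    archive_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: <workflow name>_v<version><extension>
    target = archive_dir / f"{workflow_file.stem}_v{version}{workflow_file.suffix}"
    shutil.copy2(workflow_file, target)
    pattern = f"{workflow_file.stem}_v*{workflow_file.suffix}"
    archived = sorted(archive_dir.glob(pattern), key=version_key)
    # Delete everything except the `keep` most recent versions.
    for outdated in archived[:-keep]:
        outdated.unlink()
    return archived[-keep:]


# Demo in a throwaway directory; the file content is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    workflow = Path(tmp) / "demand_planning.knwf"
    workflow.write_bytes(b"placeholder for an exported workflow")
    archive = Path(tmp) / "archive"
    for released in ["1.0.0", "1.1.0", "1.2.0", "1.10.0"]:
        remaining = archive_release(workflow, archive, released, keep=3)
    remaining_names = [p.name for p in remaining]
```

Sorting on the parsed version rather than on the file name keeps 1.10.0 after 1.2.0, which a plain alphabetical sort would get wrong.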
Q: What sort of governance do you have in place around the use of KNIME in the larger organisation? Especially when other teams start to onboard to the use of KNIME but your team appears to be “the owner”.
A: Since KNIME is open source and offers loads of resources, the need for typical governance and onboarding from a central instance is not as big. Because we were really invested in making KNIME big in the organization, also outside of the supply chain, we were happy to support other teams trying it out by having short calls about how we use it and how we started. Most of the time that is enough to get them going.
However, once it becomes bigger in a company, you start to realize that it would be nice to have someone who feels responsible for KNIME, a product owner. I would recommend having this role as part of the IT organization, which usually also organizes the Business Hub and can define how its resources are assigned and the rules for using it.
Q: I am an avid KNIME user and a supply chain professional, so it was great to hear this talk. What are your thoughts on modularity and exception handling in workflows? Typically, when we build a workflow, it is for the ideal, good-data situation. We sometimes do not know or expect the kind of quantitative or nomenclature anomalies in the data stream. As for modularity, I mean that the aggregation levels for the analytics sometimes change from user to user.
A: I think exception handling is a normal part of creating a workflow, so I suggest expecting it. It is part of data wrangling/cleansing and is very important to do at the very beginning of the workflow in order to ensure good results at the end.
However, if you have to spend a lot of time (take 30-50% as a rule of thumb) on exception handling alone, you might want to look for the root causes. Is it coming from the source systems? Is the process behind this data not standardized? What can be done to make the data “cleaner”? Sometimes not being able to work with the data in the first place is also a very good finding that helps improve the processes behind it.
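To make the idea of handling exceptions at the very beginning concrete, here is a minimal Python sketch that splits incoming rows into clean rows and exceptions and counts the reasons. The column names, the validation rules and the helpers `check_row` and `split_rows` are assumptions for illustration only; real rules depend on your data. Counting the reasons is one simple way to see which anomaly dominates before looking for its root cause.

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Tuple


def check_row(row: Dict[str, str]) -> Optional[str]:
    """Return a reason if the row is an exception, otherwise None."""
    material = str(row.get("material", "")).strip()
    if not material:
        return "missing material"
    try:
        quantity = float(row.get("quantity"))
    except (TypeError, ValueError):
        # Covers missing values and formats such as a decimal comma.
        return "quantity not numeric"
    if math.isnan(quantity) or quantity < 0:
        return "quantity out of range"
    return None


def split_rows(rows: List[Dict[str, str]]) -> Tuple[List[dict], List[dict]]:
    """Separate usable rows from exceptions, keeping the reason with each exception."""
    clean, exceptions = [], []
    for row in rows:
        reason = check_row(row)
        if reason is None:
            clean.append(row)
        else:
            exceptions.append({**row, "reason": reason})
    return clean, exceptions


# Made-up sample rows, not real data.
rows = [
    {"material": "M-100", "storage_location": "A1", "quantity": "40"},
    {"material": " ", "storage_location": "A1", "quantity": "5"},
    {"material": "M-200", "storage_location": "B2", "quantity": "12,5"},
    {"material": "M-300", "storage_location": "B2", "quantity": "-3"},
]

clean, exceptions = split_rows(rows)
reasons = Counter(e["reason"] for e in exceptions)
exception_share = len(exceptions) / len(rows)
```

Keeping the rejected rows together with a reason, instead of silently dropping them, gives you something to take back to the owners of the source system.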
Now, as for modularity (thanks for elaborating!): partly because users want to analyze on different aggregation levels, we like to take the lowest possible one. It is the so-called “operative level”, the level of detail required for an operative employee to take action, e.g. material & storage location. In the dashboard we then visualize the data in a way that tells the user a story: from the top-aggregated level, via some drill-downs, down to the most detailed one (pre-sorted in order of priority for action-taking). This way users can decide which level they want to stay on. But the data in the background always has the lowest aggregation level necessary.
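The principle of storing data at the operative level and aggregating only for display can be sketched like this in Python. The records, the extra `plant` column, the `open_qty` measure and the helper `roll_up` are made up for the example; only material and storage location come from the answer above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Data is kept at the lowest level needed for action (made-up sample records).
records = [
    {"plant": "P1", "material": "M-100", "storage_location": "A1", "open_qty": 40},
    {"plant": "P1", "material": "M-100", "storage_location": "B2", "open_qty": 10},
    {"plant": "P1", "material": "M-200", "storage_location": "A1", "open_qty": 5},
    {"plant": "P2", "material": "M-100", "storage_location": "C3", "open_qty": 7},
]


def roll_up(rows: List[dict], keys: List[str], value: str = "open_qty") -> Dict[Tuple, int]:
    """Sum `value` over the given keys; fewer keys means a higher aggregation level."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[k] for k in keys)] += row[value]
    return dict(totals)


# Each view is derived from the same detailed records.
by_plant = roll_up(records, ["plant"])
by_material = roll_up(records, ["plant", "material"])
operative = roll_up(records, ["plant", "material", "storage_location"])

# Detail view pre-sorted by priority; here simply the largest open quantity first.
priority = sorted(records, key=lambda r: r["open_qty"], reverse=True)
```

Because every view comes from the same detailed rows, a new aggregation level requested by a user is just another list of keys and needs no change to the stored data.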
Q: Did you implement LLMs (e.g. classification of specific data)?
A: Not yet in the supply chain. We still have loads of quantitative data to sift through and predictions to make based on it. But I did find LLMs useful for explaining some supply-chain-specific terms to non-specialists: “Please explain the difference between safety stock, safety time and reorder point like you would to a four year old.”