Is it somehow possible to extract the workflow information and export it as text?
For example, I have a simple workflow where the nodes have the following names: Read Database DB1 -> Get Joined tables DB1_Table1 and DB1_Table3 -> Filter out missing values for column DB1_Table1_Column4 -> Write to CSV file file1.csv
Then I would like to get somehow all these node names in the order of execution, so I have this "metadata" available as text, to use it as metadata on the extracted data, to know how this data has been analysed as a reference in the future...
I don't know if I explained it well enough, but I just want to get the complete workflow in human readable text...
(I'm a simple KNIME 3.1 user.) I'm not sure whether there's an option that prints all the node metadata of a workflow in a simple, human-readable way; I haven't seen one. It could be an interesting future feature for the KNIME developers, I think...
Anyway, you could try this:
When a workflow is created (local project), all its metadata and settings are stored in a physical directory of the KNIME local repository (the initial path you choose when you start KNIME). There you can see all your project folders and subfolders, with the node names and their order numbers. You can read the order off the subfolder names, but note that this number is written into the folder name only once, when you drag and drop the node, so first check in the workflow that the numbers really match the execution order (they can differ if you added or deleted nodes while working on the project!).
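If you prefer a script to reading the folder names by hand, here is a minimal Python sketch. It assumes each node folder is named like "File Reader (#1)", i.e. the node name followed by its number in parentheses; that pattern and the helper names `parse_node_folder` and `list_nodes` are my own assumptions, so check them against your own workflow directory.

```python
import re
from pathlib import Path

# Assumed folder naming: "<node name> (#<number>)", e.g. "File Reader (#1)".
NODE_DIR = re.compile(r"^(?P<name>.+) \(#(?P<id>\d+)\)$")


def parse_node_folder(folder_name):
    """Return (number, name) for a node folder name, or None if it does not match."""
    match = NODE_DIR.match(folder_name)
    if match is None:
        return None
    return int(match.group("id")), match.group("name")


def list_nodes(workflow_dir):
    """Return (number, name) for each node folder directly under workflow_dir, sorted by number."""
    found = []
    for child in Path(workflow_dir).iterdir():
        if child.is_dir():
            parsed = parse_node_folder(child.name)
            if parsed is not None:
                found.append(parsed)
    return sorted(found)
```

Calling `list_nodes` on the workflow's folder gives the nodes sorted by their number. As said above, that number reflects when the node was added, so it is not guaranteed to be the execution order.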
There's also a file named "workflow.knime". You can copy it and open the copy with a text editor (e.g. Notepad++) to see its content: you will find the whole sequence of execution and all the node names under the tag
<config key="node_#"> (where # is the number of execution)
Sorry, it's code, so this information has to be extracted from the file and re-edited to be "human". To do so, you can import it and use the text and string-manipulation nodes...
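Outside KNIME, the same extraction can be sketched in a few lines of Python. Only the `<config key="node_#">` tag comes from the file as described above; the `<entry>` children, their `id` and `node_settings_file` keys, and the sample content are assumptions made for illustration, so compare them with your own workflow.knime first.

```python
import re
import xml.etree.ElementTree as ET

# Made-up sample in the ASSUMED layout of workflow.knime (not copied from a real file).
SAMPLE = """<config key="workflow.knime">
  <config key="nodes">
    <config key="node_1">
      <entry key="id" type="xint" value="1"/>
      <entry key="node_settings_file" type="xstring" value="File Reader (#1)/settings.xml"/>
    </config>
    <config key="node_2">
      <entry key="id" type="xint" value="2"/>
      <entry key="node_settings_file" type="xstring" value="Row Filter (#2)/settings.xml"/>
    </config>
  </config>
</config>"""


def local(tag):
    """Drop an XML namespace prefix such as '{http://...}config'."""
    return tag.rsplit("}", 1)[-1]


def node_entries(xml_text):
    """Return one dict (entry key -> value) per <config key="node_#"> element."""
    root = ET.fromstring(xml_text)
    nodes = []
    for elem in root.iter():
        if local(elem.tag) == "config" and re.fullmatch(r"node_\d+", elem.get("key", "")):
            entries = {child.get("key"): child.get("value")
                       for child in elem if local(child.tag) == "entry"}
            nodes.append(entries)
    return nodes


for entries in node_entries(SAMPLE):
    print(entries["id"], entries["node_settings_file"])
```

For a real workflow, pass the content of a copy of workflow.knime instead of `SAMPLE`. The `local` helper is there because the real file may declare an XML namespace on its root element.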
the node description (if present) is under the code tag:
and you can find it in the file "settings.xml" in the specific node folder.
This is what I know for now,
hope it can help
Just another suggestion:
There's a node called "Timer Info" in the Workflow/automation section. You can run it at the end of your workflow and see in its output table the list of all node names, IDs, execution times, etc.
Emas' suggestions are actually perfect. Just one hint: the file "workflow.knime" is an XML file. You can parse it using the XPath nodes and afterwards build a network using the KNIME network plugin.
This would be a nice example workflow, in case you want to share it with the community.
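To show the idea of turning the XML into a network, here is a rough Python sketch (needs Python 3.9+ for `graphlib`). It assumes the connections are stored as `<config key="connection_#">` elements with `sourceID` and `destID` entries; that layout, the sample XML, and the function names are assumptions for illustration, not something verified in this thread. Sorting the resulting graph topologically gives one valid order of execution.

```python
import re
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter

# Made-up sample; the connection_# / sourceID / destID layout is an ASSUMPTION.
SAMPLE = """<config key="workflow.knime">
  <config key="connections">
    <config key="connection_0">
      <entry key="sourceID" type="xint" value="1"/>
      <entry key="destID" type="xint" value="2"/>
    </config>
    <config key="connection_1">
      <entry key="sourceID" type="xint" value="2"/>
      <entry key="destID" type="xint" value="3"/>
    </config>
  </config>
</config>"""


def local(tag):
    """Drop an XML namespace prefix such as '{http://...}config'."""
    return tag.rsplit("}", 1)[-1]


def read_edges(xml_text):
    """Return (source_id, dest_id) pairs, one per connection_# element."""
    root = ET.fromstring(xml_text)
    edges = []
    for elem in root.iter():
        if local(elem.tag) == "config" and re.fullmatch(r"connection_\d+", elem.get("key", "")):
            values = {child.get("key"): child.get("value") for child in elem}
            edges.append((int(values["sourceID"]), int(values["destID"])))
    return edges


def execution_order(edges):
    """Return node IDs so that every node comes after all nodes feeding into it."""
    sorter = TopologicalSorter()
    for source, dest in edges:
        sorter.add(dest, source)  # dest depends on source
    return list(sorter.static_order())


print(execution_order(read_edges(SAMPLE)))
```

For a branching workflow there is more than one valid order, so the result is one possible sequence rather than the exact order KNIME used.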
Thanks for the answers!
I have been testing Emas' suggestion:
- It does indeed show all the nodes, but not how they are interconnected, so you don't see the order of execution.
- It shows the default name (like "File Reader") instead of the custom name given (like "read in file file1.csv").
- It does not show the nodes within meta nodes; it just gives the name of the meta node...
So I will go and test the workflow.knime way, using XPath...
I will come back if I have found a nice working solution and share it...
On second thought, I think it's too complex for me to get the workflow out of the XML file... I think I will stick to the workflow SVG image, and use that as a way to see how the data was analyzed...
You could go as far as to transform the XML into a JSON node-link structure and to build a D3 force graph to illustrate the network.
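As a sketch of what such a node-link structure could look like, here it is in Python. The node names are taken from the example workflow in the first post; the numeric IDs, the edge list, and the `to_node_link` helper are made up for illustration. In practice they would come from the parsed workflow.knime.

```python
import json


def to_node_link(nodes, edges):
    """Build a node-link dict from nodes (id -> label) and edges ((source_id, dest_id) pairs)."""
    return {
        "nodes": [{"id": node_id, "name": label} for node_id, label in sorted(nodes.items())],
        "links": [{"source": source, "target": dest} for source, dest in edges],
    }


# Hypothetical IDs for the nodes of the example workflow.
nodes = {
    1: "Read Database DB1",
    2: "Get Joined tables DB1_Table1 and DB1_Table3",
    3: "Filter out missing values for column DB1_Table1_Column4",
    4: "Write to CSV file file1.csv",
}
edges = [(1, 2), (2, 3), (3, 4)]

print(json.dumps(to_node_link(nodes, edges), indent=2))
```

The "nodes"/"links" layout with "source" and "target" is the shape commonly fed to a D3 force layout, where the link force can be given an id accessor so that links refer to nodes by "id".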
Geo, that sounds already too complicated for me ;-)
If I have some spare time, I will try to look at what you suggest...
I fully agree. After all, KNIME already provides all the visualisation that you need.
If there are still users interested in this “whole workflow information”, they might find the Workflow Summary feature, available from version 4.2.0, useful.