Calculating network distance by using edges and shortest path node

Taita · August 30, 2016, 11:49am

I'm exploring the opportunities of the use of Networks in Knime to calculate the distance between nodes. I created a test-workflow with nodes and edges and added the distance as a weight in the object inserter by using the weight column. I can see this information if I view the network analyzer - edge statistics table. I added the shortest path node to find the shortest path between two nodes. I can visualize the shortest path and review the result via network to row node. But unfortunately the shortest path node seems not to take the weights of the edges into account. The result is always the number of nodes and edges not the weight of the edges. This is independent of the weight policy in the shortest path node. Do I make use of the right nodes to calculate the weighted distance between nodes or do I something wrong?

Taita · September 3, 2016, 11:05pm

Is my question unclear or am I just impatient?

marco_ghislanzoni · September 5, 2016, 8:42am

If you could share your test workflow it would be much easier to have a look at the potential issues and offer some suggestions.

Cheers,
Marco.

Taita · September 5, 2016, 7:27pm

Ok, I enclosed the workflow and the table. Hopefully this makes clear what I want.

marco_ghislanzoni · September 6, 2016, 9:17am

Hi,

I looked at your workflow and went back to your original post above. I am not 100% sure I fully understand what your issue is, but if it is the fact that the weights are missing within the Network Viewer after you apply the Shortest Path node, this should be solved by simply ticking the "Copy features" option within the node configuration dialogue. Or is it rather a calculation issue you are referring to?

Cheers,
Marco.

Taita · September 8, 2016, 8:15am

Hi Marco,

Thanks for examining my flow. The issue is this. I want to use the network functionality to determine the shortest distance between nodes based on the weights in the edges. Background is that I use the network as ontology. This distance will be calculated for various words in a document by looping. I expected the shortest path node to calculate with the weight of the edges. But it seems not to use this weight. The problem is not the visualization

If you take another example for instance towns as nodes and edges with the distance between the nodes. How do you calculate the shortest path between two towns?

Cheers, Taita

marco_ghislanzoni · September 8, 2016, 10:31am

Hi Taita,

I had a closer look at your network. I think the reason there is no difference between a shortest path calculation with or without the weights it's due entirely to its (star) topology. You have a central node, Functie, which is connected with weight 1.0 to five other nodes (let's call them secondary). These nodes in turns have a number of terminal nodes each one with an own connection weight to their parent.

Since in the bottom part of your workflow you are always looking for the shortest distance between a terminal node and "project manager", which is one of the five secondary nodes, there are basically no alternative paths to chose from, regardless of the weights. The only possible path always goes from the terminal node to its parent (secondary node), then to Functie and finally to "project manager".

In other words, there is always one and only one path connecting your terminal nodes to the "project manager" node, hence regardless of the weights that would be the only result you get as shortest path. It's like asking your GPS to calculate a route between two cities when only one road exists between them. You will always have to go through the road!

I did some tests with a different network, where alternative paths exist, and I can confirm that the weights are properly taken into account regardless of the Copy features parameter. The only slight misleading piece of information if you wish is in the output window of the Shortest Path node. If Copy features is left un-ticked, it shows weighted: false, which has to be interpreted in relation to the output network (aka shortest path) not having inherited its weights from the original one and not to the fact that no weights were applied to the calculation of the shortest distance.

Hope this makes sense.

Cheers,
Marco.

Ergonomist · September 8, 2016, 1:48pm

Marco,

I applaud you for your committment to this community!

-E

marco_ghislanzoni · September 8, 2016, 2:43pm

Thanks Ergonomist!

Helping out others is a great way to test your own knowledge, stay sharp and ultimately learn more. And once you stop learning you start dying, said once a wise man... :-)

Cheers,
Marco.

Taita · September 9, 2016, 11:37am

Hi Marco,

I am very happy with your contributions and the community too. I realize I dont have the most easiest questions and sometimes act on the edges of the possiblities of this excellent peace of tooling. As I dont have more experienced Knime users in my surroudingthis forum for me is the only possiblity to make steps when I am fixed with my solution.

Regarding your suggestions I will try a flow with alternatives to the shortest path and enter high weights on the edges of these alternatives. I keep you updated. Besides I expected to have made a tree topology, but apparently there is no difference between this and a star?

Thanks, Taita

marco_ghislanzoni · September 9, 2016, 12:27pm

Hi Taita,

no worries, I am glad to be able to help you as much as possible. :-)

Back to your network, playing with the weights but leaving the topology unchanged will have no effect on the shortest path determination because you will still have one and only one path between your test nodes, no matter what.

Also switching to a tree topology is not going to help much. In fact I named your network a star network because this is the way KNIME graphically represents it, but more appropriately I should have called it a tree network. A star network is actually a special case of tree network where only one node (the center of the star) has degree > 1 while all the others have degree 1. This would be the case with your network if you would limit it at the central node plus the 5 subnodes children of Functie.

Regardless of the nomenclature, if you want the Shortest Path algorithm to work as expected, you need to introduce one or more cycles in your network This is like asking your GPS to bring you from city A to city B where some alternatives routes exist, say one going through city B1 and another one going through city B2. The weights are telling which one is shorter or faster or more ecological. Try experimenting with this simple mesh network:

Node1	Node2	Weight
A	B1	0.5
A	B2	1.0
B1	C	1.0
B2	C	1.0

This network contains one cycle. If you calculate the shortest path from A to C using the weights, you should find that it goes through B1 because overall A --> B1 --> C has a lower weight than A --> B2 --> C. Now invert the weights of A --> B1 and A --> B2 and recalculate the shortest path. This time it should go through A --> B2. Now switch off the use of the weights in the Shortest Path node (set it to None). Now the two paths from A to C, the one going through B1 and the one going through B2, are absolutely equivalent and you may get one or the other depending on how the algorithm is forking at each run (or you may always get the same one depending on the implementation).

You can play with this network by adding intermediate nodes or alternative paths from A to C and checking how considering or not considering the weights is affecting the shortest path calculation.

Have fun! ;-)

Cheers,
Marco.

Taita · September 12, 2016, 8:02am

Hi Marco,

Thanks for your extended post. I tried your network with my nodes. Unfortunately I see the same behaviour:

The shortest path node determines the right path and takes care of the weights of the edges. But in the output I I see the sum of the nodes involved and not the sum of the distance of the edges I expected. Perhaps I expect something of the network that is not supported by the network method: is it possible to calculate the sum of the edges of a calculated shortest path?

Cheers,

Taita

Taita · September 28, 2016, 1:32pm

Unfortunately I haven't received a reply on this. Is my finding a limitation in the shortest path node or can the behaviour be considered as a bug?

Calculating with distances by using edges is important for me for several use cases related to NLP.

Additional question: Some of the use cases make use of the random walk model. I cant find a node with this name or are other ways possible to mimic this functioality?