Help with hierarchical clustering

hotadnama · June 29, 2021, 5:21am

Hi all, how can I cluster correlations to achieve a quasi-diagonalisation, i.e. apply a seriation algorithm?
Matrix seriation is a long-established statistical technique that rearranges data so that its inherent clusters show clearly. Using hierarchical clustering, we rearrange the rows and columns of the covariance matrix of stocks so that similar investments (variables) are placed together and dissimilar investments are placed far apart. This reorders the original covariance matrix so that the larger covariances lie along the diagonal and the smaller ones lie further away from it. Because the off-diagonal elements are not exactly zero, the result is only quasi-diagonal. Does anyone know which node I can use?

Any help would be appreciated!

aworker · June 29, 2021, 8:24am

Hi @hotadnama

Interesting problem. I may have a workflow that achieves this, but first I need to be sure that we are talking about the same problem. Could you please post a link to a reference or paper that explains the methodology you mean (quasi-diagonalisation, seriation algorithm and matrix seriation)?

Thanks & regards,

Ael

hotadnama · June 30, 2021, 2:49am

Hi there @aworker, I referred to this link: The Hierarchical Risk Parity Algorithm: An Introduction - Hudson & Thames (hudsonthames.org)

thanks so much!

mauuuuu5 · June 30, 2021, 3:14am

Hi @hotadnama, does this Python implementation refer to the same algorithm?

hotadnama · June 30, 2021, 4:54am

Hi there @mauuuuu5, I'm not too sure because I'm completely new to KNIME and the data thing.

aworker · June 30, 2021, 8:14am

Hi @hotadnama

Many thanks for the nice reference to the Hierarchical Risk Parity (HRP) methodology and library.

The good news is that the academics who maintain this website did a great job implementing HRP in Python.

As mentioned by @mauuuuu5, the reference you posted includes a Python library usable off the shelf, which should let you run the Hierarchical Risk Parity methodology on your own data. Since the methodology consists of several complex steps, the best approach is to use their Python code rather than implement it from scratch in KNIME. They even provide a Jupyter notebook tutorial for testing it with their data. You could therefore install an Anaconda environment with Jupyter, download their library and notebook, and have a go at it.

Once you have done this, you could try to integrate it as a Python library in a KNIME Python Script node, test it with their data and eventually set up a KNIME workflow with your own data. You’ll then have the “best” of both worlds, KNIME & Jupyter.
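To give an idea of what the quasi-diagonalisation step alone involves, here is a minimal sketch that uses plain SciPy rather than the Hudson & Thames library. It is a simplified illustration, not their implementation: the function name `quasi_diagonalise`, the synthetic returns, the single-linkage choice and the distance formula `sqrt(0.5 * (1 - rho))` are all my assumptions, and the full HRP method has further steps that are not shown.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def quasi_diagonalise(corr: pd.DataFrame, method: str = "single") -> pd.DataFrame:
    """Reorder a correlation matrix so that similar assets sit next to each other.

    Hypothetical helper for illustration only.
    """
    # Assumed correlation-based distance: highly correlated assets are "close".
    dist = np.sqrt(np.clip(0.5 * (1.0 - corr.to_numpy()), 0.0, None))
    np.fill_diagonal(dist, 0.0)

    # SciPy's linkage expects the condensed (upper-triangle) form of the distances.
    condensed = squareform(dist, checks=False)
    tree = linkage(condensed, method=method)

    # The left-to-right leaf order of the dendrogram is the seriation order.
    order = leaves_list(tree)
    labels = corr.index[order]
    return corr.loc[labels, labels]


# Synthetic returns (made up): A1/A2 share one factor, B1/B2 share another.
rng = np.random.default_rng(0)
factor_a = rng.normal(size=500)
factor_b = rng.normal(size=500)
returns = pd.DataFrame({
    "A1": factor_a + 0.3 * rng.normal(size=500),
    "B1": factor_b + 0.3 * rng.normal(size=500),
    "A2": factor_a + 0.3 * rng.normal(size=500),
    "B2": factor_b + 0.3 * rng.normal(size=500),
})

ordered = quasi_diagonalise(returns.corr())
print(ordered.round(2))
```

After reordering, the two A assets and the two B assets end up adjacent, so the large correlations gather in blocks along the diagonal. Inside a KNIME Python Script node you would replace the synthetic `returns` with the node's input table and hand `ordered` back as the output table; the exact variable names depend on your KNIME version, so please check the node description.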

So far, so good, we could say. The only downside I see is that you said in your last post that you are completely new to KNIME and to the data thing (mining?) and, as you can probably tell by now, this is a reasonably involved task.

Maybe people on the KNIME forum who also work in the field of investment management will be willing to get involved in this task, which I really believe is worthwhile.

Any takers to help @hotadnama integrate the Hierarchical Risk Parity (HRP) methodology provided here into a KNIME Python node?

Best wishes,

Ael
