Creating a co-occurrence matrix from a Term co-occurrence counter

Dear all, I'm a new user of this fantastic piece of software. I have been using it for a few days and have already done some interesting stuff. But right now I am stuck: I am trying to build a co-occurrence matrix visualization from a table generated by a "Term co-occurrence counter".

Each row of the co-occurrence table has "Term1" and "Term2" columns (of Term type) and one "Document co-occurrence" column (integer type), which indicates the number of times Term1 co-occurs with Term2 in the document.

Question: how can I generate a matrix view with all the terms listed on both the x and y axes, like this example?

http://dh2016.adho.org/static/data/673/image10.jpg

Thank you for your time!

Bruno.

Hi, I found a way to calculate a co-occurrence matrix, but it uses R and then plots a heatmap. This means you have to "link" KNIME with R, so you have to go to File/Preferences/R....

Let me know if it works for you.

Best Regards

Thank you mauuuuu5! After installing R, XQuartz, and all dependencies, I got your workflow working, and it's very handy!

EDIT: I think something may not be working: in the image output of your workflow I am getting "AA AB AC AD AE" on both the X and Y axes.

Still, I can't directly use the output of my co-occurrence table, since I need to take the number of co-occurrences into account and add rows to my table as necessary. If I am understanding the algorithm correctly, that means converting this:

rowID term1 term2 sentence co-occurrence
row0 AA BB 4
row1 BB CC 2
row2 AA CC 2

into something like this:

rowID term1 term2
row0 AA BB
row1 AA BB
row2 AA BB
row3 AA BB
row4 BB CC
row5 BB CC
row6 AA CC
row7 AA CC

I guess I will need to do column/row filtering, grouping, and pivoting, right? Thanks for the help!
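
For readers who want to check the conversion outside KNIME, here is a minimal sketch in plain Python. It is not taken from the attached workflow; the list `counted` and the function name `expand_pairs` are made up for illustration, and the data is just the three example rows above. It repeats each term pair as many times as its co-occurrence count says.

```python
from itertools import chain, repeat

# Hypothetical in-memory copy of the counted table:
# (term1, term2, sentence co-occurrence)
counted = [("AA", "BB", 4), ("BB", "CC", 2), ("AA", "CC", 2)]


def expand_pairs(rows):
    """Repeat each (term1, term2) pair `count` times, keeping input order."""
    return list(
        chain.from_iterable(repeat((t1, t2), n) for t1, t2, n in rows)
    )


expanded = expand_pairs(counted)

# Print in the same layout as the second table (rowID term1 term2).
for i, (t1, t2) in enumerate(expanded):
    print(f"row{i} {t1} {t2}")
```

With the example data this produces eight rows, row0 to row7, in the same order as the second table.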

Hi, I am glad that the workflow worked, and yes, the data should be laid out like the second table.

Best Regards

Something must be wrong. Only variables from the "A" column (AA, AB, AC, AD, and AE) are displayed in the matrix. Variables from the "B" column are not appearing ...
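
One possible cause, and this is only a guess since the workflow's R code is not shown here, is that the axis labels are built from the first term column alone. Below is a minimal plain-Python sketch of a matrix whose axes use the union of both columns. The names `counted` and `cooccurrence_matrix` are hypothetical, the data is the small AA/BB/CC example from earlier in the thread, and treating co-occurrence as symmetric is an assumption.

```python
# Hypothetical counted table: (term1, term2, co-occurrence count)
counted = [("AA", "BB", 4), ("BB", "CC", 2), ("AA", "CC", 2)]


def cooccurrence_matrix(rows):
    """Build a symmetric matrix over the union of terms from both columns."""
    # Collect labels from term1 AND term2, so no term is left off the axes.
    terms = sorted({t for t1, t2, _ in rows for t in (t1, t2)})
    index = {t: i for i, t in enumerate(terms)}
    matrix = [[0] * len(terms) for _ in terms]
    for t1, t2, n in rows:
        i, j = index[t1], index[t2]
        matrix[i][j] += n
        if i != j:
            # Assumption: "AA with BB" counts the same as "BB with AA".
            matrix[j][i] += n
    return terms, matrix


terms, matrix = cooccurrence_matrix(counted)

# Print a simple text grid with the same labels on both axes.
print("   " + " ".join(f"{t:>3}" for t in terms))
for t, row in zip(terms, matrix):
    print(f"{t:>3}" + " " + " ".join(f"{v:>3}" for v in row))
```

If the labels on the heatmap still come only from one column after a step like this, the problem is more likely in how the plot is drawn than in the table itself.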