Discover correlations between different data sets?

I am currently running 100+ SQL queries which all use the same 'identifier' (ID) column.

I am hoping to find out whether the result sets of these queries are related to each other, in particular:

  • If any data sets are subsets of other data sets
  • Which data sets are highly correlated

All the queries have the identifier column ("ID") in column 1.
Alternatively, it is easy for me to join all the results together into one table organised like this:

Rule1    ABC
Rule2    ABC
Rule2    DDD
Rule2    ZZZ
Rule3    DDD
Rule3    ZZZ

Is there a node that takes this kind of input and returns results such as:

         Rule1    Rule2    Rule3
Rule1    -        50%      0%
Rule2    100%     -        100%
Rule3    0%       100%     -

And, if possible, something that compares more than two data sets at once:

Match    Rules
x%       1, 2, 3

Currently, I'm using something like the following, but I don't understand what 'SUPPORT' means in the last output table, and I'm not sure whether it gives me a correlation value as well.

http://i.imgur.com/3Rd0R.jpg

You might want to look into the Scorer node, which provides a confusion matrix, and, for your second problem, the Set Operator node. The SUPPORT returned by the Association Rule Learner is the number or percentage of transactions (rows) that contain a particular itemset. It is a frequency measure, not a correlation coefficient.
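
To make the two ideas concrete, here is a minimal sketch in plain Python. It is not a KNIME node, and the function names `ids_by_rule`, `containment`, and `support` are made up for illustration. It assumes the joined (rule, ID) table from the question, treats each ID as a transaction and each rule as an item, and computes a pairwise overlap matrix plus the support of any group of rules.

```python
from collections import defaultdict
from itertools import combinations

# Sample rows in the (rule, ID) layout shown in the question.
rows = [
    ("Rule1", "ABC"),
    ("Rule2", "ABC"),
    ("Rule2", "DDD"),
    ("Rule2", "ZZZ"),
    ("Rule3", "DDD"),
    ("Rule3", "ZZZ"),
]


def ids_by_rule(rows):
    """Collect the set of IDs returned by each rule (hypothetical helper)."""
    groups = defaultdict(set)
    for rule, id_ in rows:
        groups[rule].add(id_)
    return dict(groups)


def containment(groups):
    """containment[a][b] = share of a's IDs that also occur in b.

    A value of 1.0 means a is a subset of b.
    """
    return {
        a: {
            b: len(groups[a] & groups[b]) / len(groups[a])
            for b in groups
            if b != a
        }
        for a in groups
    }


def support(groups, rules):
    """Share of all distinct IDs that occur in every one of the given rules."""
    all_ids = set().union(*groups.values())
    common = set.intersection(*(groups[r] for r in rules))
    return len(common) / len(all_ids)


groups = ids_by_rule(rows)
matrix = containment(groups)

# Pairs (a, b) where every ID of a also appears in b.
subsets = [(a, b) for a in matrix for b in matrix[a] if matrix[a][b] == 1.0]
print(subsets)

# Support for every pair and for all three rules together.
for size in (2, 3):
    for combo in combinations(sorted(groups), size):
        print(combo, support(groups, combo))
```

With the sample data, Rule1 has a containment of 1.0 in Rule2, so it is reported as a subset, and the three rules share no ID, so their joint support is 0. Note that this containment measure is directional and is only one possible reading of the percentage matrix in the question.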