I am currently running 100+ SQL queries which all use the same 'identifier' (ID) column.
I am hoping to find out whether the result sets of these queries are correlated, in particular:
- whether any data sets are subsets of other data sets
- which data sets are highly correlated
All the queries have the identifier column ("ID") in column 1.
Alternatively, it is easy for me to combine all the results into one table, with the rule name in the first column and the ID in the second, organised like this:
Rule     ID
Rule1    ABC
Rule2    ABC
Rule2    DDD
Rule2    ZZZ
Rule3    DDD
Rule3    ZZZ
Is there a node that can take input in either of these forms and return results such as:
         Rule1    Rule2    Rule3
Rule1    -        50%      0%
Rule2    100%     -        100%
Rule3    0%       100%     -
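To make the kind of matrix I mean more concrete, here is a minimal sketch in plain Python (outside of any node), using the example rows from the joined table. The function names are my own and hypothetical, and the definition of the percentage is an assumption: each cell is the share of the row rule's IDs that also appear in the column rule, so 100% means the row rule is a subset of the column rule. The percentages in my example matrix are only illustrative, so this definition will not necessarily reproduce them. A symmetric Jaccard measure is included as one possible "correlation" value.

```python
from collections import defaultdict

# Long-format (rule, ID) rows, as in the joined table.
rows = [
    ("Rule1", "ABC"),
    ("Rule2", "ABC"),
    ("Rule2", "DDD"),
    ("Rule2", "ZZZ"),
    ("Rule3", "DDD"),
    ("Rule3", "ZZZ"),
]

def ids_by_rule(rows):
    """Group the IDs into one set per rule."""
    sets = defaultdict(set)
    for rule, id_ in rows:
        sets[rule].add(id_)
    return dict(sets)

def containment(a, b):
    """Share of a's IDs that also appear in b (1.0 means a is a subset of b)."""
    return len(a & b) / len(a) if a else 0.0

def jaccard(a, b):
    """Shared IDs divided by all IDs in either set (symmetric measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def overlap_matrix(sets, measure):
    """Nested dict: matrix[row_rule][col_rule] = measure(row IDs, col IDs)."""
    rules = sorted(sets)
    return {r: {c: measure(sets[r], sets[c]) for c in rules if c != r}
            for r in rules}

sets = ids_by_rule(rows)
matrix = overlap_matrix(sets, containment)
for rule, row in matrix.items():
    print(rule, {col: f"{value:.0%}" for col, value in row.items()})
```

Passing `jaccard` instead of `containment` to `overlap_matrix` gives a symmetric matrix, which may be closer to a correlation value than to a subset test.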
And, if possible, something that compares more than two data sets at once:
Match    Rules
x%       1, 2, 3
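As a rough sketch of the multi-set comparison I have in mind, again in plain Python with hypothetical names: here I assume "match" means the number of IDs shared by every rule in a group, divided by the number of distinct IDs across the group. Other definitions are possible, and the data is just the three example rules.

```python
from itertools import combinations

# One set of IDs per rule (the example data from the joined table).
rule_ids = {
    "Rule1": {"ABC"},
    "Rule2": {"ABC", "DDD", "ZZZ"},
    "Rule3": {"DDD", "ZZZ"},
}

def group_match(rule_ids, group):
    """IDs present in every rule of the group, as a share of all IDs in the group."""
    sets = [rule_ids[rule] for rule in group]
    union = set.union(*sets)
    common = set.intersection(*sets)
    return len(common) / len(union) if union else 0.0

def all_group_matches(rule_ids, min_size=2):
    """Score every combination of at least min_size rules, best match first."""
    rules = sorted(rule_ids)
    results = []
    for size in range(min_size, len(rules) + 1):
        for group in combinations(rules, size):
            results.append((group_match(rule_ids, group), group))
    return sorted(results, reverse=True)

results = all_group_matches(rule_ids)
for match, group in results:
    print(f"{match:.0%}", ", ".join(group))
```

With 100+ rules the number of combinations grows very quickly, so in practice the group size would have to be capped or the candidate groups pruned first.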
Currently, I'm using something like the following, but I don't understand what 'SUPPORT' means in the last output table, and I'm not sure whether it gives me a correlation value as well.