Matching presence of particular columns and a row value



I have a table with m columns and n rows. All the values are 1. This indicates that the entities representing the rows are positive for all columns (representing descriptors). All columns from the original table that had 0's for all rows or a mix of 0's and 1's have been filtered out.

The names of the remaining columns thus represent a consensus signature, e.g.:

      C1   C12   C23   C24   C51
R1     1     1     1     1     1
R2     1     1     1     1     1
R3     1     1     1     1     1


Test table

      C1   C2   C3   C4 ...
R1     1    1    0    0
R2     1    1    1    1
R3     1    0    0    1

A second "test" table contains a superset of the columns of the consensus table and a certain number of rows. Its values are a mix of 1 and 0.

I want to keep those rows of the test table in which every column whose name also appears in the train/consensus table has the value 1. So in the above example the output table would contain R1 and R2, provided the other column names and values match. Columns absent from the train/consensus table are not used for matching.

Doing this with a Reference Row Filter for each column in turn seems tedious! Thanks in advance!
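For readers who want to check the intended logic outside KNIME, the filter described above can be sketched in a few lines of Python. This is only an illustration, not a KNIME workflow: the consensus signature is shortened to a hypothetical `["C1", "C2"]` so that it overlaps with the columns visible in the test table, and the names `test_table`, `consensus_columns` and `rows_matching_signature` are made up for the example.

```python
# Hypothetical consensus signature, shortened to columns that are
# visible in the example test table (the real one is C1, C12, C23, ...).
consensus_columns = ["C1", "C2"]

# The example test table: row ID -> {column name: 0 or 1}.
test_table = {
    "R1": {"C1": 1, "C2": 1, "C3": 0, "C4": 0},
    "R2": {"C1": 1, "C2": 1, "C3": 1, "C4": 1},
    "R3": {"C1": 1, "C2": 0, "C3": 0, "C4": 1},
}


def rows_matching_signature(table, signature):
    """Return the IDs of rows that have a 1 in every signature column.

    Columns outside the signature are ignored; a signature column that is
    missing from a row is treated as 0 (an assumption of this sketch).
    """
    return [
        row_id
        for row_id, values in table.items()
        if all(values.get(col, 0) == 1 for col in signature)
    ]


matching = rows_matching_signature(test_table, consensus_columns)
print(matching)  # ['R1', 'R2']
```

With this shortened signature, R3 is dropped because its C2 value is 0, while C3 and C4 play no part in the decision.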

I assume you can use the Subset Matcher after translating your rows into itemsets (R1: C1, C2; R2: C1, C2, C3, C4; R3: C1, C4) and checking them against your original (itemset) table.
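The itemset idea can be illustrated in plain Python as a rough sketch. It is not how the Subset Matcher node is implemented, only the same subset test written out by hand. The consensus itemset `{"C1", "C2"}` is a hypothetical stand-in for the real signature, and `to_itemset`, `itemsets` and `missing_items` are names invented for this example.

```python
def to_itemset(row):
    """Translate a 0/1 row into the set of column names whose value is 1."""
    return frozenset(col for col, value in row.items() if value == 1)


# The example test table: row ID -> {column name: 0 or 1}.
rows = {
    "R1": {"C1": 1, "C2": 1, "C3": 0, "C4": 0},
    "R2": {"C1": 1, "C2": 1, "C3": 1, "C4": 1},
    "R3": {"C1": 1, "C2": 0, "C3": 0, "C4": 1},
}

# R1: {C1, C2}; R2: {C1, C2, C3, C4}; R3: {C1, C4}
itemsets = {row_id: to_itemset(row) for row_id, row in rows.items()}

# Hypothetical consensus itemset (stand-in for the real signature).
consensus_itemset = frozenset({"C1", "C2"})

# A row matches when the consensus itemset is a subset of the row's itemset.
subset_matches = [
    row_id for row_id, items in itemsets.items() if consensus_itemset <= items
]

# For non-matching rows, the set difference shows which items are missing.
missing_items = {
    row_id: consensus_itemset - items
    for row_id, items in itemsets.items()
    if not consensus_itemset <= items
}

print(subset_matches)  # ['R1', 'R2']
print(missing_items)   # {'R3': frozenset({'C2'})}
```

Working with sets also makes it easy to report which signature columns a rejected row lacks, rather than only a yes/no answer.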

Thanks gabriel,

I will try that. I could get a result by using the Column Combiner to merge all the columns into a single string column and then the Reference Row Filter to match on that combined string.
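As far as I can reconstruct it, the combined-string workaround amounts to something like the following Python sketch. This is one reading of the description above, not the behaviour of the KNIME nodes themselves: the signature `["C1", "C2"]`, the separator and the helper `combined_key` are all assumptions made for illustration.

```python
# Hypothetical consensus signature, shortened to columns in the test table.
signature_columns = ["C1", "C2"]

# The example test table: row ID -> {column name: 0 or 1}.
test_rows = {
    "R1": {"C1": 1, "C2": 1, "C3": 0, "C4": 0},
    "R2": {"C1": 1, "C2": 1, "C3": 1, "C4": 1},
    "R3": {"C1": 1, "C2": 0, "C3": 0, "C4": 1},
}


def combined_key(row, columns, sep=","):
    """Join the values of the given columns into one string, in column order.

    A column missing from the row is written as 0 (an assumption).
    """
    return sep.join(str(row.get(col, 0)) for col in columns)


# Every consensus row is all 1s, so there is a single reference string.
reference_key = combined_key({col: 1 for col in signature_columns}, signature_columns)

# Keep the test rows whose combined string equals the reference string.
kept_rows = [
    row_id
    for row_id, row in test_rows.items()
    if combined_key(row, signature_columns) == reference_key
]

print(reference_key)  # 1,1
print(kept_rows)      # ['R1', 'R2']
```

The string comparison only answers whether a row matches the whole signature, which is why the itemset approach may be more informative.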

But what you suggest could allow a more detailed comparison.