I've been trying to use Tanimoto distance with Neighborgrams without success. I've imported binary data and have a table with 835 columns of binary values and there are no other numerical columns in the table. If I set the Neighborgrams configuration to Tanimoto, I receive an error stating:
WARN Neighborgrams Configure failed: TANIMOTO: Invalid number of columns for distance calculation: 835
The icon for the Neighborgrams will not execute at this point. It does seem to work with Manhattan and Euclidean distances.
There is a node called "Universe Marker" in the same category as the Neighborgrams node. You need to put that node in front of the Neighborgrams node and configure it to contain a universe with a single column (which contains the binary data).
The configuration dialog of the neighborgram node will then parse the input spec and identify this single-column universe.
That didn't seem to work.
I added the Universe Marker node in front of the Neighborgrams node, and within the configure dialog I have a "Default Universe" to which I can add my 835 binary columns. The Neighborgrams node still complains with the same warning.
I tried converting the 835 columns to a Bit Vector (using the Bit Vector node), but the Neighborgrams node doesn't like the resulting Bit Vector column. In addition, converting to the Bit Vector loses my endpoint column.
That's interesting. First of all, you do need to create a single bit vector cell from your 835 columns. That single column makes up a universe (one column only!).
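To illustrate why Tanimoto wants one bit vector per row rather than 835 separate columns, here is a minimal Python sketch. It is not the KNIME implementation; the function names `pack_bits` and `tanimoto_distance` are hypothetical, and the toy rows have only four bits for readability.

```python
def pack_bits(row):
    """Pack a sequence of 0/1 values into one Python int used as a bit vector."""
    value = 0
    for position, bit in enumerate(row):
        if bit:
            value |= 1 << position
    return value


def tanimoto_distance(a, b):
    """Return 1 - |a AND b| / |a OR b| for two bit vectors stored as ints."""
    union = bin(a | b).count("1")
    if union == 0:
        # Assumption: two all-zero vectors are treated as identical.
        return 0.0
    shared = bin(a & b).count("1")
    return 1.0 - shared / union


# Two toy observations, each a row of binary columns (Bit1..Bit4).
row_a = [1, 1, 0, 1]
row_b = [1, 0, 0, 1]

vector_a = pack_bits(row_a)
vector_b = pack_bits(row_b)
distance = tanimoto_distance(vector_a, vector_b)
print(distance)
```

The distance is defined on the set bits of two vectors as a whole, which is why the node expects a single bit vector column per universe.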
What does "endpoint column" mean? Does the resulting bit vector column only contain 834 positions?
My main file has columns as:
Observation Name, Bit1, Bit2, ... Bit835, Activity.
Activity is described as Low, Medium, or High and that is what I'm interested in modeling. I refer to those types of things as "endpoints." If I run the data through the BitVector node, the Bits get compressed into a new BitVector column, but all the others disappear. At that point I have no way to evaluate the clusters resulting in the Neighborgrams for purity as there are no class labels (Activity of Low, Med, or High).
I understand. This is certainly unexpected behavior. I'll check if that's still present on the development branch and open a bug if necessary.
As a workaround, I suggest joining the output of the bit vector generator with the original input table and feeding the output of the joiner node to the Neighborgrams node (after processing it with the Universe Marker, of course).
Two more thoughts.
(1) If you do not select the "replace column(s)" option in the bitvector generator, it will retain all input columns.
(2) The bit vector node uses all columns (even non-numeric ones) and reserves a bit position for each. So the resulting bit vectors will have a length of 837 (835 bits plus 2 for Observation Name and Activity).
The bottom line is:
- Use a column filter to remove both columns "Observation Name" and "Activity", then use the bit vector generator to create bit vectors of the remaining 835 columns. Select the option "Replace Column(s)" in order to reduce the size of the output table.
- Join the result with the original input table and use the joined table for further analysis.
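Outside of KNIME, the same filter / generate / join steps can be sketched in plain Python. This is only an illustration under stated assumptions: the column names "Observation Name" and "Activity" come from the table described above, while the "BitVector" column name, the helper names, and the three-bit toy rows are hypothetical.

```python
def to_bit_string(row, bit_columns):
    """Collapse the selected 0/1 columns of one row into a single bit string."""
    return "".join("1" if row[column] else "0" for column in bit_columns)


def build_neighborgram_input(rows, id_column, label_column):
    """Keep the id and class label, and replace all bit columns by one vector."""
    prepared = []
    for row in rows:
        # "Column filter": everything except the id and label is a bit column.
        bit_columns = [c for c in row if c not in (id_column, label_column)]
        # "Bit vector generator" plus "join": one vector column next to the
        # original id and label, with the individual bit columns dropped.
        prepared.append({
            id_column: row[id_column],
            "BitVector": to_bit_string(row, bit_columns),
            label_column: row[label_column],
        })
    return prepared


table = [
    {"Observation Name": "obs1", "Bit1": 1, "Bit2": 0, "Bit3": 1, "Activity": "High"},
    {"Observation Name": "obs2", "Bit1": 0, "Bit2": 0, "Bit3": 1, "Activity": "Low"},
]

result = build_neighborgram_input(table, "Observation Name", "Activity")
for prepared_row in result:
    print(prepared_row)
```

The point of the sketch is that the class label travels alongside the bit vector, so the clusters can still be checked against Low, Medium, or High afterwards.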
Hope it helps. I've opened a bug report. Thanks for bringing it up!
Thank you for the quick replies!
That seems to have worked. The only addition I had to put in place was a Column Filter node after the join to strip out all the original bit-containing columns (Bit1, Bit2, Bit3, ... Bit835).
Neighborgrams set to Tanimoto seems happy.
One more very minor bug that I found is that the Hot Keys for unhiliting from the Neighborgram do not work. Specifically, on the menu you can choose Hilite->Hilite Selected or just press 'h'. This works. Another option is Hilite->Unhilite All which should respond to 'Ctrl-Alt-h', but I can't seem to get that to happen.
Cool. The Neighborgrams node has undergone major changes. We hope to be able to include them in the next public release (if not 2.0 then 2.1, I guess).
"One more very minor bug that I found is that the Hot Keys for unhiliting from the Neighborgram do not work."
Yepp. That's already fixed. Let's see if you find the other three remaining bugs :wink:
I have to use it EXTENSIVELY now. 8^)