Hi I'm new in using KNIME and analysing databases,
I'm trying to analyse some PCR in a relational Database and have a problem with finding the right nodes for this problem. In particular i have a dabase which consists of two columns which are numeric and one column which relates the two numerics to eachother and indicates that one numeric is the either the parent or the child of an other numeric.
The task now is to find out whether the indicated relationship is symmetric. That is, to find out when i.e. numeric 1 is indicated as the parent of numeric5, if numeric5 is also indicated as the child of numeric1.
Could someone please give me a hint how i might solve this problem in KNIME? That would be very kind.
you can simply use a Joiner for this task. Connect your table to both inputs, Join on both columns "crossed", and you get all symmetric pairs. You can now further filter the included columns directly in the joiner, filter the non-symmetric pairs from the original table with a Reference Row Filter, get each pair only once with a Row Filter...
thank you for your hint. It was already very usefull but I'm a little bit stuck now.
I tried to implement your suggestion, but it only gives me the symmetries of my numerics. But I would like to find out whether my strings are symmetric.
I have the following kind of datastructure:
MasterID
RelationType
RelationID
4416786
Child
4416782
4416786
Child
4416785
4416782
Parent
4416786
4416785
Parent
4416786
After using the Joiner I get something like this (I combined all rows which have matching MasterID and RelationID):
rowID
MasterID
RelationType
RelationID
RelationType to MasterID
row1_row3
4416786
Child
4416782
Parent
row2_row4
4416786
Child
4416782
Parent
row3_row1
4416782
Parent
4416786
Child
row4_row2
4416782
Parent
4416786
Child
So this is pretty much that what I want to know and this result would be sufficient if my database were only 4 rows long. Now I can see if the two numerics refer to each other in a PCR. But the problem is that my database consists of more than 3000 rows. So analysing it in this manner would take to long.
I would like to know if there is a row combination in the resulting table which shows a wrong PCR, such as the row combination row4_row2 in the following table:
rowID
MasterId
RelationType
RelationId
RelationType to MasterID
row1_row3
4416786
Child
4416782
Parent
row2_row4
4416786
Child
4416782
Parent
row3_row1
4416782
Parent
4416786
Child
row4_row2
4416782
Parent
4416786
Parent
Or a wrong PCR of the types as shown in the last row of the following two tables:
rowID
MasterId
RelationType
RelationId
RelationType
row1_row3
4416786
Child
4416782
Parent
row2_row4
4416786
Child
4416782
Parent
row3_row1
4416782
Parent
4416786
Child
row4_row2
4416782
Child
4416786
Parent
rowID
MasterId
RelationType
RelationId
RelationType
row1_row3
4416786
Child
4416782
Parent
row2_row4
4416786
Child
4416782
Parent
row3_row1
4416782
Parent
4416786
Child
row4_row2
4416782
Child
4416786
Child
How can I analyse that in Knime?
Or maybe I'm on the wrong track here Marlin? Did I misinterpreted your suggestion? I don't get the next step. I couldn't really find you how I can filter the non-symmetric PCR with the Reference Row Filter or the Row Filter.
Maybe someone can help me with that. I would appreciate it.
I tried your suggestion. After getting all symmetric pairs of my database, I want now to filter those symmetric inputs whose relationship is wrong. That is, I would like to get all symmetric pairs whose relationship should be parent child but isn't. After getting this I would like to get some statistics about the quantity. Which nodes could I use for this task?
that depends on your definition of "wrong". If your definition is algorithmic, you can filter with a Rule-based Row Filter or maybe even a simple Row Filter. If you have a table of "good" relationships, you can e.g. use a Reference Row Filter or another Joiner.
Regarding quantities, you can use an Extract Table Dimension node to get the row counts and then proceed from there. But most nodes with "code" (rule-based nodes, Math Formula, Snippets) also offer access to the row count directly.