Using Knime as a Worksheet with Java Snippet

Hi,

I have the following table and I'd like to know how to do the following calculation using Java Snippet (without using the Lag Column).

 

For GroupName = "A":

[SumOfSquares(Row1) - SumOfSquares(Row2)]  *  [DF(Row1)+DF(Row3)]   =    [75.79 - 7.2] * [3 + 83]

 

And then the same for GroupName = "C":

[SumOfSquares(Row4) - SumOfSquares(Row5)]  *  [DF(Row4)+DF(Row6)]   =    [81.168 - 1.83] * [3 + 83]

 

And then the same for GroupName = "D":

[SumOfSquares(Row7) - SumOfSquares(Row8)]  *  [DF(Row7)+DF(Row9)]   =    [82.12 - 0.8786] * [3 + 83]

 

GroupName

		<p>B</p>
		</td>
		<td>
		<p><strong>Source</strong></p>

		<p>Between Groups</p>
		</td>
		<td>
		<p><strong>SumOfSquares</strong></p>

		<p>75.79641576511007</p>
		</td>
		<td>
		<p><strong>DF</strong></p>

		<p>3</p>
		</td>
	</tr>
	<tr>
		<td>B</td>
		<td>Within Groups</td>
		<td>7.203584234889962</td>
		<td>80</td>
	</tr>
	<tr>
		<td>B</td>
		<td>Total</td>
		<td>83.00000000000003</td>
		<td>83</td>
	</tr>
	<tr>
		<td>C</td>
		<td>Between Groups</td>
		<td>81.16829007987391</td>
		<td>3</td>
	</tr>
	<tr>
		<td>C</td>
		<td>Within Groups</td>
		<td>1.8317099173546845</td>
		<td>80</td>
	</tr>
	<tr>
		<td>C</td>
		<td>Total</td>
		<td>82.9999999972286</td>
		<td>83</td>
	</tr>
	<tr>
		<td>D</td>
		<td>Between Groups</td>
		<td>82.12135930420483</td>
		<td>3</td>
	</tr>
	<tr>
		<td>D</td>
		<td>Within Groups</td>
		<td>0.87864069622113</td>
		<td>80</td>
	</tr>
	<tr>
		<td>D</td>
		<td>Total</td>
		<td>83.00000000042596</td>
		<td>83</td>
	</tr>
</tbody>

 

Please, could you help me?

Attached you can find the workflow. Thanks in advance!

Hi iiiaaa,

why do you want to use a Snippet? This can be done with standard non-java nodes alone. You could e.g. use a Pivot followed by a Math Formula, or you could delete unused  values with e.g. a Rule Engine and then follow up with a GroupBy and again a Math Formula.

Is there a special requirement? Performance, maybe? Also, why don't you like the Lag Column? It's not necessary here, but is there something "bad" about in in general?

If you really really want to, here's a possible way. The value is always written in the "Total" row. Just make sure df is interpreted as a Double. Also, this snippet assumes the groups are exactly in this order and actually grouped, so be careful if you ever want to adapt this to something other than the output of ANOVA.

// Your custom variables:
double sq = 0, df = 0;
//---------------------------------
// expression start
// Enter your code here:
out_Value = null;
if (c_Source.equals("Between Groups")) {
  sq += c_SumofSquares;
  df += c_df;
  return;
}
if (c_Source.equals("Within Groups")) {
  sq -= c_SumofSquares;
  df += c_df;
  return;
}
// Total: write value, start new group
out_Value = sq * df;
sq = 0; df = 0;

Hi Marlin, 

first of all thank you very much for your answer. If you don't mind, I have 3 questions (the third one is the most important for me):

 

1) The formula for  GroupName = "A" (and for the other group)  is

 [SumOfSquares(Row1) - SumOfSquares(Row2)]  *  [DF(Row1)+DF(Row3)]   =    [75.79 - 7.2] * [3 + 83]

and not:

 [SumOfSquares(Row1) - SumOfSquares(Row2)]  *  [DF(Row1)+DF(Row2)]   =    [75.79 - 7.2] * [3 + 80]

I tried to modify the java code in this way but it doesn’t work (attached you can find the workflow):

// Your custom variables:
double sq = 0, df = 0;

// expression start
  public void snippet() throws TypeException, ColumnException, Abort {
// Enter your code here:

out_Value = null;
if (c_Source.equals("Between Groups")) {
  sq += c_SumofSquares;
  df += c_df;
  return;
}
if (c_Source.equals("Within Groups")) {
  sq -= c_SumofSquares;
  return;
}


if (c_Source.equals("Total")) {
   df += c_df;
  return;
}



// Total: write value, start new group
out_Value = sq * df;
sq = 0; df = 0;

What am I doing wrong?

 

2) I'm sorry but I don't understand how to do this calculation using Pivot and math formula or using Groupby. Please, could you be more specific? 

 

3) Actually I do not need to do it with an ANOVA table.

For example, starting from a table (not an ANOVA), I need to know how we use the cells to make calculations like in an excel spreadsheet; in our case   (C01 - C02) * (D01+D03).

Using Lag columns it becomes complex if the formula is longer. That's why I’d like to know how to use JAVA Snippet (or another node)  to select a specific cell and use this value to do any calculation. Please, do you know if there is a way to use Java Snippet to select a specific cell and use this value to do any calculation?  

For example is there a way to do this calculation with cells [(C01 + C02) * (D01+D03)]  like in excel using Java Snippet?

Thanks you very much in advance!

Oups, sorry for the mistake. Just loose the last return and you should be good. And you can drop the surrounding if too, if you want. (And make sure the first lines are in the upper "box" of the snippet.)

There is no general way to select one cell. What you can do is filter rows like we did here with the if's. The most extreme case would be to filter single rows based on their row id. Also, the snippet works from top to bottom, so any vertical reference has to be made with global variables and works only in that direction. The tables might look like excel tables, but that's only because of the similar presentation, the underlying model is very different.

Iiiaaa, I adapted your workflow to include the options I mentioned. The Pivot method seems the most natural, with the actual formula apearing in the Math Formula. The GroupBy method on the other hand is neither beautiful nor efficient, but it shows how the task can be separated into baby steps. (And also some tricks along the way.)

In a more abstract sense, once we understand what the formula "means", we can replace vertical references. The formula here is basically a computation per group, so we can look for nodes with group-wise computations, like GroupBy, Pivot, or Group Loop Start. From there, it's only a question of how to exploit this node. Maybe that's some general insight, that to use Knime effectively, you have to think much more about the actual meaning of a formula than with Excel.

Hi Marlin, 

Congratulations and thank you very much for your help! I really appreciated your instructions and I think from your workflow it is possible to learn many things.

I have this table and I want to calculate the average of "Mesure" column every 30 seconds for each "NPenon" and I have to take into consideration the datetime :when  datetime are equal.

I tried with Java snippet node  but it's not correct

I must do the operation on several stages

I do not find a solution. Please Help!!