Input format of the entropy node

Hello,

I have a problem calculating the entropy for a specific column.

First, I don't know what input format the entropy node expects.

I created a CSV file with 3 columns (Name, Frequency, Probability), but the entropy node always reports an error saying that the sum is not 1. I'm sure that the sum of the Probability column is equal to 1.

Please help me. What should I do?

My second question: I use the Value Counter node to count my records. Is there a node that calculates their probabilities? Which one is it, and how do I use it?

Thanks a lot.

Hello mah,

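I think you are referring to the Entropy Uncertainty Scorer. This node calculates the entropy per row, not per column, so it expects the probabilities of one distribution side by side in columns rather than stacked in rows. In your table each row holds only a single probability, which would explain the error that the sum is not 1. I suggest using the Transpose node to turn the Probability column into a row.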
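To illustrate the row versus column point outside of KNIME, here is a minimal Python sketch. It is not the node's implementation: the function `row_entropy`, the sum check with its tolerance, the base-2 logarithm, and the example values are all assumptions made for illustration.

```python
import math


def row_entropy(probabilities, tolerance=1e-9):
    """Shannon entropy in bits of one row of class probabilities.

    Hypothetical helper: the real node may use another log base,
    tolerance, or normalization.
    """
    total = sum(probabilities)
    if abs(total - 1.0) > tolerance:
        raise ValueError(f"probabilities sum to {total}, expected 1")
    # Zero probabilities contribute nothing to the entropy.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Layout as in the CSV: one probability per row (made-up values).
probability_column = [[0.5], [0.25], [0.25]]

# Transposed layout: a single row holding all class probabilities.
transposed = [list(row) for row in zip(*probability_column)]

entropy_bits = row_entropy(transposed[0])
print(entropy_bits)
```

Calling `row_entropy` on a single untransposed row such as `[0.5]` raises the sum error, because one row on its own does not add up to 1 even though the whole column does.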

To calculate the probability, you can use a GroupBy node: group by the class (Name in your case) and aggregate on any other column using the method percent. If you add a second aggregation with the method count, you get the same table as in your first question: Name, Frequency (the count), and Probability (the percent).
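The grouping described here can be sketched in plain Python to show what the resulting table contains. This is only an illustration with made-up records, not what the GroupBy node runs internally. It computes the probability as a fraction between 0 and 1, and the scale the node itself reports for percent is not asserted here.

```python
from collections import Counter

# Hypothetical records: one class label (Name) per record.
names = ["a", "b", "a", "c", "a", "b", "a", "c"]

# Count how often each class occurs (the Frequency column).
counts = Counter(names)
total = sum(counts.values())

# One row per class: (Name, Frequency, Probability as a fraction of all records).
table = [(name, count, count / total) for name, count in sorted(counts.items())]

for name, frequency, probability in table:
    print(name, frequency, probability)
```

Because every record falls into exactly one group, the probabilities computed this way sum to 1 over the whole table.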

To learn more about the GroupBy node, you can have a look at our YouTube channel:

Best,
Ferry