Just getting my feet wet. Thanks to all who have helped me so far responding to user questions. Thanks in advance to any who might be able to help me with what I think is an obvious issue.
I have used the Histogram node to, uh, make a histogram. When I pass my cursor over the view, I am rewarded with “0.26–.027 Frequency: 2086”. How do I access the values for the bin widths and frequency counts, preferably in tabular form? There doesn’t seem to be any output from the node other than the image.
Welcome to the KNIME forum.
There is an Auto-Binner node, which does that.
In KNIME, there generally is a distinction between nodes manipulating the data (like the Auto-Binner) and nodes that display views.
So I have a dataset which I’m analysing through a 20-bin histogram. To turn that histogram into a frequency table, am I correct in the following?
1. Auto-Binner node to create a new column with values Bin 1 through Bin 20, inclusive.
2. Row Filter node, retaining rows equal to “Bin 1”.
3. Extract Table Dimension node, which yields the number of occurrences (the number of rows) in its first row and the number of columns in its second.
4. Repeat steps 2 and 3 nineteen more times, once for each remaining bin.
5. Concatenate node to combine all 20 table dimensions.
6. Row Filter node to retain only the row counts, excluding the (identical) column counts.
7. Create a dictionary-ish file with the bin intervals, presumably entered as strings.
8. Column Appender node to combine the bin intervals with the number of occurrences in each bin.
9. Any of the Writer nodes to output the table.
First of all, that’s a lot of work, and I’m just wondering whether it’s because I’m new at this game and I do everything the hard way.
Second, that seems crazy: The bin intervals and frequencies not only have to be generated in order to produce the histogram image, but they are readily available, as I’m able to see the bin intervals and frequencies when I roll the cursor over the chart. In Stata, for example, after the software does the work of generating a histogram (or performing a regression), it is possible to summon a “return list” of all the critical values.
When you say that nodes that display views are different from nodes that manipulate the data, surely the display nodes are manipulating the data too; they just don’t make the manipulated data accessible.
nan, I am very grateful for your steering me to the Binner nodes, so thank you. And if you have any thoughts on my producing a frequency table, I’d love to hear them.
Not sure I understand your goal correctly. I would use a GroupBy node after the Binner and group by the column containing Bin 1, Bin 2… with a manual aggregation to Count the rows. That should give you a table with two columns: the bins and the number of rows within each bin.
If you need the cutoff values, you can set the Auto-Binner to name bins by borders and extract the values from there.
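For readers who think in code, the bin-then-count idea can be sketched outside KNIME in a few lines of Python with pandas. This is only an analogy to the Binner + GroupBy workflow, not what the nodes do internally; the sample data and the column names `value`, `bin`, and `count` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column name "value" is an assumption.
df = pd.DataFrame(
    {"value": [0.05, 0.12, 0.18, 0.26, 0.31, 0.33, 0.47, 0.52, 0.58, 0.95]}
)

# Step 1 (rough analogue of a Binner node): assign every row to one of
# 4 equal-width bins. pd.cut labels each bin by its borders.
df["bin"] = pd.cut(df["value"], bins=4)

# Step 2 (rough analogue of GroupBy with a Count aggregation): count the
# rows that fall into each bin. observed=False keeps empty bins in the table.
freq = (
    df.groupby("bin", observed=False)["value"]
    .count()
    .reset_index(name="count")
)

print(freq)
```

The result has one row per bin, with the bin borders in the first column and the number of rows in the second, which is the same two-column shape described above.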
Thanks so much for your reply. You’ve certainly pointed me in the right direction with the various Binner nodes. I’m sure I’m not using them in the most elegant way, but like I said, I’m new to this game.
One of the things I’ve learned in my long life is that one can learn from first impressions: The new observer doesn’t take things for granted, and sometimes can point out some bleedingly obvious items that seasoned observers have long discounted — situational blindness, as it were. So consider this as a completely naive observation:
When I use Auto-Binner, the intervals are what the program assigns them, like 9.23715594–11.1509832. I can force integer-bounded intervals, but that doesn’t help me when I’m dealing with data like ratios. So I use Numeric Binner, which gives me the intervals I want, but which is a pain in the neck if I’m setting up, say, 20 bins. By contrast, I go to Histogram, which lets me specify the number of bins, and also fix my upper and lower bounds. In other words, Histogram lets me create a large number of bins with exactly the bounds I want. Stick that feature into Binner and I think you have a winner!
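To make the request concrete: "a fixed number of equal-width bins between bounds I choose, returned as a table" looks roughly like the following in Python with NumPy and pandas. This is a sketch of the desired behaviour, not a description of any KNIME node; the function name `frequency_table`, its parameters, and the sample ratios are all hypothetical.

```python
import numpy as np
import pandas as pd


def frequency_table(values, n_bins, lower, upper):
    """Count values in n_bins equal-width bins spanning [lower, upper].

    Values outside [lower, upper] are ignored by np.histogram.
    """
    counts, edges = np.histogram(values, bins=n_bins, range=(lower, upper))
    # edges has n_bins + 1 entries; pair them up as lower/upper borders.
    return pd.DataFrame(
        {"lower": edges[:-1], "upper": edges[1:], "frequency": counts}
    )


# Hypothetical ratio data, binned into 20 bins of width 0.05 on [0, 1].
ratios = [0.03, 0.07, 0.12, 0.26, 0.27, 0.31, 0.58, 0.61, 0.88, 0.97]
table = frequency_table(ratios, n_bins=20, lower=0.0, upper=1.0)

print(table)
```

Each row carries the two borders of a bin and its frequency, so the table can be written out directly, with no need to type 20 intervals by hand.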
Thanks again for your advice. Here’s hoping I’m able to apply it properly. Cheers.
Thanks for the feedback, very much appreciated.
We are currently in the process of updating nodes to more modern settings dialogs. You may have noticed already that the settings of the Histogram look quite different from those of the Auto-Binner. We plan to update the Binner nodes soon. One goal will be to align the binning options in both the Histogram and the Binner. Your feedback reassures us that this is the right direction to go.