Row Selector

Computational chemists tend to make subjective decisions about which molecules they select for subsequent processing, which might include making analogues/derivatives and evaluating these as alternatives. To some extent this can be defended as an expert's compromise based on the consideration of various molecular properties and the functional groups present. Most of the time, no molecule meets all the criteria ...

Existing computational chemistry software typically allows the user to select one or more rows in a spreadsheet or in a ComboBox. The "Row Filter" node is currently a less than ideal means of addressing this working practice.

Having spent a little time on this matter, I've come to the conclusion that there is a conceptual problem in attempting to implement a "Row Selection" analogous to the current Column Selection class. An input data table may not be populated with data when a node is being configured. One could block configuration until the table is populated, but I've not found any means of obtaining the table ID and then the table status, the list of row IDs etc. Am I correct in concluding that the infrastructure doesn't allow Row Selection to be implemented?

A much more ambitious project would be to implement some spreadsheet functionality including sort, column order, select and save tools as an extension of the current interactive table view node. However, I don't have the time to pursue this at present.

Row selection through a dialog would be doable - although it would be weird. Since the column information is (most of the time) available when a node is being configured, you can adjust it before you execute the node. In order to know which rows a table contains we would need to first execute the node, then have the dialog display the RowIDs and allow the user to move them around and finally, during a second execution, filter the rows accordingly.
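To make the two-execution idea concrete, here is a minimal sketch in plain Python. It does not use the KNIME API; the table layout, `collect_row_ids` and `filter_rows` are hypothetical names chosen only for illustration, and the user's dialog choice is simulated by a hard-coded list.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical stand-in for a data table: (row ID, cell values) pairs.
Row = Tuple[str, Dict[str, float]]


def collect_row_ids(rows: Iterable[Row]) -> List[str]:
    """First execution: read the table only to learn which row IDs exist."""
    return [row_id for row_id, _ in rows]


def filter_rows(rows: Iterable[Row], selected_ids: Iterable[str]) -> List[Row]:
    """Second execution: keep only the rows the user picked in the dialog."""
    keep = set(selected_ids)
    return [row for row in rows if row[0] in keep]


table: List[Row] = [
    ("Row0", {"logP": 1.2}),
    ("Row1", {"logP": 4.8}),
    ("Row2", {"logP": 2.9}),
]

# Pass 1: the row IDs only become known once the input has been read.
available = collect_row_ids(table)

# Assumption: this list stands in for what the user would pick in a dialog.
chosen = [available[0], available[2]]

# Pass 2: filter using the stored selection.
selected = filter_rows(table, chosen)
```

The awkward part is visible in the sketch: `chosen` cannot be filled in until the first pass has run, which is why such a dialog cannot be configured up front the way a column selection can.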

However, for what you are referring to, it seems to me as if you could also make use of a functionality that allows you to hilite some rows in a table view and then have a subsequent node in the flow that uses this hiliting information for the filtering. Such a node will come out with our next release - if you want to have it earlier send us a message and we will forward a preliminary version of it...
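As a rough illustration of how hilite-based filtering differs from a dialog, the sketch below keeps the selection in a separate object that a view writes to and a later filter step reads from. This is plain Python, not KNIME code; `HiliteState` and `hilite_filter` are hypothetical names and say nothing about how the actual node is implemented.

```python
from typing import Dict, Iterable, List, Set, Tuple

Row = Tuple[str, Dict[str, float]]  # hypothetical (row ID, cell values) pair


class HiliteState:
    """Selection state shared between a table view and a downstream filter."""

    def __init__(self) -> None:
        self._ids: Set[str] = set()

    def hilite(self, row_ids: Iterable[str]) -> None:
        self._ids.update(row_ids)

    def unhilite(self, row_ids: Iterable[str]) -> None:
        self._ids.difference_update(row_ids)

    def is_hilited(self, row_id: str) -> bool:
        return row_id in self._ids


def hilite_filter(rows: Iterable[Row], state: HiliteState) -> List[Row]:
    """Downstream step: pass through only the rows currently hilited."""
    return [row for row in rows if state.is_hilited(row[0])]


table: List[Row] = [
    ("Row0", {"mw": 180.2}),
    ("Row1", {"mw": 412.5}),
    ("Row2", {"mw": 305.1}),
]

state = HiliteState()
state.hilite(["Row1", "Row2"])  # the chemist marks two rows in the view
state.unhilite(["Row2"])        # ...and then changes their mind about one

kept = hilite_filter(table, state)
```

Because the choice is made interactively in a view after the upstream nodes have executed, the filter step never needs to know the row IDs at configuration time.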

Putting Excel-like spreadsheet functionality in KNIME is not really at the core of our priorities. The goal of data pipelining is to model operations on data through the pipeline. What you are referring to is more an interactive, table-based view that allows you to perform such operations directly, but you cannot apply them to another data set. If you model it through a pipeline, you can. Inforsense has an interesting "macro recorder" which records what you do in the spreadsheet and then creates the corresponding pipeline. But although I thought this was really neat, I think we will not put this in anytime soon, sorry...

- Michael

Thanks for the quick response.

It hadn't occurred to me to consider Hiliting - it's something I have yet to explore, as the implementation is different from what I had expected based on computational chemistry software.

A lot of computational chemistry is currently an interactive rather than a workflow process - although the protocols used are often similar. One of my objectives is to represent such protocols as KNIME workflows without automating decisions which the chemist makes subjectively, such as "Can I make this molecule?"

berthold wrote:
Such a node will come out with our next release - if you want to have it earlier send us a message and we will forward a preliminary version of it...

I have just noticed that the hilite filter node is already included in the latest available version of KNIME. You'll find it in "Data Manipulation" -> "Row" -> "HiLite Filter". The node description has some typos but other than that it's fully functional. 8) I'll fix the typos right away.