Column selector in Node

Hi there,

Just a quick question: I can't get the column selector to work in a node I am writing (basically a regex/string filter). Here is the node dialog code. I need to restrict the available columns to string values only. The problem is the StringValue.class bit.

Thanks in advance for your help,

Stanage.

    import org.knime.core.node.defaultnodesettings.*;

    /**
     * NodeDialog for the "RegexFilter" Node.
     * Regex String filter
     *
     * This node dialog derives from {@link DefaultNodeSettingsPane} which allows
     * creation of a simple dialog with standard components. If you need a more
     * complex dialog please derive directly from
     * {@link org.knime.core.node.NodeDialogPane}.
     *
     * @author stanage
     */
    public class RegexFilterNodeDialog extends DefaultNodeSettingsPane {

        RegexFilterNodeDialog() {
            createNewGroup("Filter parameter:");
            addDialogComponent(new DialogComponentString(
                    new SettingsModelString(RegexFilterNodeModel.STR, null),
                    "Enter Query String here: "));

            addDialogComponent(new DialogComponentColumnNameSelection(
                    new SettingsModelString(RegexFilterNodeModel.COLSEL, ""),
                    "Select a column", 0, true, StringValue.class));

            addDialogComponent(new DialogComponentStringSelection(
                    new SettingsModelString(RegexFilterNodeModel.STRSEL, null),
                    "Use regular (regex) expression or STRING: ", "REGEX", "STRING"));

            closeCurrentGroup();
        }
    }

The code looks good. Is it possible that you didn't get the import statements right (the topmost lines of the file)? I don't see the import for StringValue.

There is a very convenient shortcut in Eclipse: Ctrl-Shift-O organizes the imports (the same functionality is available in the Source menu).

Hope it helps.
Bernd

Hi Bernd,

Thanks for the pointer. I had in fact missed the correct import, org.knime.core.data.StringValue, so it threw the error (moral of the story: don't code when tired!).

On another note, in the node I am designing I want to split the output into two streams, depending on whether the string was found or not. Do you have an example of some code which shows how this can be done?

Best regards,

Stanage.

A node that does that sort of split is the row partitioner. The node model class is called PartitionNodeModel (in Eclipse use the menu "Navigate" and "Open Type..."), which extends an abstract sampler node model.

But to be honest, I wouldn't use or extend any of those classes, as they are not meant to be subclassed. Instead you could use code such as:

        DataTableSpec spec = data[0].getDataTableSpec();
        BufferedDataContainer positive = exec.createDataContainer(spec);
        BufferedDataContainer negative = exec.createDataContainer(spec);
        int someIndex = ?? // the cell index of interest
        int count = 0; // for progress information
        final double totalCount = data[0].getRowCount(); // floating point op.
        for (DataRow r : data[0]) {
            DataCell c = r.getCell(someIndex);
            if (!c.isMissing() && ((StringValue)c)
                    .getStringValue().matches(".*[rR]egexp$")) {
                positive.addRowToTable(r);
            } else {
                negative.addRowToTable(r);
            }
            exec.setProgress(++count / totalCount, "Row " + count);
            exec.checkCanceled();
        }
        positive.close();
        negative.close();
        return new BufferedDataTable[]{positive.getTable(), negative.getTable()};

Hope you find it useful.

Regards
Bernd

PS: Nice to hear that you started node development. How many nodes have you written so far?
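
For anyone who wants to try the matching logic of the loop above outside KNIME, here is a minimal plain-Java sketch. It is not KNIME code: the class RowSplitSketch, the Split holder, and the use of String[] rows with null standing in for a missing cell are all assumptions made for illustration. The two lists play the role of the positive and negative containers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Plain-Java stand-in for the two-output split; no KNIME classes involved. */
class RowSplitSketch {

    /** Holds the two output "tables": rows that matched and rows that did not. */
    static final class Split {
        final List<String[]> positive = new ArrayList<>();
        final List<String[]> negative = new ArrayList<>();
    }

    /**
     * Sends each row to positive or negative depending on whether the cell at
     * columnIndex matches the regex as a whole. A null cell plays the role of
     * a missing cell and always goes to negative.
     */
    static Split split(List<String[]> rows, int columnIndex, String regex) {
        Pattern pattern = Pattern.compile(regex); // compile once, not per row
        Split result = new Split();
        for (String[] row : rows) {
            String cell = row[columnIndex];
            if (cell != null && pattern.matcher(cell).matches()) {
                result.positive.add(row);
            } else {
                result.negative.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: column 0 is a row id, column 1 the text.
        List<String[]> rows = Arrays.asList(
                new String[] {"r1", "a simple regexp"},
                new String[] {"r2", "regexp at the start"},
                new String[] {"r3", null},
                new String[] {"r4", "Java Regexp"});
        Split split = split(rows, 1, ".*[rR]egexp$");
        System.out.println("positive: " + split.positive.size());
        System.out.println("negative: " + split.negative.size());
    }
}
```

With this sample data, r1 and r4 end up in the positive list and r2 and r3 in the negative one. Note that String.matches() and Matcher.matches() test the whole cell value, so a pattern that should hit anywhere in the string needs a leading and trailing .* or a call to Matcher.find() instead.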

Hi Bernd,

Thanks for the example. I am, however, having a few problems (it has been a long while since I last wrote anything in Java). I cannot see how to identify the port data[0] from your code example. Can you post a complete node model class, so that I can see how it is done? I am also having problems trying to pass existing table data along the workflow, and a complete node model would help. I can manage the node dialogs and their interaction with the node model, but I am having real trouble understanding how table data is passed through the node model and how a new column is added.

As for other nodes, I have a few ideas: definitely a Microsoft Excel file reader, and a couple of simple text mining nodes to extract strings from parsed data (once I am familiar with the KNIME environment and Java again). I will post these nodes onto the NODES4KNIMES SourceForge site as soon as I have tested them.

Hope everything is great in Konstanz.

Stanage.

"data[0]" refers to the first input port; data is the array argument of the execute() method. If you used the node wizard, the parameter is called inData (I could have anticipated that, sorry). I'll send you a sample NodeModel file via email.

Just a side note regarding your node development plans: parsing Excel files is probably complicated (you rely on external libraries, for instance Apache POI). I do know that one of the KNIME partners has requested such a node, but we don't have the resources to realize it in the near future (and it's of too little academic interest). I heard rumors that they plan to write this node themselves and contribute it to the community.

We also have text mining nodes in the works (as part of the group's research focus), though convenience nodes to extract strings from text etc. can always be useful (and shouldn't be that complicated). Sounds good.

Regards
Bernd

I forgot to comment on your question about how to add a column to a table: if the implementation of your node only does a column-based operation (adding, removing, reordering...), you should use a ColumnRearranger. The online API contains an example: http://www.knime.org/docs/api/org/knime/core/data/container/ColumnRearranger.html