Create BufferedDataTable

Hi to all!

Please forgive me for possibly asking a stupid question:
How do you create a new BufferedDataTable for the output of the execution method?
I already figured out that you need an Extension table first, which needs DataTableSpecs, which in turn need DataColumnSpecs, or something similar at least.

Could anybody please provide a code snippet (or at least point me to one) for creating an empty table, filling it with values, and finally converting it into a BufferedDataTable for the output?
That would be more than great.

Thanks a lot in advance,
Cheers, Fabian

Here's a dump of some example code that may be useful

	/**
	 * {@inheritDoc}
	 */
	@Override
	protected BufferedDataTable[] execute(final BufferedDataTable[] inData, final ExecutionContext exec)
			throws Exception
	{
		BufferedDataContainer container = exec.createDataContainer(getSpec());

		// Do your stuff

		addRow(container, getListCell(sourceTypes), "Source types");

		// Do more stuff?

		container.close();
		
		return new BufferedDataTable[] { container.getTable() };
	}


	private DataTableSpec getSpec()
	{
		DataTableSpecCreator creator = new DataTableSpecCreator();
		creator.addColumns(new DataColumnSpecCreator("Key", StringCellFactory.TYPE).createSpec());
		creator.addColumns(
				new DataColumnSpecCreator("List", ListCell.getCollectionType(StringCellFactory.TYPE)).createSpec());

		return creator.createSpec();
	}

	private void addRow(BufferedDataContainer container, DataCell listCell, String key)
	{
		container.addRowToTable(
				new DefaultRow(new RowKey(key), new DataCell[] { StringCellFactory.create(key), listCell }));
	}

Also, if you only need the contents of the current row to calculate your new cells, you should probably use a ColumnRearranger, e.g. as at https://community.knime.org/svn/nodes4knime/trunk/com.vernalis/com.vernalis.knime.chem.pmi/src/com/vernalis/knime/chem/pmi/nodes/rdkit/abstrct/AbstractVerRDKitRearrangerNodeModel.java, from which the following is stripped down:

@Override
protected BufferedDataTable[] execute(BufferedDataTable[] inData, ExecutionContext exec)
		throws Exception {
	return new BufferedDataTable[] { exec.createColumnRearrangeTable(inData[0],
			createColumnRearranger(inData[0].getDataTableSpec()), exec) };
}

/**
 * This method handles the creation of a suitable column rearranger,
 * including optional replace input column settings
 * 
 * @param inSpec
 *            The incoming spec
 * @return The resulting column rearranger
 * @throws Exception
 */
protected ColumnRearranger createColumnRearranger(DataTableSpec inSpec) throws Exception {
	ColumnRearranger rearranger = new ColumnRearranger(inSpec);

	// Now generate the new column specs
	DataColumnSpec[] newColSpec = createNewColumnSpecs(inSpec);

	// Create a CellFactory - AbstractCellFactory or SingleCellFactory
	AbstractCellFactory cellFact = new AbstractCellFactory(true, newColSpec) {

		@Override
		public DataCell[] getCells(DataRow row) {
			// Create your new cell(s) here and return them, one per new column spec
			throw new UnsupportedOperationException("Not implemented yet");
		}
	};
	rearranger.append(cellFact);
	return rearranger;
}

 

The advantages of this are that you can parallelise (the 'true' argument in the AbstractCellFactory constructor), and writing to disk is minimised, which speeds things up too. The ColumnRearranger also has methods to remove/replace columns.

Steve

Couple of further comments.

If you use the ColumnRearranger method, then you can also use this to get your DataTableSpec in the configure method:

return new DataTableSpec[] { createColumnRearranger(inSpecs[0]).createSpec() };

If you use Sam's method, then you can iterate over the rows in the incoming table, using e.g.

for (DataRow row : inTable){
    //Do stuff with row
}

Also, if you are using Sam's method, you should call exec.checkCanceled() regularly, otherwise you won't be able to cancel the node execution. You can also use the exec.setProgress() methods to set the progress and status message of the node (e.g. "5 rows of 50000 processed"). The ColumnRearranger method takes care of these for you.
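
The cancel/progress pattern described above might look roughly like the sketch below. It is plain Java so that it compiles without KNIME: the Monitor interface and the RowLoop class are hypothetical stand-ins, not KNIME types. In a real node you would call the same two methods on the ExecutionContext (exec), where checkCanceled() throws a CanceledExecutionException and setProgress takes a fraction between 0 and 1 plus a message.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the two ExecutionContext calls; not a KNIME type. */
interface Monitor {
    void checkCanceled() throws Exception;
    void setProgress(double fraction, String message);
}

final class RowLoop {
    /** Visits every row, polling for cancellation and reporting progress. Returns the number of rows processed. */
    static long processAll(List<String> rows, Monitor exec) throws Exception {
        long total = rows.size();
        long done = 0;
        for (String row : rows) {
            // Throws if the user cancelled, which ends the loop early.
            exec.checkCanceled();

            // ... do stuff with row ...

            done++;
            exec.setProgress((double) done / total, done + " rows of " + total + " processed");
        }
        return done;
    }

    public static void main(String[] args) throws Exception {
        // A monitor that never cancels and just prints the status message.
        Monitor printing = new Monitor() {
            @Override
            public void checkCanceled() {
            }

            @Override
            public void setProgress(double fraction, String message) {
                System.out.println(message);
            }
        };
        processAll(Arrays.asList("r1", "r2", "r3"), printing);
    }
}
```

With very large tables it may be worth updating the progress only every few hundred rows rather than on every row.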

Steve

Thanks a lot for the code... it was very helpful.

What exactly does the getListCell method do?
Does it just query the table and return the column with the name "key" and the type "sourcetype"?

If you could post a snippet it would be most helpful...

Cheers, Fabian ;)

Sorry, I just copied some code out of the node I happened to be working on at the time, which doesn't have an input table, so it's not the most helpful.

Just for clarity, in my snippet the getListCell method was a method I wrote to convert a List<String> object into a ListCell. This list was created without needing an input table.

Steve's comments will be far more helpful to you, and indeed, where possible, try to use the ColumnRearranger approach. This will also make it easier to make your nodes streamable.

Have you tried creating a new node and leaving the "include sample source code" option (or something like that) ticked? This should give you a basic functioning node, I think.

 

Cheers

Sam

Steve's approach is quite good. But the problem at hand is the following:

I am programming an importer for a specific Excel file (I already tried to import it with the normal Excel node and then change it, but that didn't work out well).

As a matter of fact, I need to create a whole new BufferedDataTable (so without another table being passed to the method, as in Steve's approach). I think that your (Sam's) approach is better suited to my needs (correct me if I am wrong...).
I know that I have to create a new table with the specs, but could you please tell me how I add new rows to that table and how I fill specific cells of such a table?

That would be great...

Cheers, Fabian

Then yes, you want to take an approach like mine.

 

In your execute method you need to create a BufferedDataContainer, which is what you add your rows to. At the end you close the container and then call getTable(). You should be able to see these steps in my snippet above.

 

    /**
     * {@inheritDoc}
     */
    @Override
    protected BufferedDataTable[] execute(final BufferedDataTable[] inData, final ExecutionContext exec)
            throws Exception
    {
        BufferedDataContainer container = exec.createDataContainer(getSpec());
 
        // This is where you put your code to create your rows

        container.close();
         
        return new BufferedDataTable[] { container.getTable() };
    }

 

To create the container you need to define the table specification. As you are making a new table from scratch you can take an approach like:

    private DataTableSpec getSpec()
    {
        DataTableSpecCreator creator = new DataTableSpecCreator();

        // Add a column with the column name "Key" and set the column type to be a StringCell.
        // The factory classes IntCellFactory, DoubleCellFactory, StringCellFactory are safer to use
        // as they will give you the preferred implementation.
        creator.addColumns(new DataColumnSpecCreator("Key", StringCellFactory.TYPE).createSpec());

        creator.addColumns(
                new DataColumnSpecCreator("List", ListCell.getCollectionType(StringCellFactory.TYPE)).createSpec());
 
        return creator.createSpec();
    }

 

So the step you are missing is how to create the rows. This method may give you some ideas of what you can do.

 

I'm creating one row per factory. All the cells in the row are Strings. 

 

	private void addRows(BufferedDataContainer containerFactory,
			Collection<DataCellToJavaConverterFactory<?, ?>> factories)
	{

		int track = 0;
		for (DataCellToJavaConverterFactory<?, ?> factory : factories)
		{
			DataCell[] cells = new DataCell[4];

			cells[0] = StringCellFactory.create(factory.getIdentifier());
			cells[1] = StringCellFactory.create(factory.getName());
			cells[2] = StringCellFactory.create(factory.getSourceType().toString());
			cells[3] = StringCellFactory.create(factory.getDestinationType().toString());

			containerFactory.addRowToTable(new DefaultRow(new RowKey("Row" + track), cells));
			track++;
		}

	}

 

Use BufferedDataContainer#addRowToTable to add a row. There are multiple implementations of DataRow. If you are creating rows from scratch you can use DefaultRow. This needs a RowKey (the RowID, which must be unique) and an array of cells.
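
Since the RowID must be unique, an importer that builds its keys from the file contents (rather than a counter such as "Row" + track) may need to de-duplicate them first. Below is a minimal plain-Java sketch of one way to do that. UniqueRowIds and its next method are hypothetical helper names and not part of the KNIME API, and the "_1", "_2" suffix scheme is just an assumption for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper (not KNIME API): hands out row ID strings that never repeat. */
final class UniqueRowIds {
    // Every ID handed out so far.
    private final Set<String> used = new HashSet<>();
    // Last numeric suffix tried for each base name, so repeated bases do not rescan from 1.
    private final Map<String, Integer> counters = new HashMap<>();

    /** Returns base if it is still free, otherwise base_1, base_2, ... until an unused ID is found. */
    String next(String base) {
        String candidate = base;
        int suffix = counters.getOrDefault(base, 0);
        while (!used.add(candidate)) {
            suffix++;
            candidate = base + "_" + suffix;
        }
        counters.put(base, suffix);
        return candidate;
    }

    public static void main(String[] args) {
        UniqueRowIds ids = new UniqueRowIds();
        for (String name : new String[] { "Sample", "Sample", "Control" }) {
            System.out.println(ids.next(name));
        }
    }
}
```

In a node, the returned string would then be wrapped as new RowKey(ids.next(name)) before building the DefaultRow.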

You can see here the use of StringCellFactory to get a string cell from a Java String. The same approach works for Int, Double, Long etc.

Making collection cells is a bit more complicated. 

Does the normal Excel importer fail?

Cheers

Sam

Thanks a lot for that input…
This helps very much :)

No, but I have got an Excel file that needs to be imported repeatedly with the same format.
The format is a little complicated, and it is quite hard to manipulate the data once it is imported.
That's why I am converting an importer I already have into a node…

Cheers, Fabian