My Challenge - writing an excel formula into an XLSX file... and as it turns out, more besides

Recently there have been a few forum posts that got me looking at the version of the java Apache POI which ships in KNIME AP 4.7, and as a result set myself the challenge of writing a component that could insert an Excel formula into a cell in an existing spreadsheet.

Well, having had a play, and then finding it could do much more besides, I have the following workflow as a proof of concept. I won’t go into the full details here, but instead I’ll just post the workflow, plus some screenshots which if you use KNIME 4.7 and you are interested in writing ever-more fancy spreadsheets directly out of KNIME, I hope will make you as excited as it made me :slight_smile:

OK, this is the demo workflow, with a couple of components that currently reside in my private area on the hub, so they currently won’t update, but hopefully will still work for you here. As it is still work in progress, I don’t want to put the components publicly on the HUB just yet!

The sample data is a data table:

image

Note that the value column has been intentionally left blank. This data is written to Excel, and is pretty boring:
image

Moving on…

The lower table creator looks like this:

And this is going to ultimately tell the “Write Excel Formula” component to do some stuff. But it is written using a syntax (for want of a better term) that is going to be interpreted by an intermediate component called the “Excel Cell Formula Expander”, because this needs to be generic, and we don’t know how many rows of data and such like our data is going to have.

The Excel Cell Formula Expander takes the lower control table, and works out, based on the data table that was written to Excel, what the actual Excel cell references will be for all the things we want to do, and it generates this table that will be passed to the Write Excel Cell Formula"

Hopefully you can get an idea of what this all might mean but as a quick explanation.

It says "Into cell D2 write the formula B2*C2, and format it using the Excel format mask 0,000.00

Do the same for cells D3 to D5

Into Cell D6 write the words “TOTAL SALES”, make it right justified with a font size of 20, and yellow writing on black background.

Make the headings A1:D1 black on yellow

Make the cells A2 to A5 (the data rows for product) Green

Put the formula SUM(D2:D5) into cell D7 and make that 14pt, right justified, with the format mask 0,000 and red on black.

Into Cell A9 write the label “–=== end of report ===–”

oh, and in a couple of places it sets a column width too.

In summary, yes it is going to look a right mess! :wink: but it will demonstrate a point, … if it works…

Maybe it will look like this!

Revealing the cell formulas, you can see that it really has written the formulas to the cells. No data calculations have been performed in KNIME:

Proof of concept - Excel Cell Formula writing -takbb.knwf (188.2 KB)

I would be interested in hearing your views, and of course further suggestions. As this is built using components (with a heavy reliance on java snippets!) rather than bespoke nodes, there are limitations on what can be achieved, but I think this has potential for being useful.

I have a technical question though. The components use Java snippet/Java Edit Variable to call the specific Apache POI packages that are bundled in KNIME 4.7. What will happen with future KNIME AP releases? If a different version of the Apache POI packages are bundled, will this component cease to work, or will it continue to have access to the jar files that were available when it was created?

Well hopefully a few of you might give this a try… let me know what you think.

Is this going to change your life? :rofl:

Maybe this might give some ideas for having similar functionality baked into KNIME.

btw, I’d like to thank my best virtual friend chatGPT for massively accelerating the creation of the java code to make this work. :wink:

5 Likes

You can also use the Write to Excel Template as a way to insert multiple excel formulas into multiple different cell positions. Basically you just read in the Excel sheet, make everything missing values, enter your formulas into the correct positions (make sure to turn on the setting saying that cells beginning with the equals sign will be treated as formulas, and make sure clear existing values isn’t checked), then write it back in as a table over top of the excel data.

I don’t have KNIME access right now to upload an example, but I am fairly sure that is how I am doing it currently.

I will check your component out when I get back in the office in the next few days. Very cool as always @takbb !

1 Like

Thanks @iCFO ,

I confess I hadn’t spotted that option in the Write to Excel Template, which sounds very useful and I’ll take a look at that too.

My aim here was a slightly different use case for putting a formula into a new xlsx which had only just been written by the workflow, and using no additional pre-built excel files. In fact it started with me wanting to place one-off pieces of text or formulas into random cell locations… but still being able to make their positions relative to the rows of data that have been written (such as the row after the last data row).

Then I got a bit carried away :wink:

I have uploaded a separate workflow to the hub, with a different (slightly simpler) single-cell component which is similar I guess to the Excel Cell Updater. This doesn’t have any of the formatting etc, and is closer to what I originally envisaged.

For example, using it in a loop to write a formula utilising a flow variable

or placing a label into a cell in column E, relative to the last row of data written:
image

and it does allow more straightforward updating too:
image

1 Like

You can actually write a new excel file via workflow, then feed that new file in as the “template” later in the workflow via flow variable. In fact, since you have written the table to the excel file within that same workflow you would already have the table structure to use with the above method. I regularly use the template node downstream even when there was no pre-existing “template” prior to creating a file within that workflow. You can even add the formulas in context of seeing the table with data and then replace any cell not starting with an equals sign with a missing value.

That being said, I love options and alternate methods. I would likely find your approach easier and faster in some situations.

2 Likes

Hi @takbb,

welcome on the POI path of life :laughing:

From my past experience I would say you will have to change every few releases. Not sure if it is still the case.
However as a workaround you can just put a specific poi version yourself into the node.
Well or change the reference if the bundle updates :slightly_smiling_face:

Thanks @AnotherFraudUser , yes I can see from watching your experiences that it can be “fun”. I certainly appreciate the effort you’ve put in to providing the Excel functionality in your nodes.

With the node development that you are doing, I guess bundling a specific jar is a reasonable way to go, if you need to.

As I’m currently just doing components rather than bespoke nodes, I looked at including jars with the component but realised that you cannot easily bundle a different jar to those that are included with the KNIME release (additional bundles) with a component because unlike a workflow (where you could drop the jar in the data folder) the component doesn’t actually have its own data folder (or any other folders). Only when I shipped it to my private hub space and tried to test it on another machine did it occur to me that the third party jar wasn’t available. Within the java snippet on your own pc you can specify a third party jar, but when you come to ship the component itself, no folders go with it and, the jar is left behind :frowning:

There was a specific third party jar that I wanted to include because it offered some useful but relatively straightforward functionality, and I had convoluted ideas of making a special “libraries” workflow available that the user would need to have available on their system, and then the component would maybe reference the jar within a folder in that workflow(!) but that idea lasted as long as it took me to realise that (a) it was a horrible idea and (b) whatever I write would not be so special that anybody would bother to jump through those hoops! :wink:

Instead I got chatgpt to help me quickly write the missing functionality to include within the snippet so I didn’t need the additional jar, lol! Anyway its all good fun, and always a good learning experience, and if anything proves useful to anybody else then that’s a bonus!

1 Like

Hi @takbb,

oh never knew components did not have their own folder - though you might use the java nodes folder like normal.
Well one workaround I saw in the past was doing a get node before to download the (then javascript) library. But I guess not the best solution for something like POI :confused:
Maybe the bundles can be referenced with a pattern - but yeah maybe the issue with the naming changing is also not there anymore :man_shrugging:

1 Like

This topic was automatically closed 90 days after the last reply. New replies are no longer allowed.