Addressing column rows w/Java snippets

Dear KNIMErs,

Me again. :-) After utilising a "dirty hack" to replace values with a table lookup otherwise not permitted by KNIME (why won't dictionary replace work with KNIME tables, by the way?), I'm somewhat stuck at my next processing step.

I'd like to create a running sum across a column, but I can't find a way to do it with either the math function or the Java snippet node. In Java, I assume there is a way to address the column as an array of sorts, but I wouldn't know how. Any hints on that?

Thanks,
Ergonomist

Hi Ergonomist,

Ergonomist wrote:
why won't dictionary replace work with KNIME tables, by the way?

That would mean that we would have to completely replace the data storage part. The tables are read and written sequentially, whereas dictionaries (hashes, maps, ...) require random access to the data on disk. It's not impossible, but it is much more complicated and nothing we will address in the near future.

Ergonomist wrote:

I'd like to create a running sum across a column, but I can't find a way to do it with either the math function or the Java snippet node. In Java, I assume there is a way to address the column as an array of sorts, but I wouldn't know how. Any hints on that?

It won't work with these two nodes, as both of them work row-based and do not have "memory". It could work with the Python node(s) as at least one of them is capable of creating completely new tables. I haven't used it myself yet, though.

Regards,

Thorsten

Hi Thorsten,

Thanks for the prompt reply. Let me clarify, since I'm not quite sure I got across what I meant to:

thor wrote:

That would mean that we would have to completely replace the data storage part. The tables are read and written sequentially, whereas dictionaries (hashes, maps, ...) require random access to the data on disk. It's not impossible, but it is much more complicated and nothing we will address in the near future.

Do you want to know my dirty hack? I'll explain:

Query result 1:
2000 2001-01 5
2002 2003-04 7
2000 2002-04 4
2001 2002-08 10
...

Query result 2:
2000 5932
2001 4623
2002 5461
...

What I want at that stage:

2000 5932 2001-01 5
2002 5461 2003-04 7
2000 5932 2002-04 4
2001 4623 2002-08 10
...

I get it done by:
- executing queries 1 and 2 in parallel
- writing query 2 to a CSV "dictionary file" and joining it with query 1 (to sync the queues)
- undoing the join by filtering out the result column
- converting years and amounts to text
- dict-replacing into a new column using the previously generated CSV
- converting back to numbers

It's hacky, but it works, even without Java. I just think it could probably work much better...

thor wrote:

It won't work with these two nodes, as both of them work row-based and do not have "memory". It could work with the Python node(s) as at least one of them is capable of creating completely new tables. I haven't used it myself yet, though.

Allow me to explain this better. What I get in a later part of the above-described queue is the following:

2000 2001-01 .00084
2000 2001-02 .00076
2000 2002-04 .00052
2000 2002-05 .00063
...

I'd like to convert it to:

2000 2001-01 .00084
2000 2001-02 .00160
2000 2002-04 .00212
2000 2002-05 .00275
...

That's row-based, I'd just need to address the rows somehow. Or is that impaired by what you express as "having no memory"? I don't think I'd need to create new tables at all...

Thanks,
E.

Ergonomist wrote:

Do you want to know my dirty hack? I'll explain:
...
What I want at that stage:

2000 5932 2001-01 5
2002 5461 2003-04 7
2000 5932 2002-04 4
2001 4623 2002-08 10


That sounds very much like the joiner, except that you do not have unique keys... Anyway, this can quite easily be done inside a custom node using a hash, if you know Java. Nevertheless, it's amazing what one can do with KNIME ;)
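
To make the hash idea concrete, here is a minimal sketch in plain Java. It is not KNIME node code and uses no KNIME API: the class name LookupJoinSketch and the methods buildLookup and join are made up for illustration, and the rows are simply the sample values from this thread. It assumes the small table (query 2) fits in memory, so the large table (query 1) only has to be read once, sequentially.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from KNIME.
class LookupJoinSketch {

    /** Builds an in-memory dictionary from the small table (year -> amount). */
    static Map<Integer, Integer> buildLookup(int[][] yearAmountRows) {
        Map<Integer, Integer> lookup = new HashMap<>();
        for (int[] row : yearAmountRows) {
            lookup.put(row[0], row[1]);
        }
        return lookup;
    }

    /** Reads the large table once, in order, and inserts the looked-up amount after the year. */
    static List<String> join(String[][] mainRows, Map<Integer, Integer> lookup) {
        List<String> out = new ArrayList<>();
        for (String[] row : mainRows) {
            int year = Integer.parseInt(row[0]);
            Integer amount = lookup.get(year); // null if the year is not in the dictionary
            out.add(row[0] + " " + amount + " " + row[1] + " " + row[2]);
        }
        return out;
    }

    public static void main(String[] args) {
        // Query result 2 from the thread: year, amount
        int[][] query2 = { {2000, 5932}, {2001, 4623}, {2002, 5461} };
        // Query result 1 from the thread: year, month, count
        String[][] query1 = {
            {"2000", "2001-01", "5"},
            {"2002", "2003-04", "7"},
            {"2000", "2002-04", "4"},
            {"2001", "2002-08", "10"}
        };
        for (String line : join(query1, buildLookup(query2))) {
            System.out.println(line);
        }
    }
}
```

Because the years repeat in query 1, the dictionary is keyed on the table where each year is unique (query 2), which is why the non-unique keys are not a problem here.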

Ergonomist wrote:

I'd like to convert it to:

2000 2001-01 .00084
2000 2001-02 .00160
2000 2002-04 .00212
2000 2002-05 .00275
...

That's row-based, I'd just need to address the rows somehow. Or is that impaired by what you express as "having no memory"? I don't think I'd need to create new tables at all...


I see. That should work with the Jython node.
I just asked our scripting expert whether we can add a field for static variables to the dialog of the scripting nodes (there will be more than just Java in 2.0). That should be easily doable, and you would then be able to access these static variables inside the "row function" and, for example, sum up a column and return the current value for each row.
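
Roughly, the idea would look like the following plain-Java sketch. It is only an illustration of a row function with "memory", not the actual scripting-node dialog or any KNIME API: the class RunningSumSketch and the method nextRow are hypothetical names, and the input values are the ones from the example above. It assumes the rows arrive in table order.

```java
import java.util.Locale;

// Hypothetical illustration only; none of these names come from KNIME.
class RunningSumSketch {

    /** State that survives between row calls: the "memory" a plain row function lacks. */
    private double sum = 0.0;

    /** Called once per row, in table order; returns the cumulative total so far. */
    double nextRow(double value) {
        sum += value;
        return sum;
    }

    public static void main(String[] args) {
        // Third column of the example rows from the thread
        double[] column = {0.00084, 0.00076, 0.00052, 0.00063};
        RunningSumSketch acc = new RunningSumSketch();
        for (double v : column) {
            // Locale.ROOT keeps the decimal point regardless of system settings
            System.out.println(String.format(Locale.ROOT, "%.5f", acc.nextRow(v)));
        }
    }
}
```

The only difference from a stateless row function is the sum field, which is kept outside the per-row call. To restart the sum per year, one would also have to remember the previous row's year and reset the field when it changes.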

Regards,

Thorsten

Hi Thorsten,

thor wrote:

That sounds very much like the joiner, except that you do not have unique keys... Anyway, this can quite easily be done inside a custom node using a hash, if you know Java. Nevertheless, it's amazing what one can do with KNIME ;)

It sure is! ;-)

thor wrote:

I see. That should work with the Jython node.
I just asked our scripting expert whether we can add a field for static variables to the dialog of the scripting nodes (there will be more than just Java in 2.0). That should be easily doable, and you would then be able to access these static variables inside the "row function" and, for example, sum up a column and return the current value for each row.

Sounds great! Even though I already said that: Can't wait! :-D

A shamelessly selfish question: Will KNIME 2.0 still be packaged in a single ZIP on Windows? I have an informal agreement with our IT architect that I can run free software as long as IT security agrees (they did of course agree to KNIME), but I can't run installers myself, which makes him and our IT support the (tight) bottlenecks for installer-based packages...

Thanks,
E.

Ergonomist wrote:
A shamelessly selfish question: Will KNIME 2.0 still be packaged in a single ZIP on Windows?

Yep.

Yay!

I <3 KNIME. :-D