Situation-related Column Filter based on a additional table

Hi,

I am facing a problem right now, for which someone maybe got an idea to help me :)

I have a table containing an articles title and some (up to 150; example 3) category rows (empty cells allowed):

Original Table
Title Category1 Category2 Category3
A X   Y
B   Y  
C Y XX YY
D YY X  
E X   XX
F YY   XX

The next table shows a reference table containing weighted values of each category.

Category Values
Category value
X 20
Y 10
XX 5
YY 1

Based on the category value I want to reduce the first (original) table to only show one category column containing the category with the highest value out of all in one row. Following the example it should look like this:

resulting table
Title 'Category
A X
B Y
C Y
D X
E X
F XX

If someone has an idea or a hint, I'd be glad to hear any suggestions. In the meantime I will work on that porblem. Once I found a good way, I will let you know.

Cheers,

Manu

Now I'm trying to follow following thread:

https://tech.knime.org/forum/knime-general/find-max-value-and-use-correspondingadjacent-cell

Before I can adopt this suggestion, first I need to do some kind of VLOOKUP using my value table and append the result to my original table (about another 150 columns).

From another thread (https://tech.knime.org/forum/knime-general/vlookup-excel-type-of-functionworkflow) I got the sense, that I could use the cell replacer node (either with append or replace column option), which should do the trick.

In this case I have the problem that I have to iterate through those original 150 columns, but I can only choose one particular column as "target column" within the cell replacer node. As suggested, I tryed to use the column list loop start node, but after two iterations it breaks down as it looks like its not working in combination with the cell replacer node.

Does anyone have any suggestion on that issue?

Thanks,

Manu

Alrighty,

I thought I found the correct soultion thanks to Gabriels last comment in this thread: https://tech.knime.org/forum/knime-general/recursive-loop

In order to force the File Reader being reset during the loop iterations, you need to connect the variable out-port from the Loop Start with the variable in-port of the File Reader. Those ports are only visible in the expert node and need to manually enabled using the node context menu.

 Applied to my specific problem it means to connect the variable out-port from the Column List Loop Start with the variable in-port of the Cell Replacer.

It looks like this solution won't work on the Cell Replacer as it still breaks down after the second iteration with the hint "No such Column" (first column to choose). After opening and applying the configurations (without changing anything) it automatically switchs to the next column which is parsed by the column list loop start as next column.

Any suggestions?

Thanks,

Manu

Hey Manu,

You need to configure the Cell Replacer with a Flow Variable to choose the right column. Otherwise it still tries to replace Category1 even if your iteration reached Category2.

Have a look at the wiki-page and on the example workflows on the EXAMPLE-server.

To get one column with the highest value there are several aggregation nodes to help you.

To get the name of this column is a little more difficult. It depends on your data: Are the weights unique? If yes, you can easily use the Cell Replacer with the Category and Value column exchanged. But if you have mutliple categories with the same weight, you would first need to define a strategy which column you would want to use, then we can think about a solution.

Best,
Ferry

Hi Ferry,

thank you for your help. You helped me to think a little further. I just had to set cell replacer node's correct flow variable inport "targetCol" to currentColumnName, which is passed on by the column list loop start node.

The weights are actually not unique at all. For now I haven't thought about a strategy to choose one category if the weight/value is equal. The category value list is actually a list containing about 288,000 categories and their weights/values are just the sum of how often the category appears in the original table (using the value counter node).

I will think about it and come back to that. For now you helped me a lot.

Cheers,

Manu