Compare all values of all columns, and if one value is repeated, the value in the newly created column will be "1"
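
For readers who want to see the rule outside of KNIME, here is a minimal Python sketch of one possible reading of it: each row gets "1" if any value appears more than once across its columns. The "0" for rows without a repeat, the skipping of missing cells, and the sample table are assumptions for illustration, not part of the original workflow.

```python
def flag_repeated(rows):
    """Return "1" for each row that contains a repeated value, else "0"."""
    flags = []
    for row in rows:
        # Assumption: missing cells (None) are ignored and never count as a repeat.
        values = [v for v in row if v is not None]
        # A set drops duplicates, so a shorter set means some value repeats.
        flags.append("1" if len(set(values)) < len(values) else "0")
    return flags


# Hypothetical sample table: each inner list is one row of string cells.
table = [
    ["a", "b", "c"],
    ["a", "b", "a"],
    ["x", None, None],
]
flags = flag_repeated(table)
print(flags)  # ['0', '1', '0']
```

The check works on whole rows at once, so in this form there is no need to transpose the table and visit one column per loop iteration.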

That’s what I have tried too, but the expression does not seem to be the right one!

:’(
And how will the column names change on each loop iteration (Row0,Row1,Row3,…) when I’m not even using the flow variable in this node?
@ipazin, please help :(
Br.

As I said, you should use column(0) and column(1)…
Ivan

Hey @ipazin,

I managed to solve the problem, thank you!

Now I have a problem in the Loop End node, and the error in the console is:

Maybe the problem is in the second Rule Engine node, which I configured this way:


Do I have to configure the flow variable again in this node? How do I do this, please?
Could you help me with this? It may be my last question! I don’t want to give up, because it is urgent for me to resolve this!

Thank you.

BR.

Have you used a Column Filter node to keep only the result column?
This error means you have different column types in different iterations…
Ivan

Ooh yes, sorry, I forgot it!

In fact, the Loop End stops at the second iteration, and the error now is:

Is that due to the second Rule Engine node, because I have not configured the flow variable in it? How do I do that?

Thank you VERY MUCH!

This error means you have different column types in different iterations…
Go iteration by iteration and compare the result of the first iteration with the result of the second one.
I don’t think it is due to the second Rule Engine.
Ivan
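
To illustrate what "different column types in different iterations" means, here is a small Python sketch (not KNIME code). It assumes each iteration's output is available as a dictionary from column name to cell values; the helper `find_type_conflicts` and the sample data are hypothetical.

```python
def find_type_conflicts(iterations):
    """Map each column name to the type names seen, keeping only conflicts."""
    seen = {}
    for table in iterations:
        for column, values in table.items():
            for value in values:
                seen.setdefault(column, set()).add(type(value).__name__)
    # A column is a conflict when more than one type showed up across iterations.
    return {col: types for col, types in seen.items() if len(types) > 1}


# Hypothetical outputs of two loop iterations: column name -> list of cell values.
iteration_1 = {"result": [0, 1, 0]}   # integers in the first iteration
iteration_2 = {"result": ["0", "1"]}  # strings in the second iteration
conflicts = find_type_conflicts([iteration_1, iteration_2])
print(conflicts)
```

Here the `result` column is an integer in the first iteration and a string in the second, which is the kind of mismatch to look for when inspecting the output of each iteration.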

Hi @ipazin,

Thank you very much for your help, I’m very grateful to you!

In fact, in the beginning I had 996233 rows; after the Transpose node, the loop has to process 996233 columns to check whether there is a repeated value. That took a lot of time (more than 2 days).
Is there any solution to make it faster? I have to work a lot with this workflow and present the result in a data report.
Thank you again.
Best regards :blush:
I wish you a very good day!

Hi @TIZIZ!

You are welcome :slight_smile: I’m glad you made it.

Some of these operations do take a bit longer. I’m not sure there is a faster way, but more than two days is a bit long! Can you share the workflow for me to check? How many columns do you have?

For execution time there is the Timer Info node, which reports the execution time of each node, so you can see where the time goes and optimize.

Br,
Ivan

Hi!
My data is confidential!
Thank you!

Br.

Hi!

Then don’t give me your data. Just export the workflow and I will add some random data to it. Here is how to export a workflow:

https://www.knime.com/knime-introductory-course/chapter1/import-export-workflows

How many columns do you have?

Br,
Ivan

Ok !
I have 99623 columns.

Here is a link to get my workflow:

https://framadrop.org/r/8S2vnsCQEl#9vz6JsZ1H6wlrJHhLaeTLvOfaPpROkUCWFKdmZlH0cw=
Please tell me when you have downloaded it!
Thank you :wink:!

I mean, how many columns does the original table have? It has 996233 rows and how many columns?
The workflow is no longer available at that link…
Ivan

https://framadrop.org/r/XtEk-jzaAZ#0ql6U02dRDdap+HaAXLzTa7Zt0r/bfJzbk11VzbVkiM=
Here is the right link.
Yes, I have 99623 columns and 231 rows after the Transpose node.

Br.

Hi!

I got it. I still don’t understand why it takes so much time. I tried with 1000 rows and 30 columns and it took around 2 minutes, so I’m not sure. Which part takes the longest? Use the Timer Info node.

A couple of observations:

  1. In the Missing Value node you don’t do anything; you should remove rows with missing values.
  2. In the GroupBy node, uncheck Missing.
  3. In the second Rule Engine node, your logic seems to be the opposite of what you wrote before. If your result column is 0 you write TRUE, but a result of zero actually means you do not have any repeated values.
  4. In the Loop End node, uncheck “Add iteration column”; you do not need it.

Br,
Ivan

Hi!

1- In the Missing Value node I do this:

That means removing the rows where the value is missing, doesn’t it?
2- I just did it (thank you).
3- Yes, I had to do the opposite.
4- I just did it now :wink: (afterwards I want to filter just the rows where the value is false, to build the data report, so it is the same logic :wink:).
5- For the Column Appender, is this the right configuration?
6- For the Timer Info node, do I have to link it to the Loop End, when it is not in the execution?
Thanks again :):smiley:

Br.

  1. I did it on the default tab, based on data type. Do you have only strings?
  2. You can use “Identical row keys…”
  3. I usually link it to the last node; this would be the Column Appender in your case.

Br,
Ivan

1- Yes, I have only strings.
3- Should I wait for the loop to finish its execution before linking the Column Appender to the Timer Info node?
Right now I am waiting for the execution to end (but it takes a lot of time to finish the 99623 columns :( ).
I tried to link it to the Timer Info node and it does not work!
I executed it without linking it, and this is the result:

PS: I will probably split the data and concatenate it afterwards (will that make it faster or not?)
:slight_smile:

1 - Then you can do it on the default tab: for the string data type, choose to remove the row if a value is missing. Otherwise you have to use a flow variable, which is OK as well if you wish.

I’m not sure whether separating and then concatenating would be faster. You can try it.

Here is a blog post about optimizing KNIME workflows. Maybe you can find something there.

https://www.knime.com/blog/optimizing-knime-workflows-for-performance

Br,
Ivan

Hi @TIZIZ!

Let’s do a little math: you need to go through the loop 99623 times. Let’s say one loop execution (one column) takes 1 second. That means the whole loop takes 99623 seconds to execute. Divide it by 60 and it is about 1660 minutes. Divide it by 60 again and you get that the execution lasts more than 27 hours :smiley:
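
The same estimate as a short Python sketch. The 1 second per iteration is the assumed figure from the paragraph above, and the second calculation simply reads the "more than 2 days" reported earlier in the thread as exactly 2 days to get a rough lower bound per iteration.

```python
iterations = 99623          # one loop iteration per column, as stated in the thread
seconds_per_iteration = 1   # assumed for illustration

total_seconds = iterations * seconds_per_iteration
total_minutes = total_seconds / 60
total_hours = total_minutes / 60
print(f"{total_seconds} s = {total_minutes:.0f} min = {total_hours:.1f} h")
# 99623 s = 1660 min = 27.7 h

# Working backwards: a run of 2 full days spread over the same number of iterations.
observed_seconds = 2 * 24 * 60 * 60
implied_seconds_per_iteration = observed_seconds / iterations
print(f"{implied_seconds_per_iteration:.2f} s per iteration")
# 1.73 s per iteration
```

So even a loop body that takes under two seconds adds up to days at this number of iterations, which is why the per-iteration cost is the thing to reduce.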

Here is a link to another forum thread about speeding up your workflow:

Lp,
Ivan

Thank you very much! I will give it a try and tell you the result :wink: