@armingrudd thanks a lot.
I don’t understand why transpose implementation is so heavy but now I want to find another solution.
Thanks for the answer!
@armingrudd thanks a lot.
If you want to use the Transpose node and the Reference Column Filter node, you can filter the first row of the table > transpose > row filter (filter a range of rows by number) > transpose > and finally use the reference column filter node. But why not filtering columns directly?
23302-1-3.knwf (35.6 KB)
Indeed, I was missing something. I blame lack of caffeine.
Here is my solution
Filter columns from n to m.knwf (18.2 KB)
glad to see this got you fired up but is there anyone with a feeling that writing regex, transposing table (twice!) and other interesting approaches to this fairly simple request is a bit too much?
Well, I’m curious if there is a more straightforward solution to this problem other than using a column filter and a simple regex (or even better: a dynamic regex!).
Looking forward to learning something new.
Thinking about Java development?
Of course using some code could be the solution, but I always think about pure KNIMEing… (if regex is counted!)
A few days not KNIMEing and then such a lot of answers! Thank you very much. All of them are (different) solutions. Cool, but I often wonder why such a simple task (a one-liner in most other languages, especially Mathematica), need such solutions. KNIME is mostly table based - such basic operations should be implemented directly, just my opinion.
@aworker: Tanks - works as it should and this “extract column header” avoids the really slow execution of the transpose node (I always wonder why is this simple operation so slow on larger tables - it is only re-indexing?)
Thank you @pigreco. works fine, I wrote in another reply: Still wondering about the slowness of the second transpose (about 45 seconds on a 6 Kernel MacMini - with my data 10000x250 table). The first transpose is fast enough.
@armingrudd: this is quite cool, to build the regex automatically - good for implementing in a parametrized workflow.
@HansS Ha! You are so right - didn’t see this. It is hard for my Mathematica-brain to “program” such simple tasks as a column selection.
@andrejz: Thanks to you, too. Also a quite cool solution - I learned something
Vielen Dank für Ihre freundliche Antwort.
The internal data structure of tables in KNIME has been optimized to work fast on column operations where the number of rows per column can be very high. This is why most operations in KNIME are column based and for instance operations between cells across columns and rows are often complicated. The way cells are linked each other in KNIME has to be completely re-structured when transposing a table. This is why it is slow at this task, merely because there is a lot of data structure reorganisation. This is not the case in Mathematica or in Matlab where the structure of numeric matrices almost match the organisation of data in RAM memory making the cell access and transposition of matrices very close to a low level memory or CPU operation. There are though tricks or shortcuts that can be sometimes applied to make matrix operations in KNIME faster without having to go for instance through a KNIME loop or java/Python/R node solution.
If you want to know more in detail on how KNIME handles table cells, one possibility is to have a look at how to create new KNIME nodes in java. How tables are precisely organised in KNIME can be understood from the java code :
Hope this sheds a bit of light.
Viel Glück und Grüße aus Toulouse
Vielen Dank - das ging ja schnell.
Thanks for this insights, which makes things clearer to me. I’m having a look at the implementation - but this is not my main interest. I’m using KNIME to do Machine Learning with my students and use it for me just to verify results I got with Mathematica (always better to have two (or more) sources). But the more one uses KNIME and one gets more familiar with it, it shows that this is really a valuable and mighty tool. Besides this it is quite easy to use, too - if you know theses tricks like in the answers here.
Herzliche Grüße aus Hanau
Indeed as in any language, knowing the tricks helps a lot . I fully agree with your comment about KNIME (specially for teaching) and having two or more sources. The nice extra thing of KNIME is its great community - people like @armingrudd always willing to help with really amazing tricks and workflows, as you have experienced here. KNIME summits in Berlin are also great to meet all these people !
I want to suggest another solution. With the Extract Table Spec node it is not necessary to do a Transpose first to select the column-names (saves a lot of time with a large dataset). You can directly make a selection of row (using Sample node or Rule Engine node or …).
Hi there everybody,
just to add info. Have added this functionality/request it into existing ticket (Internal reference: AP-14092) from this topic: Combined column cleaner/selecter tool [proposal]
@armingrudd this is what I had in mind
@HansS Thank you for this hint - looks good and is easier than before