speed/performance of KNIME with CSV files

Dear all,

I just started using KNIME Desktop to analyze comma-separated files. These files contain 1 Hz parameter recordings of a system.

The file size is 26 MB, containing 55 columns, of which about half are of type "double", half of type "integer", and a few of type "string". It contains 65,000 rows of these parameters.

I am using the "File Reader" Node, not the "CSV Reader" Node.

The only issue I currently have is the speed, or performance, of KNIME. The "Interactive Table" and "Line Plot" nodes are too slow to work with efficiently. It takes 5 seconds to sort one column ascending/descending. The Line Plot takes 10 seconds to open when 2,500 rows are to be displayed, and it stops responding entirely if I try to show all 65,000 rows. MS Excel or LibreOffice Calc handle this file a lot faster.

What am I doing wrong? I played with the "Keep all in Memory" option, but it does not make a difference. I tried running KNIME on both Windows and Linux, and both show the same issue.

Thank you very much!

 

I don't think you are doing anything wrong; it likely has to do with the performance of the Interactive Table and Line Plot nodes.

If a chart is needed, is it possible to reduce the number of rows by taking a moving average and sampling at regular intervals? It's not an ideal solution, but at least you get interactive visualization without resorting to external software (e.g., via Spotfire/Spotfire Node).

Hello shinwachi,

Thanks for your reply, and sorry for my late response!

The Line Plot node is usable below 10,000 data points. Above that, the R View nodes and the JFreeChart nodes from the corresponding KNIME extensions remain fast, user-friendly, and configurable. This is what I was looking for. Of course, as you suggest, it is also possible to use the Time Extract node and the Moving Average node to reduce the data to below 10,000 points so the Line Plot node can be used without performance issues.

Thanks for your help!