Data rows lost from source data? Looking at the Output table

Hi, I’m looking for some assistance on what I guess could be pretty basic.

In KNIME I’m reading in a collection of CSV files with data for a number of different years, 2008-2019. I read them with a loop that uses each file’s URL as a flow variable, so it runs through all the files in the folder I specified locally on my computer.

I know I have data files for 2008-2019, but when I look at the table result for that node, the Spec sheet shows data only for 2009, 2010, 2011, 2013 and 2019. Is this normal? Could there be a cap, say at 10 000 rows, that I am not aware of? The total row count is shown as close to 800 000. (I don’t know how many it should be in total.)

Besides reading in the files, the workflow extracts the year from the file names. As far as I can see in the model and the data in general, there is no reason for certain years to be omitted. But I have been wrong before. :slight_smile:

Thankful for help.

Hi @1up
Have you tried running the loop one step at a time? You can do this by resetting the loop start node and executing the Cross Joiner node (so do not execute the Loop End yet!). This starts the loop, which then waits once the Cross Joiner has finished.

If the files are in order of year, the Cross Joiner node should already show some data for 2008.

If the files are not sorted by year, you can then right-click on the Loop End and choose Step Loop Execution. The first time, the status of the Loop End node changes to “paused”. When you choose Step Loop Execution again, it performs a second run of the complete loop. This gives you another opportunity to see whether the loop does what you want/expect.

If you go through the loop one step at a time, does it give a clue why years are missing?
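
Since you don’t know how many rows to expect, an independent count outside KNIME might also help as a cross-check. Below is a minimal Python sketch, not part of the workflow itself. It assumes each file name contains a four-digit year, each file has a single header line, and the files are UTF-8 encoded. The function name `rows_per_year` and the `has_header` parameter are made up for illustration.

```python
import csv
import re
from collections import Counter
from pathlib import Path

# Assumption: the year appears in the file name as a standalone 4-digit number.
YEAR_PATTERN = re.compile(r"(?<!\d)(19|20)\d{2}(?!\d)")


def rows_per_year(folder, has_header=True):
    """Count CSV data rows per year, taking the year from each file name."""
    counts = Counter()
    for path in sorted(Path(folder).glob("*.csv")):
        match = YEAR_PATTERN.search(path.stem)
        year = match.group(0) if match else "unknown"
        with path.open(newline="", encoding="utf-8") as handle:
            n_rows = sum(1 for _ in csv.reader(handle))
        # Assumption: the first line of each file is a header, not data.
        if has_header and n_rows > 0:
            n_rows -= 1
        counts[year] += n_rows
    return dict(sorted(counts.items()))
```

Calling `rows_per_year` on your source folder returns a dictionary of row counts keyed by year. You could compare that with the years and the total row count shown in the Loop End output table.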


Thank you, this helped. It appears I hadn’t fully reset the loop section. Stepping through, it jumped between files instead of taking them in order. I used to have a smaller set of files in the same folder, then copied more files into it. So the loop was probably still set to include only the few files that were in the source folder earlier.

