I am struggling with the following task.
I have read a table from a PDF, and the result contains the data line by line, like this:
Legend number1 number2 percent
lalala 200 0 10,00 %
blabla 40 60 24,50 %
dumdum 30 300 22,22 %
something 70 5 99,00 %
My target is a table that contains all the legend info in the first column and the remaining data in separate columns, like this:
Legend | number1 | number2 | percent
lalala | 200 | 0 | 10,00 %
blabla | 40 | 60 | 24,50 %
dummy dumdum | 30 | 300 | 22,22 %
dada da something | 70 | 5 | 99,00 %
I can flag the lines that contain data with a regular expression and can also split the columns with a “Regex Split” node. Coming from SQL, I am probably blind to an elegant way of giving the rows that belong together a unique key so that I can then group them.
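To make the key idea concrete, here is a minimal Python sketch, not a finished solution. It assumes the header line has already been removed, that a data row ends with two integers and a percentage such as `10,00 %`, and that the raw lines contain "legend only" rows (for example `dummy` before `dumdum 30 300 22,22 %`, which is inferred from the target table). The names `DATA_ROW`, `assign_keys` and `combine` are made up for illustration. The key is simply a counter that increases after every data row, so each "legend only" row shares a key with the data row that follows it.

```python
import re
from itertools import groupby

# Assumption: a data row is "<legend> <int> <int> <number with comma> %".
DATA_ROW = re.compile(
    r"^(?P<legend>.+?)\s+(?P<n1>\d+)\s+(?P<n2>\d+)\s+(?P<pct>\d+,\d+ %)$"
)

def assign_keys(lines):
    """Pair each line with a group key; the key increases after every data row."""
    keyed, key = [], 0
    for line in lines:
        keyed.append((key, line))
        if DATA_ROW.match(line):
            key += 1  # the next line starts a new group
    return keyed

def combine(lines):
    """Group lines by key, join each group, and split it into four columns."""
    rows = []
    for _, group in groupby(assign_keys(lines), key=lambda pair: pair[0]):
        joined = " ".join(line for _, line in group)
        match = DATA_ROW.match(joined)
        if match:  # trailing legend-only lines without a data row are dropped
            rows.append(
                (match["legend"], match["n1"], match["n2"], match["pct"])
            )
    return rows

# Hypothetical raw lines, header already removed.
raw = [
    "lalala 200 0 10,00 %",
    "blabla 40 60 24,50 %",
    "dummy",
    "dumdum 30 300 22,22 %",
    "dada",
    "da",
    "something 70 5 99,00 %",
]
for row in combine(raw):
    print(" | ".join(row))
```

The same counter could in principle be built with a cumulative sum over the "is data row" flag, shifted by one row, which is closer to how one would think about it in SQL with a window function.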
So my idea was to loop over the table, join it with the next row of itself (skipping data rows joined with the following row), and always combine one “legend only” row with the following data row. I found no way of feeding the processed table into the loop again until no “legend only” row is left. (And I agree that this approach is not elegant either.)
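Written out in Python, the loop I have in mind would look roughly like the sketch below. It is only an illustration of the "feed the result back in" idea: the names `IS_DATA`, `merge_pass` and `merge_until_stable` are hypothetical, the header line is assumed to be removed, and a data row is assumed to end with two integers and a percentage such as `10,00 %`.

```python
import re

# Assumption: a data row ends with "<int> <int> <number with comma> %".
IS_DATA = re.compile(r"\d+\s+\d+\s+\d+,\d+ %$")

def merge_pass(lines):
    """One pass: join every legend-only row with the row that follows it."""
    out, i = [], 0
    while i < len(lines):
        is_legend_only = not IS_DATA.search(lines[i])
        if is_legend_only and i + 1 < len(lines):
            out.append(lines[i] + " " + lines[i + 1])
            i += 2  # the following row has been consumed
        else:
            out.append(lines[i])
            i += 1
    return out

def merge_until_stable(lines):
    """Feed the result back in until a pass no longer changes anything."""
    while True:
        merged = merge_pass(lines)
        if merged == lines:
            return lines
        lines = merged

# Hypothetical raw lines with legend text spread over several rows.
raw = [
    "blabla 40 60 24,50 %",
    "dada",
    "da",
    "something 70 5 99,00 %",
]
print(merge_until_stable(raw))
```

Here the first pass joins `dada` with `da`, the second pass joins that result with the `something` data row, and the third pass changes nothing, so the loop stops. A legend-only row at the very end of the table has nothing to merge with and is left as it is.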
Of course, my goal is to find a way to give the rows that belong together a unique key. But there must also be a way to loop over a table.
Thanks in advance for hints for both ways of solving this problem!