I need to read the red-framed values. In this example they are in rows 124-133, but in another file in the loop they might be in rows 119-128, and so on. The position differs somewhat from file to file.
Here is what the KNIME Hub says about the node:
@Dalmatino16 with the help of the Counter Generation node you could number the lines.
Then you could use the Rule Engine to store the line number in a separate column (var_start, var_end) whenever a certain condition is met. Then keep only those columns and use a math node to calculate the start and the length. Finally, convert those into flow variables that you can use to steer your file import.
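Outside of KNIME, the same idea (number the lines, remember where the block starts and ends, derive start and length) can be sketched in a few lines of Python. This is only an illustration, not the workflow itself: the function name `find_block` and the sample lines are made up, and the marker strings are the ones mentioned later in this thread.

```python
def find_block(lines, start_marker, end_marker):
    """Return (start, length) of the rows strictly between two marker lines.

    start is the 0-based index of the first data row, length the row count.
    """
    var_start = None
    var_end = None
    # enumerate() plays the role of the Counter Generation node
    for counter, line in enumerate(lines):
        if var_start is None and start_marker in line:
            var_start = counter
        elif var_start is not None and end_marker in line:
            var_end = counter
            break
    if var_start is None or var_end is None:
        raise ValueError("start or end marker not found")
    # the "math node" step: first data row and number of data rows
    start = var_start + 1
    length = var_end - var_start - 1
    return start, length


# Made-up sample; real files have many more header lines.
sample = [
    "header",
    "more header",
    "Median curve",
    "10.0 1.0",
    "20.0 2.0",
    "30.0 3.0",
    "End of results",
]
start, length = find_block(sample, "Median curve", "End of results")
block = sample[start:start + length]
```

Because the position is derived from the markers rather than hard-coded, it does not matter whether the block sits at rows 124-133 or 119-128 in a given file.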
A somewhat similar example:
Maybe you could upload 3 or 4 examples with varying table starts that represent your task. Then someone might have a look.
Thanks, I am going to try to follow the instructions you provided. In the meantime, here are several examples.
All I need to extract is the 10 values from the column between “Median curve” and “End of results” at the end of each file.
@Dalmatino16 so it seems the structure of the tables differs and the 3 samples do not represent the whole task. You could try to use part of the header as the column name in order to identify the columns other than by position. I will have to see if I can build something.
Hi, thanks. Looking at the files, I can't see what is different, but then again, I have 140 files, so I can't check them all by eye.
Can one tell from the workflow, or from the Loop End output I got, which files are problematic? It might be just one file out of 140.
Hi, attached is one file which doesn't work: Prosce_H.txt (7.2 KB). I can't figure out why. The structure looks the same as one of the files you worked on.
But here are the results of the last node that worked for the Prosce file
The issue with Prošće is that the values in the last row are below 100, which means they have fewer digits. Because the separator in your File Reader is a space, the extra spaces lead to more columns. To work around it you could add leading zeroes to every value under 100, but this is not really smart, as it would require manual intervention each time there isn't much rain, or when there is heavy rain and values go above 1000. So you can keep @mlauber71's approach but change the File Reader to read all data into a single column (set the separator to something that will never come up in the data, e.g. * or something more complex if you want to be sure) and then manipulate it with the String Manipulation node to get exactly one space between each value. Follow it with a Cell Splitter and you should be fine. As a bonus, you'll get your values formatted properly.
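For illustration, here is a minimal Python sketch of the "collapse the spaces, then split" step. It is not the KNIME nodes themselves; the helper name `split_row` and the sample rows are invented for the example.

```python
import re


def split_row(raw_line):
    """Collapse runs of whitespace to one space, then split into cells."""
    # Equivalent in spirit to String Manipulation followed by Cell Splitter
    collapsed = re.sub(r"\s+", " ", raw_line.strip())
    return collapsed.split(" ") if collapsed else []


# Made-up rows: the second one has shorter numbers, hence more padding.
rows = [
    "  1234.5   100.2    99.9",
    "   98.7    12.3     5.0",
]
cells = [split_row(r) for r in rows]
```

Both rows end up with the same number of cells, regardless of how many digits each value has.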
I have tried to work with your example and it worked for most of the stations. However, with some there are still problems. One problematic file is Djurmanec_H. It has a missing value,
which causes KNIME to shift the row to the left. The result looks like this:
That is less of a problem because there is only one such file out of 100. But I have a whole set of files which don't have such a case, and yet something still trips up the workflow and it doesn't work on them. I have attached one such file. I am going to try to figure out what the problem might be, but I am afraid I am not that skillful yet, so I would appreciate a hint if you can quickly see it. Bracak_Q.txt (7.3 KB)
Sorry for the late response. I don't know whether you have managed to solve it already.
Anyway, I have checked it out, and the problem with Bracak, or rather its difference from the other files, is the Percent Chance Exceedance column. It seems some files have a comma in this column and some a dot. You can solve it by putting a String to Number node after the Cell Splitter node and configuring it to always take the Column0_Arr column. Check the attached workflow. kn_forum_43962_file_reader_complex_format_ipazin.knwf (77.2 KB)
Note that if your input files are not consistent and the format can change over time or from person to person, these kinds of problems might come up on a regular basis.
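As a rough Python illustration of handling both decimal separators in one column (the helper name `to_number` and the sample values are made up, and the sketch assumes the column never contains thousands separators):

```python
def to_number(text):
    """Parse a number written with either a decimal comma or a decimal dot."""
    # Assumption: a comma is always a decimal separator, never a thousands mark
    return float(text.strip().replace(",", "."))


# Made-up values in the style of a percent column
values = [to_number(v) for v in ["99,5", "99.5", " 0,2 "]]
```

Normalising the separator before the conversion means files written with either convention yield the same numbers.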
Hi Ivan!
Sorry for my late response as well, I didn't have access to my account. But yes, you have resolved all the issues for me with the latest workflow. Thanks a lot. I am aware that a change of format can cause an issue. I just have trouble noticing what is different between the formats; sometimes the change is subtle. Can you suggest any software that can screen or compare formats?