File Reader (delimiter is sometimes also used in text fields)

I have been struggling for a long time to read fixed-width files that use ‘|’ as a column delimiter in addition to fixed field widths. Most files can be read easily with the File Reader node, but sometimes the delimiter also appears inside text fields, which leads to read errors. I have been able to read these files using recursive loops and also via a Java Snippet, but both solutions are too slow for large files (more than 6 hours for one file).

I think the best solution would be to use the regular Fixed Width File Reader node. However, this node does not include an ‘automated best guess’ configuration option, and with more than 100 fields, configuring it by hand is a lot of work. Therefore, I am looking for a way to set the column widths via ‘Flow Variables’.

  • How can I push the configuration settings of the Fixed Width File Reader node (e.g. field name, width, and type) via the Variable Port?

Kind regards,

Eddy 

Hi Eddy,

it is not possible at the moment to pass array flow variables to a KNIME node, hence you cannot make the Fixed Width File Reader node adapt to different cases in this way.

What you could do, though, is read in each line of your text file as a single string, without considering any separator, and then process each line using String Manipulation, RegEx splits and rules to extract your fields. This gives you more flexibility to handle special cases, such as when your separator appears inside a string.
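To illustrate the idea outside of KNIME, here is a minimal Python sketch of slicing each raw line by position instead of splitting on the separator. The field names, widths and types in `FIELD_SPEC`, the sample line, and the assumption that exactly one ‘|’ sits between fields are all made up for the example and would need to be replaced with the real layout of the file.

```python
# Hypothetical field layout: (name, width, converter).
# These names and widths are illustrative only, not taken from the real file.
FIELD_SPEC = [
    ("id", 5, int),
    ("comment", 12, str),
    ("amount", 8, float),
]


def parse_line(line, spec=FIELD_SPEC, sep_width=1):
    """Slice one raw line by fixed widths, skipping the separator between fields.

    Assumes every field has exactly its declared width and that a separator
    of `sep_width` characters sits between consecutive fields.
    """
    record = {}
    pos = 0
    for name, width, convert in spec:
        text = line[pos:pos + width].strip()
        # Empty fields become None instead of failing in int()/float().
        record[name] = convert(text) if text else None
        pos += width + sep_width
    return record


# Sample line whose text field contains the separator character itself.
sample = "00042|see a|b here|  123.50"

# Splitting on '|' yields four pieces for a three-field line.
naive_parts = sample.split("|")

# Slicing by position keeps the embedded '|' inside the text field.
record = parse_line(sample)
```

Because the slicing relies only on character positions, a ‘|’ inside a text field does not shift the following columns. With more than 100 fields, the spec could be kept in a small table (name, width, type) and loaded from there rather than typed into the code.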

Hope this helps. Feel free to share here some examples of your file structure to get additional ideas on how to solve your use case.

Cheers,
Marco.

I am going to try this solution. I love your innovation!