Hi everyone,

I have a few thousand EDIFACT files that I need to parse. For those who don’t know what they look like: they consist of segments identified by 3-character tags (like DTM for date and time, NAD for name and address, etc.). Within a segment, the data elements are separated by plus signs and colons (the meaning of each position is defined by the segment), and each segment ends with a single quote mark that separates it from the next one.

The raw data looks like this:

UNB+UNOC:3+CUSTOMER+SUPPLIER+220206:0255+000000416'UNH+1+DELFOR:D:04A:UN:GAVB10'BGM+241::6:ANY+000000416+9'DTM+137:20220206:102'DTM+2'NAD+BY+BUYER::92'NAD+SE+SELLER CODE::92++SELLER NAME+SELLER STREET+SELLER CITY++POSTCODE+COUNTRY'...

After segmenting it for better readability:

[image]

I would like to turn a few segment names into column headers and extract part of each segment's content as table rows, like this:

BGM | DTM | NAD_BY | NAD_SE
000000416 | 20220206 | BUYER CODE | SELLER CODE

I used the file reader to separate the segments into columns by using the quote mark as the column delimiter, but I’m stuck now. How do I get the segment names as headers and retrieve specific parts of the segments as rows?

Any help is highly appreciated. Thanks.

Parsing EDIFACT files

mlauber71 February 8, 2022, 12:32pm 7

@gentile I could try and use the Python package on it and see if I can find a working example.

About KNIME and Python: if you add Python to your KNIME setup, it will greatly enhance your capabilities.

It might be a bit of a challenge at first, but once you have it set up, it opens up the world of Python for you.
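To give a rough idea of what such a Python step could look like, here is a minimal sketch that uses only the standard library, not the Python package mentioned above. It rests on several assumptions: the files use the default separators seen in the sample (`'` between segments, `+` between data elements, `:` between components), there is no UNA header that overrides them, and the release character `?` does not occur in the data. The function names `parse_segments` and `extract_row` are made up for this illustration, and picking the first matching BGM, DTM and NAD occurrence is also just a guess at what is wanted.

```python
# Sample taken from the question above (the part before the "...").
RAW = (
    "UNB+UNOC:3+CUSTOMER+SUPPLIER+220206:0255+000000416'"
    "UNH+1+DELFOR:D:04A:UN:GAVB10'"
    "BGM+241::6:ANY+000000416+9'"
    "DTM+137:20220206:102'"
    "DTM+2'"
    "NAD+BY+BUYER::92'"
    "NAD+SE+SELLER CODE::92++SELLER NAME+SELLER STREET+SELLER CITY++POSTCODE+COUNTRY'"
)


def parse_segments(raw):
    """Split raw EDIFACT text into segments.

    Each segment becomes a list of data elements, and each data element
    becomes a list of components. Assumes default separators and no
    release character (hypothetical simplification).
    """
    segments = []
    for seg in raw.split("'"):
        seg = seg.strip()  # tolerate line breaks between segments
        if not seg:
            continue
        segments.append([element.split(":") for element in seg.split("+")])
    return segments


def extract_row(segments):
    """Build one table row (a dict) from the segments of one file.

    Keeps only the first value found per column; later occurrences
    of the same segment are ignored.
    """
    row = {}
    for seg in segments:
        tag = seg[0][0]
        if tag == "BGM" and len(seg) > 2:
            # second data element after the tag = document number
            row.setdefault("BGM", seg[2][0])
        elif tag == "DTM" and len(seg) > 1 and len(seg[1]) > 1:
            # second component of the first data element = date value
            row.setdefault("DTM", seg[1][1])
        elif tag == "NAD" and len(seg) > 2:
            # qualifier (BY, SE, ...) goes into the column name
            row.setdefault("NAD_" + seg[1][0], seg[2][0])
    return row


row = extract_row(parse_segments(RAW))
print(row)
```

For the sample above this yields the columns BGM, DTM, NAD_BY and NAD_SE with the values 000000416, 20220206, BUYER and SELLER CODE. For a few thousand files, the same two functions could be called once per file and the resulting dicts collected into a list, which can then be written out with `csv.DictWriter` or turned into a table.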