Method To Extract Text From Row And Append To Previous Row?

On occasion I need to export a file from SAP, with a fixed line length of 256 characters. Unfortunately, each line written to the file is always longer than 256 characters, causing text overflow into the next line. Something like this:

=====
Row 1: FIELD 1 FIELD 2 FIELD 3 FIELD 4 FIEL
Row 2: D 5
Row 3: Val1 Val2 Val3 Val4 Val5
Row 4: 5
Row 5: Val1a Val2b Val3c Val4d Val5b
Row 6: 5b

Is there a way to have Knime extract the value in every other row and append the value to the end of the previous row, then eliminate each empty row? Output would take the above and transform it to something like this:

=====
Row 1: FIELD 1 FIELD 2 FIELD 3 FIELD 4 FIELD 5
Row 2: Val1 Val2 Val3 Val4 Val5
Row 3: Val1a Val2b Val3c Val4d Val5b

Hi there!

Take a look at Column Expression node or String Manipulation node. There are join operations for strings. Afterwards filter unwanted column(s).

Br,
Ivan

Thanks ipazin.

I’m fairly new to Knime and a poor programmer [part of the reason I love Knime is it’s visually intuitive]; however, I can reverse engineer things ok :slight_smile: To that end, I’m hoping the community can share some example workflows after looking through the attached files, “JackedTextFile.txt” as input and “FixedTextFile.txt” as expected output.

In the mean time, I’ll play around with the string manipulation node and see if I can get it to extract, append on previous row, loop…

FixedTextFile.txt (99 Bytes)
JackedTextFile.txt (395 Bytes)

Hi @artwebb!

Don’t worry. I like it too :slight_smile:

Instead of rows I was thinking on columns in my first post so ignore it. At least String Manipulation node :slight_smile:

Your input is a bit mixed - you have spaces and tabs so it is hard to import it into Knime. Create a more representative input for your test.

I have created a workflow as an example (Read previous row and join.knwf (14.6 KB)). There is a couple more thing to do for sure. Take a look. There is description under each node to understand what is the idea. I use Column Expression node which is an extension and needs to be installed. (https://www.knime.com/knime-introductory-course/chapter1/section1/install-knime-extensions)

To import workflow into knime see this:

Br,
Ivan

Thanks a lot for the help @ipazin ! Sorry about the mixed input. I quickly drafted those two text files and seem to have gotten a bit sloppy. Wish I could have provided some actual samples; however, they contain PII and no easy way to replace or obfuscate sensitive data.

Hoping I get a chance to run through this in the coming day(s) and will circle back with questions/feedback.

2 Likes

Hm, either I’m doing it wrong or this is more complicated than I thought. looking into python code might also be an alternative solution.

Here is my solution using the Lag Column Node. But that alone isn’t enough. Also this will only work if there are at maximum 2 rows in the file per database row. If a row takes up 3 rows, it will fail.

fixed length file.knwf (21.4 KB)

Thanks @beginner! I have a collogue who wrote a Python script to accomplish this; however, I always seem to have something in my system/setup that causes errors with Python, R, etc, and thought Knime would be a great alternative. My preference is to avoid writing code and instead, make use of node functions, as they’re more intuitive [at least for me].

That said, I’m not getting this flow to work either. Hoping to re-write the manual steps I take to clean these data this weekend. Perhaps that will help folks better understand exactly what I’m looking for. Stay tuned!