Multiple CSV files within a Single Datafile

smcleod · October 14, 2017, 4:05am

Folks I need a concept/idea to start out a project. A software tool my company uses produces a "backup" file in a very unique form. It generates multiple CSV formatted files inside a single source datafile. Each CSV is separated by the original table name within brackets and a blank line. Then the column headers followed by the data lines. This one file has about 100 tables defined in the single source file.

I'm looking for a way to read the source file and break it down into one file physical for each table so I can process them with CSV readers. An example of the source file looks something like this:

[ACCOUNTS]

"NAME","PHONE","ADDRESSS"...

"ATT","555-123-456","1234 MAIN STREET",...

"SPRINT","555-123-456","456 MAIN STREET",...

"VERIZON","555-123-456","1234 MAIN STREET",...

[CONTACTS]

"USERNAME","USERTYPE"

"SCOTT SNITH","GENERAL USER"

"BILL JOHNSON","GENERAL USER"

"JILL HOLT","ADMIN USER"

rchar01 · October 14, 2017, 6:34pm

Hello,
I attached my solution (workflow.svg):
test.csv - input data
ACCOUNTS.csv - loop output (iteration 0)
CONTACTS.csv - loop output (iteration 1)

Workflow loops through all parts of input file and creates separate files for each ([header] + data), description:
node 126 (126.png) - imports all data into one column (no blank rows)
node 127 (127.png) - creates column (output file name) only with header data, regexp = "^\\[(.*)\\]$" matches only string which starts with "[" and ends with "]".
node 128 - fills empty rows in new column (output file name) with data from previous row (string=previous value)
node 130 - group by new column (output file name)
node 134 (134.png) - creates output file path
node 131 - filters only data column

Best,
Robert

mauuuuu5 · October 15, 2017, 12:25am

Thank you Robert for the workflow, but I replicated it and the workflow writes the files in a unknown folder.

I beg you if could you please check the attached the workflow

Best regards

19_Loop_over_files_to_read.knwf

rchar01 · October 15, 2017, 11:57am

my remarks:
node 2 - don't replace data column, create new one
node 3 - choose 'previous value' option
node 4 - include only new column (group by)
node 7 - choose flow variable (outputfilupath) as your output path

workflow in attachment.

Robert

19_Loop_over_files_to_read.knwf