Automatically accessing multiple folders

alexgruber · April 15, 2020, 8:59am

Hello to the forum,
first of all, thanks for your support in general. Your answers to some questions have really helped me out in the past.

Currently I have a specific problem for which I can't find a solution yet.

I have 15 different local directories in my workspace. Each of them contains 5 CSV data files covering a 6-week time window. These 5 CSV data files are each structured differently from one another.

Directory 1 (1st 6 weeks):

Data set A
Data set B
…
Data set E

Directory 2 (2nd 6 weeks):

Data set A
Data set B
…
Data set E

Every 6 weeks my customer sends me these 5 CSV data sets again with the new values, because of limited storage space. Data set A has an identical structure in every time window. The same holds for sets B, C, D and E.

The 5 different CSV data sets go into an ETL workflow (already created). The result is one joined CSV file for one time window of six weeks.

My objective is for KNIME to automatically take the 5 data sets from the first directory, feed them into the ETL workflow, and generate the first joined file. KNIME should then do the same for the 14+ other directories with the identical workflow.

At the end I have 15+ joined files, which will be concatenated into one master file afterwards for a visualization.

I'm not quite sure how to handle the whole problem in general. Hopefully I have explained the problem in a way that is understandable for you.

Many thanks in advance.
Alex

izaychik63 · April 15, 2020, 2:01pm

Do you use a database for storing the data? The best approach is to load everything into a DB and visualize from there. No folder jumps or file joins needed. Consider SQLite or H2. Look for examples of how to use them.
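
To illustrate the database idea outside KNIME, here is a minimal Python sketch using the standard-library `sqlite3` module. It appends one CSV file to a table and tags every row with its time window, so all 6-week deliveries of data set A end up in one table. The function name `load_csv_into_table`, the table name and the `time_window` column are assumptions made up for this example, and it assumes every CSV file starts with a header row.

```python
import csv
import sqlite3
from pathlib import Path


def load_csv_into_table(connection, csv_path, table, time_window):
    """Append one CSV file to `table`, creating the table on first use.

    Assumes the first CSV row is a header and that `table` and the column
    names come from a trusted source (they are pasted into the SQL text).
    """
    with Path(csv_path).open(newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        # Extra column so rows from different 6-week windows stay distinguishable.
        columns = ["time_window"] + header
        quoted = ", ".join(f'"{name}"' for name in columns)
        connection.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({quoted})')
        placeholders = ", ".join("?" for _ in columns)
        connection.executemany(
            f'INSERT INTO "{table}" ({quoted}) VALUES ({placeholders})',
            ([time_window] + row for row in reader),
        )
    connection.commit()
```

Calling this once per file and per directory (for example with `sqlite3.connect("master.db")` as the connection) would give one table per data set, which a visualization tool could then query directly.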

alexgruber · April 16, 2020, 7:09am

Thanks for your answer @izaychik63.

Using a database is certainly the most sustainable solution.
But is what I described in my question feasible in any way without a database?

Many thanks,
Alex

ipazin · April 21, 2020, 3:16pm

Hi there @alexgruber,

I think it is feasible. I don't know your exact folder structure, but if you have all 15 directories under the same folder you can do the following:

use the List Remote Files node to catch all 15 directories
from there, use Table Row To Variable Loop Start to process each directory in one loop iteration
then use the List Files node to grab the 5 CSV files from the current directory
do the ETL (either as a Metanode/Component or by calling another workflow)
end the loop with an appropriate loop end node

Something like this:

[Workflow screenshot: ETLs]
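
For readers who think more easily in code, the same loop can be sketched in plain Python with only the standard library. This is not the KNIME workflow itself. `run_etl` is a hypothetical placeholder that just stacks the rows, because the real ETL is not shown in this thread, and the names `process_all`, `write_csv`, `source_file` and `time_window` are likewise assumptions for illustration. The output folder is assumed to be separate from the 15 data directories.

```python
import csv
from pathlib import Path


def run_etl(csv_paths):
    """Placeholder ETL: stacks all rows and tags each with its source file.

    The real ETL lives in the existing KNIME workflow and is not shown here.
    """
    rows = []
    for path in csv_paths:
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                row["source_file"] = path.name
                rows.append(row)
    return rows


def write_csv(rows, target):
    """Write a list of dicts to CSV, using the union of all keys as header."""
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    with target.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)


def process_all(parent, output_dir):
    """Run the ETL once per sub-directory, then build one master file."""
    parent, output_dir = Path(parent), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    master_rows = []
    # List the directories and loop over them, one per iteration.
    directories = sorted(
        p for p in parent.iterdir()
        if p.is_dir() and p.resolve() != output_dir.resolve()
    )
    for directory in directories:
        # Grab the CSV files of the current directory.
        csv_paths = sorted(directory.glob("*.csv"))
        # Run the (placeholder) ETL and remember which window the rows belong to.
        joined = run_etl(csv_paths)
        for row in joined:
            row["time_window"] = directory.name
        write_csv(joined, output_dir / f"{directory.name}_joined.csv")
        # Collect the per-directory results, as a loop end node would.
        master_rows.extend(joined)
    write_csv(master_rows, output_dir / "master.csv")
    return master_rows
```

Because the loop discovers the directories at run time, a 16th delivery would be picked up without changing anything, which is the same benefit the loop nodes give in KNIME.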

Hope this helps!

Br,
Ivan

alexgruber · April 23, 2020, 7:34am

Hi @ipazin,

thanks a lot for your answer. I have already solved it with this approach. It works perfectly.

Best regards,
Alex

ipazin · April 23, 2020, 1:50pm

Hi @alexgruber,

glad to hear that!

Br,
Ivan
