Conditional Execution

Subramanyam · May 12, 2025, 8:04am

Hi Team,

I want to compare the columns of two Excel files.
If the column names in both files are the same, I need to import the data from the first file and then proceed with further transformations on that data.

Please suggest a workflow, as I am new to KNIME.

Thanks,

rfeigel · May 13, 2025, 1:40am

Could you provide more detail about how your data is configured?

Do you have two separate workbooks?
How many columns are in each of the worksheets that you want to compare?
If more than one column in each worksheet, must all of the columns match?
It would be helpful if you could upload some sample data.

Subramanyam · May 13, 2025, 8:41am

Hi @rfeigel,

All columns in File2 have to match the File1 columns.
If columns of File2 = columns of File1, then

read the data from File1 and perform the other operations on the File1 data.

I cannot provide the sample data.

Thanks,

iCFO · May 13, 2025, 11:37am

Are there just 2 files to compare, and if they don’t match the workflow stops? Or could there be a large number of Excel files in the folder that need to be tested and matched against each other?

For large files, or a large number of files, the basic approach I would probably take is to read in the files with the Excel Reader set to read only the headers of all columns (perhaps with a few clean-up nodes if you need to make sure that the headers are on the same row as usual), then use the Table Difference Finder to test for differences, then a CASE Switch to trigger the rest of the workflow if there is a match. On a match, the workflow would read in the full Excel files and handle the ETL. This could be performed in a loop if you have a large number of files to test.

If you are just testing 2 small Excel files, it is easier to read the full data of both files, use the column name extractor to get just the headers, compare them with the Table Difference Finder, and feed the result to a CASE Switch, so that downstream processing continues only when the match condition is met.
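For anyone who wants to see the comparison logic spelled out, here is a rough sketch in plain Python rather than KNIME nodes. It assumes the header rows have already been read into lists; the function names and the sample headers are made up for illustration and are not part of any KNIME API.

```python
def headers_match(headers_a, headers_b, ignore_order=False):
    """Return True when both header lists contain the same column names."""
    if ignore_order:
        # Same names, any position
        return sorted(headers_a) == sorted(headers_b)
    # Same names in the same positions
    return list(headers_a) == list(headers_b)


def header_differences(headers_a, headers_b):
    """Return the names that appear in only one of the two header lists."""
    only_a = [h for h in headers_a if h not in headers_b]
    only_b = [h for h in headers_b if h not in headers_a]
    return only_a, only_b


# Hypothetical header rows standing in for File1 and File2
file1_headers = ["EmployeeID", "Name", "Dept"]
file2_headers = ["EmployeeID", "Name", "Dept"]

# The "switch": only continue with File1 when the headers match
if headers_match(file1_headers, file2_headers):
    status = "match: read File1 in full and continue with ETL"
else:
    status = "mismatch: stop"
print(status)
```

Whether column order should count as part of the match is a choice you have to make; the `ignore_order` flag is only there to make that choice explicit.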

Subramanyam · May 13, 2025, 11:51am

Hi @iCFO,

There are multiple files which need to be verified one by one, then read, and then put through ETL if the condition matches.

rfeigel · May 13, 2025, 1:57pm

If you have 10 files, what would you expect the workflow to do?

iCFO · May 13, 2025, 3:32pm

I am thinking that the best way is probably to read in the header rows of all of the Excel files in your target folder using the Excel Reader, along with the Excel Reader option of appending the file name / location as a new column. That should concatenate them into a single table for you, which will allow you to match them as rows in 1 easily reviewable step (perhaps with the duplicate row finder, ignoring the file info columns), and provide the matched files / locations to pass to the rest of the workflow. I am on a time-crunch work project for the next few days, but might be able to mock up a workflow later in the week if no one gets to it first.
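To illustrate the idea of matching header rows across many files, here is a small sketch in plain Python, not a KNIME workflow. It assumes the headers have already been collected into a dictionary keyed by file path; the file names and function names are hypothetical.

```python
from collections import defaultdict


def group_files_by_header(headers_by_file):
    """Group file paths by their exact header row (names and order)."""
    groups = defaultdict(list)
    for path, headers in headers_by_file.items():
        groups[tuple(headers)].append(path)
    return dict(groups)


def files_matching_reference(headers_by_file, reference_path):
    """Return every file whose header row equals the reference file's."""
    reference = tuple(headers_by_file[reference_path])
    return [
        path
        for path, headers in headers_by_file.items()
        if tuple(headers) == reference
    ]


# Hypothetical header rows, one entry per Excel file in the folder
headers_by_file = {
    "file1.xlsx": ["EmployeeID", "Name", "Dept"],
    "file2.xlsx": ["EmployeeID", "Name", "Dept"],
    "file3.xlsx": ["EmployeeID", "Name", "Salary"],
}

groups = group_files_by_header(headers_by_file)
matched = files_matching_reference(headers_by_file, "file1.xlsx")
print(matched)
```

The list of matched paths plays the role of the "matched files / locations" that would be passed on to the rest of the workflow.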

iCFO · May 13, 2025, 4:36pm

You should be able to read in the headers of all files with just the Excel Reader if you set the column names by column position (as numbers or names) instead of targeting a row that contains the column names to be used as KNIME headers.

Also make sure not to skip empty or hidden columns if you are trying to match the structures exactly. Uncheck “Fail if schemas differ between multiple files” as well, to help avoid issues and to see the output as you adjust the settings in the dialog.

prashant7526 · May 20, 2025, 10:26am

Hi @Subramanyam,

Let's get back to your two-file solution. I did just this a few weeks ago.

When you have 2 files:

Is it mandatory that both files have the same column names? If not, do you want to extract only the data of those columns that match in both files, discarding the other columns?
Could the two files have different column names for the same field? For example:
EmployeeID in the first file should match EmployeeNumber in the second file. Is this scenario possible as well (when you say the columns of both files should match)?

I understand you cannot share the data. But can you share the headers (without any data) from two files as an example and explain?
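To make the two scenarios above concrete, here is a minimal sketch in plain Python. It assumes the headers are available as lists; the mapping, the sample headers, and the function names are hypothetical and only meant to show the difference between keeping shared columns and renaming equivalent ones before comparing.

```python
def shared_columns(headers_a, headers_b):
    """Return the columns of headers_a that also exist in headers_b."""
    available = set(headers_b)
    return [h for h in headers_a if h in available]


def rename_headers(headers, mapping):
    """Replace known alternative names with their agreed name."""
    return [mapping.get(h, h) for h in headers]


# Hypothetical headers: the second file uses a different name for the ID
file1_headers = ["EmployeeID", "Name", "Salary"]
file2_headers = ["EmployeeNumber", "Name", "Dept"]

# Hypothetical mapping from File2 names to File1 names
mapping = {"EmployeeNumber": "EmployeeID"}

normalized = rename_headers(file2_headers, mapping)
common = shared_columns(file1_headers, normalized)
print(common)  # ['EmployeeID', 'Name']
```

Without the mapping, only `Name` would count as a shared column, which is why it matters whether differently named columns are supposed to match.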