Use R to list Excel sheet names, extract the data and keep only columns that are present in all sheets

This workflow uses the R package readxl to list all sheets of the Excel files in a folder, read their columns and guess each column's type. In the end, only the columns and data that are present in all files are kept.

I built a solution; you may want to check whether it works for you. With R, I go through all the sheets of the Excel files in a folder. The sheets are imported and passed back into KNIME, and each column's type is guessed from the first 50k lines. Then I look for the combinations of column name and type that occur most often (here: every time, but you might adapt that), and only those are kept. Initially, however, all the data is loaded into KNIME, so you can still use it later. The file name and sheet name are stored for later use.
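For readers who want to try the idea outside the workflow, here is a minimal sketch in plain R. It is not the code from the workflow itself. The helper names (`read_all_sheets`, `column_signatures`, `keep_common_columns`) and the columns `source_file` and `source_sheet` are made up for illustration, and the sketch assumes the strictest rule, namely that a name/type combination must occur in every sheet. It needs the readxl package to be installed.

```r
# Sketch only: helper names and the "must occur in every sheet" rule are assumptions.

# Read every sheet of every Excel file in a folder into a named list of data frames.
read_all_sheets <- function(folder, guess_max = 50000) {
  files <- list.files(folder, pattern = "\\.xlsx?$",
                      full.names = TRUE, ignore.case = TRUE)
  out <- list()
  for (f in files) {
    for (s in readxl::excel_sheets(f)) {
      # guess_max = number of rows readxl inspects to guess the column types
      df <- as.data.frame(readxl::read_excel(f, sheet = s, guess_max = guess_max))
      out[[paste(basename(f), s, sep = "::")]] <- df
    }
  }
  out
}

# Describe each column as "name|type", e.g. "amount|numeric".
column_signatures <- function(df) {
  types <- vapply(df, function(col) class(col)[1], character(1))
  paste(names(df), types, sep = "|")
}

# Keep only the name/type combinations found in every table, then stack the rows.
keep_common_columns <- function(tables) {
  common <- Reduce(intersect, lapply(tables, column_signatures))
  common_names <- sub("\\|[^|]*$", "", common)  # strip the "|type" suffix
  combined <- lapply(names(tables), function(key) {
    df <- tables[[key]][, common_names, drop = FALSE]
    # Keep file name and sheet name for later use (keys look like "file::sheet").
    parts <- strsplit(key, "::", fixed = TRUE)[[1]]
    df$source_file <- rep(parts[1], nrow(df))
    df$source_sheet <- rep(parts[2], nrow(df))
    df
  })
  result <- do.call(rbind, combined)
  rownames(result) <- NULL
  result
}
```

A call could look like `keep_common_columns(read_all_sheets("path/to/folder"))`, where the folder path is a placeholder. To relax the rule from "in every sheet" to "in most sheets", one option would be to count the signatures with `table()` and keep those above a chosen threshold.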

This is a companion discussion topic for the original entry at https://kni.me/w/doV7D-HJPJDb2VDe