I have file 1 with 3 columns and file 2 with 1 column. The task is to check whether the values in column_1 of file 1 exist in file 2.
I have implemented this with the Python node (2=>1), which lets me read the data from both input files and perform the operation.
I would like to know if there is any Java node that will allow me to read 2 separate files as input. Please assist.
Not that I know of. You need to use one reader node for each file and then either join the tables on the column of interest using the Joiner node, or go with the Reference Row Filter / Reference Row Splitter node to see which values are present or missing.
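If it helps to think of it in Python terms, here is a rough sketch of what the Reference Row Splitter idea amounts to. This is only an analogy in plain Python, not KNIME code, and the sample rows and values are made up for illustration.

```python
# Hypothetical sample rows standing in for file 1 (three columns)
# and file 2 (one column); the values are invented.
file_1 = [
    ("A", 1, "x"),
    ("B", 2, "y"),
    (None, 3, "z"),
    ("A", 4, "w"),
]
file_2 = ["A", "C"]

# Build the lookup once; membership checks against a set are fast.
lookup = set(file_2)

# Split file 1 by whether its first column occurs in file 2.
found = [row for row in file_1 if row[0] in lookup]
missing = [row for row in file_1 if row[0] not in lookup]
```

Here found holds the two "A" rows and missing holds the "B" row and the row with the null value, so duplicates stay intact on whichever side they land.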
I have never tried the Reference Row Filter and Reference Row Splitter. I’ll try it out.
For now, I convert the column data from one of the files into flow variables to use as lookup values and compare them against the column from the second file. The processing times are really high because of that implementation. Hopefully these 2 nodes will reduce them.
If I got it right, you are currently using a loop, which is not efficient since each value is one iteration. The mentioned nodes will definitely reduce the processing time. If you have trouble implementing them, you can share dummy input data and the desired output and I can create an example workflow.
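To give a feel for why one iteration per value gets expensive, here is a small plain-Python sketch that counts the work in both styles. The sizes and values are hypothetical, and it says nothing about KNIME's own per-iteration overhead; it only shows how the number of comparisons scales.

```python
# Hypothetical sizes, chosen only to make the difference visible.
rows = [f"id_{i % 500}" for i in range(10_000)]   # column to check
lookup_values = [f"id_{i}" for i in range(300)]   # lookup list

# Loop style: one full pass over the rows for every lookup value.
loop_comparisons = 0
loop_hits = set()
for value in lookup_values:
    for i, cell in enumerate(rows):
        loop_comparisons += 1
        if cell == value:
            loop_hits.add(i)

# Set style: build the lookup once, then a single pass over the rows.
lookup = set(lookup_values)
set_hits = {i for i, cell in enumerate(rows) if cell in lookup}
set_checks = len(rows)

print(loop_comparisons, set_checks, loop_hits == set_hits)
```

Both styles flag the same rows, but the loop style does (number of lookup values) x (number of rows) comparisons, while the set style does one check per row.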
I have attached the workflow with sample data.
File_1 has over 100,000 records and contains null values and duplicates.
File_2:LOOKUP can have more than 3,000 records of unique values.
I do not want to remove the duplicates from File_1 as they are connected to other columns.
I hope this test data helps, and I look forward to your solution.
So you can go with the Cell Replacer, Joiner, or Reference Row Splitter node, depending on what kind of output you want. See the attached example, and if you have any questions feel free to ask.
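If the output you want is a single table with every row kept, the idea is roughly the one below. Again this is a plain-Python analogy rather than anything from the attached workflow, and the sample rows are invented: every row of file 1 is kept, duplicates and nulls included, with a flag appended saying whether the first column was found in the lookup.

```python
# Hypothetical sample data; None stands in for a null value.
file_1 = [("A", 1, "x"), ("B", 2, "y"), (None, 3, "z"), ("A", 4, "w")]
lookup = {"A", "C"}

# Keep every row in its original order and append a True/False flag.
flagged = [row + (row[0] in lookup,) for row in file_1]

for row in flagged:
    print(row)
```

The row count stays the same as in file 1, so the other columns remain attached to each value.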