Read in multiple files

RPattela · August 29, 2016, 11:39am

Hello,

I have huge txt and CSV files, all in the same format and without a delimiter, like below.

4243919840103 00000001 000770600013RGT-WAY DED 00000000CLARK FRANCIS

4243919850102 00000001 000804602044POWER ATTY 00000000CHANCELLOR FANNIE

4243919890103 00000001 000947500944WARNTY DEED 00000000MBANK MIDCITIES N

I built a workflow using the List Files, Table Row To Variable Loop Start, File Reader, and Loop End nodes, in that order, to read in all the txt files. The result contains a large number of duplicate values. Please help me out.

marco_ghislanzoni · August 29, 2016, 12:30pm

Hello,

are the duplicates already in the dataset, so that you need to remove them after reading the data in, or are you hinting at the fact that the way you are reading in the data creates the duplicates?

This is unclear to me from your description of the issue.

Cheers,
Marco.

RPattela · August 30, 2016, 7:09pm

Hello,

Thanks for your reply. There are no duplicates in my data; I get duplicates only when I try to read in all the files. (When I read in all the files with the Alteryx tool, I do not get any duplicates.) Is there an option to remove duplicates in the Loop nodes, or after reading all the files?

marco_ghislanzoni · August 31, 2016, 8:42am

Hello,

I understand there is something in your workflow which generates duplicates. Rather than looking for a way to remove those duplicates I would fix the workflow so they are not generated in the first place.

Can you share your workflow or a screenshot of it here so we can try to help you with it?

Cheers,
Marco.

RPattela · August 31, 2016, 2:10pm

Hello,

Please find attached a sample txt file; I have multiple txt files in this same format. The attached snapshot shows my workflow design.

marco_ghislanzoni · August 31, 2016, 2:29pm

Hello,

I don't see any obvious reason why this workflow would create duplicates, unless multiple copies of the same files appear under different names/folders in the output of the List Files node. Any chance you ticked the Include subfolders option and there is a subfolder with a copy of the input files?
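
As a cross-check outside KNIME, a minimal Python sketch along these lines could show whether the input folder holds identical copies of a file under different names or in subfolders. It is only an illustration: the function name `find_identical_files` and the idea of hashing file contents are assumptions of this sketch, not something taken from the workflow.

```python
import hashlib
import os
from collections import defaultdict


def find_identical_files(root):
    """Group files under `root` (subfolders included) by content hash.

    Hypothetical helper for illustration; not part of KNIME.
    """
    by_hash = defaultdict(list)
    for folder, _subfolders, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            digest = hashlib.sha256()
            with open(path, "rb") as handle:
                # Read in chunks so huge files do not have to fit in memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            by_hash[digest.hexdigest()].append(path)
    # Keep only groups where the same content shows up more than once.
    return [sorted(paths) for paths in by_hash.values() if len(paths) > 1]
```

Calling `find_identical_files` on the folder given to the List Files node returns one list of paths per set of byte-identical files; an empty result suggests the duplicates come from somewhere else.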

I would suggest you run the loop step by step and watch the output after each step to check when the duplications are occurring, then work backward from there.
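
To have a reference to compare the loop output against, a small Python sketch like the one below could count repeated rows directly in the raw files. It assumes one record per line, as in the sample above; the function name `duplicate_report` is made up for this example, and it keeps every distinct line in memory, so it may need adapting for very large inputs.

```python
from collections import Counter


def duplicate_report(paths):
    """Count repeated lines inside each file and lines shared between files.

    Hypothetical helper for illustration; assumes one record per line.
    """
    files_per_line = Counter()  # line -> number of files that contain it
    within = {}                 # path -> number of repeated rows in that file
    for path in paths:
        with open(path, "r", encoding="utf-8", errors="replace") as handle:
            counts = Counter(line.rstrip("\r\n") for line in handle)
        # Every occurrence beyond the first counts as one repeated row.
        within[path] = sum(n - 1 for n in counts.values())
        files_per_line.update(counts.keys())
    # Distinct lines that appear in more than one file.
    across = sum(1 for n in files_per_line.values() if n > 1)
    return within, across
```

If both numbers are zero for the raw files but the Loop End output still contains repeated rows, the duplication is happening inside the workflow (for example, the same file being read in more than one iteration) rather than in the data.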

Cheers,
Marco.

mikeevge · July 13, 2017, 5:19pm

I have the same problem as RPattela: duplicate values when reading multiple sdf files. I am using exactly the same workflow as RPattela. Is there any solution yet?