Read in multiple text files

Dnreb · April 17, 2014, 1:01pm

Hello,

I am trying to read several text files from a directory.

Every document contains strings which I want to analyse.

Parsing them into documents with the Flat File Document Parser works fine.

Is there a way to get the file names and creation dates into the documents as meta information, or can this be added in a later step?

Thanks

kilian.thiel · April 23, 2014, 12:00pm

Hi,

with the Document Data Extractor you can extract the file path from parsed documents.

There is no node to get the creation date, so you would need to use a Java Snippet node and extract the creation date via Java code. Maybe this helps: http://stackoverflow.com/questions/21033928/how-to-get-proper-file-creation-date-of-file
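As a rough sketch of what such Java code could look like (standalone here, not wired into a Java Snippet node; the class and method names are made up for illustration), the standard `java.nio.file` API can read the creation time. Note that some file systems do not record a creation time, in which case Java typically returns the last-modified time or the epoch instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.time.Instant;

// Hypothetical helper class, named only for this example.
public class FileCreationDate {

    // Returns the creation time reported by the file system.
    // If the file system does not store one, the value is
    // implementation specific (often the last-modified time).
    static Instant creationTime(Path file) throws IOException {
        BasicFileAttributes attrs =
                Files.readAttributes(file, BasicFileAttributes.class);
        return attrs.creationTime().toInstant();
    }

    public static void main(String[] args) throws IOException {
        // Use the path given on the command line, or a temp file as a demo.
        Path file = args.length > 0
                ? Paths.get(args[0])
                : Files.createTempFile("doc", ".txt");
        System.out.println(file + " created " + creationTime(file));
    }
}
```

Inside a Java Snippet node you would presumably only need the body of `creationTime`, fed with the file path column from the Document Data Extractor; how that column is mapped depends on your workflow.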

Cheers, Kilian

Dnreb · May 7, 2014, 5:11pm

Thanks for the help!

TimB · June 11, 2014, 9:37pm

Hi Dnreb,

just a small addition: you can also encode file features in the file name itself, e.g. with ARen (Advanced Renamer, freeware). It can read out a wide range of file features, from the creation or modification date and the numbering within folders to the device an image was recorded with and all other kinds of EXIF information, and append them to the file name.

In a second step, these features (now written to the file name) can easily be extracted, provided consistent separators were used in ARen.
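To illustrate that second step, here is a minimal sketch that splits a file name on a separator. The naming scheme (`date_number_title.txt` with `_` as separator) and the class name are assumptions for the example, not something ARen prescribes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, named only for this example.
public class FileNameFeatures {

    // Strips the extension, then splits the base name on the separator.
    // Pattern.quote treats the separator literally, so "." or "|" work too.
    static List<String> features(String fileName, String separator) {
        int dot = fileName.lastIndexOf('.');
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        return Arrays.asList(base.split(Pattern.quote(separator), -1));
    }

    public static void main(String[] args) {
        // Assumed scheme: creation date, running number, original title.
        List<String> parts = features("2014-04-17_0003_report.txt", "_");
        System.out.println(parts); // prints [2014-04-17, 0003, report]
    }
}
```

This only works reliably if the separator never occurs inside a feature value, which is why the consistent separators matter.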

RPattela · August 24, 2016, 5:09pm

Hello,

I have huge text files, all in the same format and without a delimiter, like the lines below.

4243919840103 00000001 000770600013RGT-WAY DED 00000000 CLARK FRANCIS

4243919850102 00000001 000804602044POWER ATTY 00000000 CHANCELLOR FANNIE

4243919890103 00000001 000947500944WARNTY DEED 00000000 MBANK MIDCITIES N

I built a workflow using the nodes List Files, Table Row To Variable Loop Start, File Reader, and Loop End, in that order, to pull in all the txt files. With this I am getting a large number of duplicate values. Please help me out.
