Read in multiple text files

Hello,

I am trying to read in several text files from a directory.

Every document contains strings which I want to analyse.

Using the Flat File Document Parser, this works fine for getting the documents.

Is there a way to get the file names and creation date, as well as meta information, into the documents, or can this be added in a later step?

 

Thanks

Hi,

with the Document Data Extractor you can extract the file path from parsed documents.

There is no node to get the creation date, so you would need to use a Java Snippet node and extract the creation date via Java code. Maybe this can help: http://stackoverflow.com/questions/21033928/how-to-get-proper-file-creation-date-of-file
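As a rough sketch of what such Java code could look like (outside of KNIME, using only the standard java.nio.file API): the class name FileCreationDate and the method creationTime are made up for illustration, and the sketch assumes the extracted file path is a plain local path rather than a URL. Note that some file systems do not record a creation time, in which case the JDK may return another value such as the last-modified time.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.time.Instant;

// Hypothetical helper class, not part of KNIME or of the original thread.
class FileCreationDate {

    // Returns the creation time that the file system reports for the given path.
    // Assumption: filePath is a plain local path such as "/data/docs/a.txt".
    static Instant creationTime(String filePath) throws IOException {
        Path path = Paths.get(filePath);
        BasicFileAttributes attrs = Files.readAttributes(path, BasicFileAttributes.class);
        return attrs.creationTime().toInstant();
    }

    public static void main(String[] args) throws IOException {
        // Prints the creation time of every path passed on the command line.
        for (String arg : args) {
            System.out.println(arg + " -> " + creationTime(arg));
        }
    }
}
```

Inside a Java Snippet node, only the body of creationTime would be needed, with the file path column as input and the resulting date written to an output column.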

Cheers, Kilian

Thanks for the help!

Hi Dnreb,

just a small addition: one can also easily encode some file features in the file name itself, e.g. with ARen (Advanced Renamer, freeware). A wide range of file features is available, from simple creation or modification dates and numbering within folders to the device with which an image was recorded and all other types of EXIF information; these can be read out and, for example, appended to the file name.

In a second step, these features (written to the file name) can easily be extracted if consistent separators are used with ARen.
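To illustrate that second step, here is a minimal sketch in Java. It assumes a made-up naming scheme in which the features are joined with an underscore, e.g. report_2016-03-01_0007.txt; the class FileNameFeatures, the method features and the example file name are all hypothetical and only serve to show the idea.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class; the naming scheme it parses is an assumption.
class FileNameFeatures {

    // Strips the file extension and splits the remaining name on the separator.
    static List<String> features(String fileName, String separator) {
        int dot = fileName.lastIndexOf('.');
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        // Pattern.quote makes sure the separator is treated literally, not as a regex.
        return Arrays.asList(base.split(Pattern.quote(separator)));
    }

    public static void main(String[] args) {
        // Assumed example name: prefix, creation date and running number.
        System.out.println(features("report_2016-03-01_0007.txt", "_"));
    }
}
```

This only works reliably if the separator never occurs inside a feature value, which is why a consistent separator matters when renaming.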

Hello,

I have huge text files that all have the same format and no delimiter, like the lines below.

4243919840103        00000001    000770600013RGT-WAY DED             00000000              CLARK FRANCIS

4243919850102        00000001    000804602044POWER ATTY              00000000              CHANCELLOR FANNIE

4243919890103        00000001    000947500944WARNTY DEED             00000000              MBANK MIDCITIES N

I built a workflow using the List Files, Table Row To Variable Loop Start, File Reader and Loop End nodes, in that order, to pull in all the txt files. Here I am getting a large number of duplicate values. Please help me out.