Hi,
I have a list of 31 text files (log files) and I want to read all of them and combine them into one single file. How can I do this? Thanks for your help.
access.zip (161.5 KB)
Here is the data.
Howdy, have you tried the Tika Parser? Or a list of files looped through a file reader?
Tika Parser lets you bulk query a directory.
I built a Python application to append .txt to the end of each .1, .2, .3 log file, .txt being a file type I know Tika Parser can open. It may not be relevant for your task, but it is a step in the solution below. You might find it easier to append something else; your call.
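For anyone who wants to reproduce the renaming step, here is a minimal sketch of what such a script could look like. It is not the original application; the function name `append_txt_suffix` and the choice to skip files that already end in .txt are assumptions made for illustration.

```python
from pathlib import Path


def append_txt_suffix(folder):
    """Append ".txt" to every file in `folder` that does not already end in .txt.

    Hypothetical helper for illustration. Returns the new file names.
    """
    renamed = []
    # sorted() materialises the listing first, so renaming does not disturb the iteration
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix.lower() == ".txt":
            continue
        target = path.with_name(path.name + ".txt")
        if target.exists():
            # Skip instead of overwriting an existing file
            continue
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Calling `append_txt_suffix("path/to/logs")` on a copy of the unzipped folder (the path is a placeholder) would turn a file such as `access_log.1` into `access_log.1.txt`. Work on a copy, since the rename happens in place.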
Next, drag and drop the Tika Parser node onto the KNIME Analytics Platform canvas.
Open the Tika Parser settings.
Select the folder with the relevant files.
In the output settings at the bottom, you only need “Content” and “Filepath” selected.
Filepath: listed first, it is the unique identifier for the log file name. That is often a requirement, which is why I'm leaving it in, although when parsing log files I doubt anyone cares. Leave it there as a parent to the content children.
Content: last but not least, as it's everything you need.
At this point your use case is finished within KNIME; now it's up to you to export to storage.
(I removed the IP addresses from the screenshot, as I don't think you want IP addresses on a public-facing forum.)
Completed view: if I wanted to keep it within KNIME, versus pushing it back into another file, I'd do this.
Why I like Tika Parser: I use it on PDF files so I can avoid hiring accounting/finance people to sort through bank statements for taxes. It lets me bulk query a directory of PDF files. However, Tika Parser doesn't always work. You need to think a little about how much data is being processed in a single node at a single time, and that may later change the way you do this step.
You can also loop over files individually. Tika Parser gets everything at once, which removes the need for me to explain how to union or append data downstream. Tika Parser is your one-stop shop for this granular request; however, depending on your environment, you may want to go with…
The “List Files” node lists all of the files you're looking to parse as a data table in KNIME.
“Table Row to Variable” converts that list into a variable, so that you can loop over it with another node.
Why this direction? Maybe it's easier to filter this way, versus filtering something that includes the content.
Note: Tika Parser could be a good way to get the ball rolling. You can switch to this looping process later, depending on how you want to maintain it and how big this requirement will be tomorrow.
The list of files will need blue nodes, Chunk Loop Start and End, to allow the iterations to happen per file in your list.
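If it helps to see the looping idea outside of KNIME, the same pattern can be sketched in Python: list the files, loop over them one at a time, and tag each row with the file it came from. The function name `read_logs_with_source` and the `access_log*` file pattern are assumptions for illustration; adjust the pattern to match your actual file names.

```python
from pathlib import Path


def read_logs_with_source(folder, pattern="access_log*"):
    """Read matching files one by one and return (file name, line) rows.

    Hypothetical helper; `pattern` is an assumed naming scheme.
    """
    rows = []
    # Equivalent of listing the files, then iterating once per file
    for path in sorted(Path(folder).glob(pattern)):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            for line in handle:
                # Keep the source file name next to each log line
                rows.append((path.name, line.rstrip("\n")))
    return rows
```

Because every row carries its file name, you can filter on the file list before or after reading, which is the same reason given above for going in this direction.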
Good luck
Best,
Tyler
Hi, how about this flow: KNIME_project.knwf (133.3 KB).
Via List Files you can list all the access_log files. Read them in one by one via a Table Row to Variable node. The Constant Value Column node adds the name of the file's location.
gr. Hans
The Load text-based Files node in the Vernalis community contribution will read entire files into single cells.
Then you can use a GroupBy node to concatenate them all (no grouping column), and the Save File Locally node can then output each cell to a single file.
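As a rough analogue of this approach outside KNIME, the sketch below reads each file whole, concatenates the contents, and writes one output file. The function name `merge_files`, the `access_log*` pattern, and the output file name are all assumptions for illustration, not part of the workflow above.

```python
from pathlib import Path


def merge_files(folder, output_file, pattern="access_log*"):
    """Concatenate all matching files in `folder` into `output_file`.

    Hypothetical helper; returns the number of files merged.
    """
    output_path = Path(output_file)
    # Never read the output file itself, even if it matches the pattern
    sources = [
        p for p in sorted(Path(folder).glob(pattern))
        if p.is_file() and p.resolve() != output_path.resolve()
    ]
    # One string per file, like one cell per file
    contents = [p.read_text(encoding="utf-8", errors="replace") for p in sources]
    # Make sure a file without a trailing newline does not run into the next one
    merged = "".join(c if c.endswith("\n") or not c else c + "\n" for c in contents)
    output_path.write_text(merged, encoding="utf-8")
    return len(sources)
```

Note that `sorted()` orders the names as text, so `access_log.10` would come before `access_log.2`. If the order of the log files matters, sort on the numeric suffix instead.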
Steve