Converting a raw text file into a data table

Dear All,

I have probably quite a simple problem: I want to process a raw txt file with Knime into a data table so I can do some statistics with it. I have tried using the 'Flat File Document Parser' and the Line Reader but I don't get to the point where I get one column for each data set. An example for my text file looks like:

 

07.06.2013
CHERRY
TABLE
BLUE
17:30
00:15#253,000
00:30#253,000
00:45#253,000

 

08.06.2013
CHERRY
CHAIR
BLUE
23:58
00:15#254,000
00:30#254,000
00:45#254,000
[...]
 

And I want this text file to look like a table, so I can do some statistics later.

Thanks for your help!
aloe

The file reader will be your best bet.

It will automatically scan your file to try to pick up the correct settings. If this doesnt work out you can always manually adjust them.


What is the delimiter for the files? Its not obvious looking at the example?

 

Cheers

Sorry, I was probably too unclear with my description. My text file does not have any delimiters! What I want to do, is to process this text file in a way that I get tabular data. In other words, to transform this text file into value delimited data.

 

To explain a bit more: This text files contains repetitive elements:

 

Date
Furniture
Color
Time
followed by some Data

 

Is there any way I can process this text file, so that I get a table?

What do you want the table file to look like?

The table should look like this (if I stay with my example)

 

Column1 Column2 Column 3 Column4 Column5 Column6 Column7 Column8

07.06.2013 CHERRY TABLE BLUE 17:30 00:15#253,000 00:30#253,000 00:45#253,000

08.06.2013 CHERRY CHAIR BLUE 23:58 00:15#254,000 00:30#254,000 00:45#254,000

 

Later I want to process this table into:

Date Fruit Furniture Color Time 00:15 00:30 00:45

07.06.2013 CHERRY TABLE BLUE 17:30 253,000 253,000 253,000

08.06.2013 CHERRY CHAIR BLUE 23:58 254,000 254,000 254,000

Are the number of time elements constant at 3 or does this change?

The time elements are constant -- though not 3 in real life but 96 (every 15 minutes in 24 hours).

Wow! Your example works perfectly -- it is exactly what I was trying to do. Thank you very much. I have never used the 'Chunk Loop' node before. It seems to work great for my purpose.

 

This also helps me to understand Knime better. Thanks, Sam!

 

Cheers,

aloe

Ok, I've put together a crude example of a way to do this. 

 

It is dependant on you having a constant number of results though. In this example 8 data row and a blank row (the blank row can be missing at the end of the file though. 

 

You will need to change the red sections to the values in your actual data (96 time rows plus the 5 data rows at the start)

 

I've also included the text file I used.

Regards

 

Sam

1 Like

Dear All,

I know that This message was posted on 2013 and now we are in 2016 but I think that my problem is quite similar to this one : I want also to process a raw txt file with Knime into a data table, the difference  in my problem is that I don't have a constant number to split, as follow:


EVENTSTART||ID||154556358||ref||845535
MEASURE||COLOR||RED||
MEASURE|1|GLOSS||12.121||
MEASURE|1|SAS||1.0021||
MEASURE|1|LAR||15.974||
.
.
.
EVENTSTOPEVENTSTOP||Result||1||


EVENTSTART||ID||15454154||ref||84548784
MEASURE||COLOR||GREEN||
MEASURE|1|GLOSS||2.154||
MEASURE|1|SAS||2.0012||
MEASURE|1|LAR||13.125||
.
.
.
EVENTSTOPEVENTSTOP||Result||1||


EVENTSTART||ID||15454178||ref||845487452
MEASURE||COLOR||RED||
MEASURE|1|GLOSS||12.94||
MEASURE|1|SAS||4.1220||
MEASURE|1|LAR||14.0012||
MEASURE|1|USUR||ERROR||       <----- an additional row here
.
.
.
EVENTSTOPEVENTSTOP||Result||0||


Thank you for your help !

Hello khedy,

You can get inspired by the example here (basically the same, but with two levels and different delimiters).
Then you get a table like this:
 

Event Measure
EVENTSTART||ID||154556358||ref||845535 MEASURE||COLOR||RED||
EVENTSTART||ID||154556358||ref||845535 MEASURE|1|GLOSS||12.121||
EVENTSTART||ID||15454154||ref||84548784 MEASURE||COLOR||GREEN||
EVENTSTART||ID||15454154||ref||84548784 MEASURE|1|GLOSS||2.154||

Now you can use two String Manipulations to split Measure into two columns, variable and value.

Event Measure Value
EVENTSTART||ID||154556358||ref||845535 COLOR RED
EVENTSTART||ID||154556358||ref||845535 GLOSS 12.121
EVENTSTART||ID||15454154||ref||84548784 COLOR GREEN
EVENTSTART||ID||15454154||ref||84548784 GLOSS 2.154

Now a little pivoting and you're done.

Best,
Ferry

Hi @swebb ,

Interesting post - I have a similar problem combining a number of text files (that differ in terms of number of columns) and appending them together. Are you able to share a screenshot of you node’s sequence for ‘aloe’ project - I am not able to see it properly/open it.

Thanks in advance

@Pawel40Adi maybe it would be best to open a new forum entry and describe what you want to do and maybe give us some (dummy) sampel data that would represent your problem. In the meantime you could explore this older collection regarding the import of CSV/TXT files:

Also you might want to check if just allowing the dynamic change of file structure in the CSV Reader might solve your problem:

1 Like

Sorry that post was from nearly 10 years ago by the looks of it, no chance I’ll have that workflow laying around.

1 Like

@swebb is it by any chance this one? It is above in the ZIP file:

kn_forum_2734_aloe_import_txt_file.knwf (82.3 KB)

1 Like

Yep that’s it, I took it at face value that the aloe.zip wasn’t openable but seems it is :slight_smile:

1 Like

Apologies, missed the date :slight_smile:

Great thank you @mlauber71

1 Like