Extract specified text from a log file

Hi,
I am trying to extract content from a log file between a specified start text and end text. The log file looks like:
TEST_TIME_BY_TEST:
SPC_ID OCCURRENCE SEQ TEST_SEQ_EVENT TEST_NUMBER ELAPSED_TIME PARAMETER_NAME TEST_STATUS CELL_TEMP SZ RSS CPU_ET DISC_RD_BYTES DISC_WR_BYTES
10000 73 22 60 299 48.84 MeasureATI_299 0 48.0 52328 32948 628.25 647013688 150527748
Aug 09 2018-08:46:23 atiMargin = 0.04395, len(ATImeas) = 60
Aug 09 2018-08:46:23 maxBpiTweak = 0.02662, atiMarginMin = -0.48111, atiMarginMax = 0.88732, atiMargin = 0.04395
Aug 09 2018-08:46:23 hd zn iter lognw ATI_Margin bpitweak final_meas_BPI
Aug 09 2018-08:46:23 QH len(ATImeas) = 60
Aug 09 2018-08:46:23 hd zn iter lognw ATI_Margin bpitweak meas_BPI meas_TPI atiAdj DOS_CNT
Aug 09 2018-08:46:23 0 0 0 -0.21973 -0.26369 -0.00791 1.02612 1.18892 -0.00791 120
Aug 09 2018-08:46:23 0 33 0 -0.12275 -0.16670 -0.00500 1.00389 1.18768 -0.00500 150
Aug 09 2018-08:46:23 0 74 0 -0.14863 -0.19258 -0.00578 0.99641 1.17677 -0.00578 141
Aug 09 2018-08:46:23 0 115 0 -0.10715 -0.15111 -0.00453 0.99965 1.17253 -0.00453 155
Aug 09 2018-08:46:23 0 171 0 -0.22445 -0.26840 -0.00805 1.02014 1.17660 -0.00805 119
Aug 09 2018-08:46:23 0 239 0 0.08039 0.03644 0.00000 1.01519 1.19294 0.00000 240
Aug 09 2018-08:46:23 1 0 0 -0.14224 -0.18619 -0.00559 0.99716 1.21399 -0.00559 143
Aug 09 2018-08:46:23 1 33 0 -0.11361 -0.15756 -0.00473 0.98416 1.20692 -0.00473 153
Aug 09 2018-08:46:23 1 74 0 -0.25001 -0.29396 -0.00882 0.96913 1.20404 -0.00882 112
Aug 09 2018-08:46:23 1 115 0 -0.09048 -0.13444 -0.00403 0.94582 1.20844 -0.00403 162
Aug 09 2018-08:46:23 1 171 0 -0.00312 -0.04707 -0.00141 0.94565 1.21652 -0.00141 198
Aug 09 2018-08:46:23 1 239 0 0.15758 0.11363 0.00000 0.93814 1.23412 0.00000 286
Aug 09 2018-08:46:23 2 0 0 -0.06472 -0.10868 -0.00326 1.04067 1.15923 -0.00326 171
Aug 09 2018-08:46:23 2 33 0 -0.31547 -0.35942 -0.01078 1.01383 1.15857 -0.01078 96
Aug 09 2018-08:46:23 2 74 0 -0.25703 -0.30098 -0.00903 1.00747 1.14830 -0.00903 110
Aug 09 2018-08:46:23 2 115 0 -0.04982 -0.09378 -0.00281 1.00571 1.15072 -0.00281 177
Aug 09 2018-08:46:23 2 171 0 -0.28422 -0.32817 -0.00985 1.02398 1.16806 -0.00985 103
Aug 09 2018-08:46:23 2 239 0 0.09480 0.05085 0.00000 1.01963 1.18090 0.00000 248
Aug 09 2018-08:46:23 3 0 0 -0.34721 -0.39116 -0.01173 1.04267 1.14181 -0.01173 89
Aug 09 2018-08:46:23 3 33 0 -0.27224 -0.31619 -0.00949 1.01678 1.14851 -0.00949 106
Aug 09 2018-08:46:23 3 74 0 -0.13863 -0.18258 -0.00548 1.00739 1.15042 -0.00548 145
Aug 09 2018-08:46:23 3 115 0 0.09305 0.04910 0.00000 0.99714 1.15360 0.00000 247
Aug 09 2018-08:46:23 3 171 0 -0.07134 -0.11529 -0.00346 1.01741 1.16857 -0.00346 169
Aug 09 2018-08:46:23 3 239 0 -0.23442 -0.27837 -0.00835 1.03481 1.17424 -0.00835 116
Aug 09 2018-08:46:23 4 0 0 0.11481 0.07085 0.00000 1.00542 1.19214 0.00000 259
Aug 09 2018-08:46:23 4 33 0 0.13775 0.09380 0.00000 0.98248 1.18280 0.00000 274
Aug 09 2018-08:46:23 4 74 0 -0.02886 -0.07281 -0.00218 0.98077 1.17504 -0.00218 186
Aug 09 2018-08:46:23 4 115 0 0.28557 0.24162 0.00725 0.97073 1.18121 0.00725 385
Aug 09 2018-08:46:23 4 171 0 0.18897 0.14502 0.00435 0.99454 1.18111 0.00435 308
Aug 09 2018-08:46:23 4 239 0 0.88732 0.84337 0.02000 0.99582 1.18872 0.02000 1539
Aug 09 2018-08:46:23 5 0 0 -0.03755 -0.08150 -0.00245 0.98385 1.25983 -0.00245 183
Aug 09 2018-08:46:23 5 33 0 0.10312 0.05917 0.00000 0.95861 1.26045 0.00000 253
Aug 09 2018-08:46:23 5 74 0 0.10398 0.06003 0.00000 0.94446 1.26582 0.00000 253
Aug 09 2018-08:46:23 5 115 0 0.41825 0.37430 0.01123 0.92650 1.26478 0.01123 522
Aug 09 2018-08:46:23 5 171 0 0.13265 0.08870 0.00000 0.93480 1.28693 0.00000 270
Aug 09 2018-08:46:23 5 239 0 0.24233 0.19838 0.00595 0.95158 1.28934 0.00595 348
Aug 09 2018-08:46:23 6 0 0 -0.26778 -0.31174 -0.00935 1.03539 1.16872 -0.00935 107
Aug 09 2018-08:46:23 6 33 0 -0.09371 -0.13767 -0.00413 1.01711 1.16249 -0.00413 160
Aug 09 2018-08:46:24 6 74 0 0.08828 0.04433 0.00000 1.01020 1.15260 0.00000 244
Aug 09 2018-08:46:24 6 115 0 0.25642 0.21247 0.00637 1.00421 1.15650 0.00637 360
Aug 09 2018-08:46:24 6 171 0 0.45610 0.41215 0.01236 1.02381 1.16796 0.01236 570
Aug 09 2018-08:46:24 6 239 0 0.43703 0.39308 0.01179 1.03481 1.17963 0.01179 545
Aug 09 2018-08:46:24 7 0 0 -0.08941 -0.13337 -0.00400 1.03022 1.11098 -0.00400 162
Aug 09 2018-08:46:24 7 33 0 0.00190 -0.04205 -0.00126 0.99931 1.11453 -0.00126 200
Aug 09 2018-08:46:24 7 74 0 0.23870 0.19475 0.00584 0.99047 1.11573 0.00584 345
Aug 09 2018-08:46:24 7 115 0 0.24962 0.20566 0.00617 0.97885 1.12592 0.00617 354
Aug 09 2018-08:46:24 7 171 0 0.11664 0.07269 0.00000 0.97679 1.15290 0.00000 261
Aug 09 2018-08:46:24 7 239 0 0.50882 0.46487 0.01395 0.95970 1.17504 0.01395 643
Aug 09 2018-08:46:24 8 0 0 -0.48111 -0.52507 -0.01575 1.04466 1.16260 -0.01575 65
Aug 09 2018-08:46:24 8 33 0 -0.19963 -0.24358 -0.00731 1.01460 1.16268 -0.00731 126
Aug 09 2018-08:46:24 8 74 0 -0.04425 -0.08820 -0.00265 1.00965 1.15070 -0.00265 180
Aug 09 2018-08:46:24 8 115 0 -0.09806 -0.14201 -0.00426 0.99710 1.15123 -0.00426 159
Aug 09 2018-08:46:24 8 171 0 -0.07314 -0.11709 -0.00351 1.01066 1.16110 -0.00351 168
Aug 09 2018-08:46:24 8 239 0 0.86619 0.82224 0.02000 1.00997 1.19095 0.02000 1466
Aug 09 2018-08:46:24 9 0 0 -0.09778 -0.14174 -0.00425 1.00919 1.13032 -0.00425 159
Aug 09 2018-08:46:24 9 33 0 0.49567 0.45172 0.01355 0.98785 1.12187 0.01355 624
Aug 09 2018-08:46:24 9 74 0 0.15423 0.11028 0.00000 0.96919 1.12694 0.00000 284
Aug 09 2018-08:46:24 9 115 0 0.13056 0.08661 0.00000 0.95044 1.13820 0.00000 269
Aug 09 2018-08:46:24 9 171 0 0.23352 0.18957 0.00569 0.95763 1.14781 0.00569 341
Aug 09 2018-08:46:24 9 239 0 0.33144 0.28749 0.00862 0.96084 1.16729 0.00862 428
Aug 09 2018-08:46:24 ===> Entering CVbarFormatScaler Data Stale = False, bFormatDataIsStale = True, rptInLogFile = False
Aug 09 2018-08:46:24 ===> Exiting CVbarFormatScaler
Aug 09 2018-08:46:24 ===> Entering processFormatTable Data Stale = True
Aug 09 2018-08:46:24 ===> Entering getFmtTblFromDrv - rptInLogFile = False
Aug 09 2018-08:46:24 Suppress test 210: prm_vbar_formats_210

P210_VBAR_FORMATS:
SPC_ID OCCURRENCE SEQ TEST_SEQ_EVENT HD_LGC_PSN HD_PHYS_PSN DATA_ZONE BPI_FMT TPI_FMT
20100 74 22 1 0 0 0 140 209
20100 74 22 1 0 0 1 140 209
20100 74 22 1 0 0 2 140 209
20100 74 22 1 0 0 3 140 209
20100 74 22 1 0 0 4 140 209
20100 74 22 1 0 0 5 139 209

I want to extract the content that starts at the line containing “hd zn iter lognw ATI_Margin bpitweak meas_BPI meas_TPI atiAdj DOS_CNT” and ends at the line containing “===> Entering CVbarFormatScaler Data Stale = False”.

How can I extract this content and turn it into a table? The table should look like:
hd zn iter lognw ATI_Margin bpitweak meas_BPI meas_TPI atiAdj DOS_CNT
0 0 0 -0.21973 -0.26369 -0.00791 1.02612 1.18892 -0.00791 120


9 239 0 0.33144 0.28749 0.00862 0.96084 1.16729 0.00862 428

The log file contains many blocks like this one. How can I extract all of them?

Is it possible to specify the start and end (or length from start) text to extract them?

Any advice would be greatly appreciated!

The simplest thing you can try: use File Reader to load the data, and then Rule-based Row Filter with your context, like “Context1*Context2”.

Hi izaychik63,
Thanks for your reply!

I tried your solution, but the Rule-based Row Filter node created an empty data table.
I set the rule as:
$Col0$ LIKE "hd zn iter lognw ATI_Margin bpitweak meas_BPI meas_TPI atiAdj DOS_CNT *===> Entering CVbarFormatScaler " => TRUE

Is it correct?
The star (*) between the two keywords (contexts) indicates what to extract, right?

I did not realize the content is not on one line. For a block of lines, find the row number of the line with the first context and of the line with the second one, then filter the rows between those two row numbers with Row Filter. Then use Cell Splitter to get the columns. Also, see the examples on how to pass the filter range as parameters.
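
For anyone who wants to check the same idea outside of KNIME nodes, here is a minimal Python sketch of the approach: walk through the lines, start collecting at the header line, stop at the end marker, and split each collected line into columns. It is only an illustration, not a KNIME workflow. The function name `extract_tables` and the timestamp pattern are my own assumptions based on the sample log above, and it assumes the columns are separated by whitespace.

```python
import re

# Start and end markers taken from the question.
START = "hd zn iter lognw ATI_Margin bpitweak meas_BPI meas_TPI atiAdj DOS_CNT"
END = "===> Entering CVbarFormatScaler"

# Assumed timestamp prefix, modelled on the sample ("Aug 09 2018-08:46:23 ").
TIMESTAMP = re.compile(r"^\w{3}\s+\d{1,2}\s+\d{4}-\d{2}:\d{2}:\d{2}\s+")


def extract_tables(lines):
    """Return one table per START..END block.

    Each table is a list of rows; each row is a list of strings.
    The first row of every table is the header line.
    """
    tables = []
    current = None  # None while we are outside a block
    for raw in lines:
        # Drop the timestamp, then split on any run of whitespace.
        fields = TIMESTAMP.sub("", raw.strip()).split()
        normalized = " ".join(fields)
        if current is None:
            if START in normalized:
                current = [fields]  # header row opens a new table
        elif END in normalized:
            tables.append(current)  # end marker closes the table
            current = None
        else:
            current.append(fields)  # data row inside the block
    # A block without an end marker before end of file is discarded.
    return tables
```

Because the loop keeps going after each end marker, it collects every block in the file, not just the first one. Something like `extract_tables(open("test.log"))` (the file name is hypothetical) should give one table per block.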

izaychik63,
I followed your method and got the desired data. Thanks for your help.

Just for completeness: we also have a Web Log Reader node.

https://hub.knime.com/knime/nodes/Web_Log_Reader*rPVED70lh8os_OYQ

I am not sure if it can be used here.

Cheers, Iris