Hi experts. I have a text file which is 6.8 MB in size. After loading it into the Flat File Document Parser and executing it, the progress goes up to 99% and then it gets stuck there for hours without any error. Any suggestions please?
Best wishes
Alam
Hi Alam,
can you share the txt file, zipped maybe?
Cheers, Kilian
Hi Alam,
it works for me. I can read the attached file using the Flat File Reader. However, it takes around 5 minutes to parse it. This is due to the underlying word and sentence tokenization that is applied during parsing. How much Xmx heap space have you assigned to KNIME? You can change the Xmx setting in the knime.ini file, which is in the directory where the knime binary is located. Increasing that Xmx number could speed up tokenization due to less GC.
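For reference, the Xmx value sits below the -vmargs line in knime.ini; a 2 GB setting would look roughly like this (the other lines in the file vary by installation):

```
-vmargs
-Xmx2048m
```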
Cheers, Kilian
Hi Kilian. I already changed my Xmx heap space to 2048, but it's not working. I am also attaching my workflow; maybe I am doing something wrong. And thanks a lot for your help :)
Regards
Alam
Hmm, I just tried your workflow and it works for me, also with the 2 GB Xmx setting. However, in that workflow I am just reading the 1500.txt file (I don't have the others). I assume it is one of your files that causes the Flat File Reader some trouble. Can you put each of your files in a separate directory and read the files from these directories with different Flat File Readers? You can concatenate the data tables with the Concatenate node afterwards. Then you can see which files can be parsed and which cause problems.
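If copying the files around by hand gets tedious, a small helper script along these lines could create one directory per file first (the paths are placeholders; adjust them to wherever your .txt files actually live):

```python
import shutil
from pathlib import Path

# Placeholder paths; adjust to your setup.
source_dir = Path("C:/data/texts")
target_root = Path("C:/data/texts_split")

# Give every .txt file its own sub-directory, so that each
# Flat File Document Parser node can point at exactly one file.
for txt_file in source_dir.glob("*.txt"):
    per_file_dir = target_root / txt_file.stem
    per_file_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(txt_file, per_file_dir / txt_file.name)
```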
Cheers, Kilian
Thanks for the reply, Kilian. Actually, the other files are bigger than this one; if even this one is not working, how could the others work? I have a Core 2 laptop with 4 GB of memory running Windows 7. I don't know why it's not working. I tested a small file of about 2.7 KB and it worked.
Best Wishes
Alam
It is not necessarily the file size; it is more the structure of the text that might cause problems. When the strings are converted into documents, tokenization (word and sentence) is applied. For this, OpenNLP tokenizer models are used. If the text is not structured like natural language text, e.g. there are no periods in the text, the sentence tokenizer model might have problems.
The file that you shared is working on my machine with just 2GB heap for KNIME. I assume the problem is caused by another file.
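If you want to check your other files outside KNIME first, a rough script like the one below gives a hint whether a file contains sentence-ending punctuation at all (only a heuristic; the OpenNLP sentence model looks at more than just periods, and the path is a placeholder):

```python
from pathlib import Path

# Placeholder path to one of the problematic text files.
text = Path("C:/data/texts/1500.txt").read_text(encoding="utf-8", errors="replace")

# Count characters that typically end a sentence. A file with almost
# none of them ends up as one enormous "sentence", which is hard on
# the sentence tokenizer.
sentence_enders = sum(text.count(ch) for ch in ".!?")
print(f"{len(text):,} characters, {sentence_enders:,} sentence-ending characters")
print(f"average span between them: {len(text) / max(sentence_enders, 1):,.0f} characters")
```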
Cheers, Kilian
Hi Kilian.
What version of KNIME are you using?
Best wishes
Alam
I could read your file using KNIME 3.0.1. It works with 3.1 as well.
Cheers, Kilian
Thanks a lot Kilian.
Best wishes
Alam