MS Word parser issue

Hi, when I try to load a .doc file that contains embedded images, the node fails with an error message. Is it possible to upgrade this node to handle these types of documents?

Thanks,

Simon

Hi Simon,

The Apache POI library is used internally to parse the .doc files. Thank you for the bug report. I will put it on the list and see what we can do about it.

Cheers, Kilian

Hi

This is my first experience using Text Processing, although I have a deeper understanding of the other nodes. My challenge is configuring the Word document to use the Word Parser. How does the Apache POI library work?

Hi,

In the Word Parser dialog you can specify the directory with the .doc files to parse. Note that only .doc files are parsed; .docx files are not parsed so far. The Word Parser node parses all .doc files in the specified directory. For each file, one row containing one document is created.
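
To make the file selection concrete, here is a minimal Java sketch of the behaviour described above. It is not the node's actual source code: the class name DocFileLister and the method listDocFiles are made up for illustration, and it assumes that only the top level of the directory is scanned and that the extension check ignores case. The text extraction that POI performs inside the node is left out.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of KNIME or Apache POI.
public class DocFileLister {

    // Returns the .doc files directly inside dir, sorted by path.
    // Each returned path stands for one output row (one document).
    static List<Path> listDocFiles(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                .filter(Files::isRegularFile)
                // Assumption: case-insensitive suffix match.
                // "report.docx" does not end in ".doc", so it is skipped.
                .filter(p -> p.getFileName().toString().toLowerCase().endsWith(".doc"))
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

For a directory containing report.doc, notes.docx and readme.txt, listDocFiles would return only report.doc, which corresponds to a single output row.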

Cheers, Kilian

Hi Simon,

I tried to reproduce the problem but could not. My Word files that include images (real images as well as Word drawings) are parsed properly without errors. Can you send me a Word file that produces errors?

Thanks, Kilian