I have copied all the articles from our site ZDnet.be into an Excel spreadsheet. For 2016 the spreadsheet has approximately 4000 rows and 5 columns. All the articles are stored in the "Content" column, and every row is a new article. I want to read this spreadsheet (especially the Content column) into a Text Processing node. I have tried XLS Reader followed by Strings to Term (did not work) and XLS Reader followed by Strings to Document (did not work). I want to mine all the articles as a whole (so not a separate analysis of each article/row), because I want to be able to say something about the content of the whole site.
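To make clear what I mean by analyzing the site as a whole rather than row by row, here is a rough sketch in Python (not KNIME). The sample strings are made-up stand-ins for the Content column, and `corpus_term_counts` is just a name I chose for illustration; the tokenizer is deliberately crude.

```python
import re
from collections import Counter


def corpus_term_counts(articles):
    """Count terms over all articles together, not per article."""
    counts = Counter()
    for text in articles:
        # Crude tokenizer: lowercase, then keep runs of letters only.
        tokens = re.findall(r"[^\W\d_]+", text.lower())
        counts.update(tokens)
    return counts


# Hypothetical stand-in for the "Content" column (one string per row).
content_column = [
    "Microsoft brengt nieuwe update uit",
    "Nieuwe update voor Android",
]

totals = corpus_term_counts(content_column)
print(totals.most_common(3))
```

The point is only that the counts are accumulated across every row, so the result describes the whole set of articles instead of one article at a time.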
Which workflow do I have to follow to read this Excel spreadsheet and pass it on to the Text Processing module?
I have attached the first 5 rows so you can see the structure of the spreadsheet.