As explained in my earlier post, the organisation has applied Azure Information Protection to individual documents, whether spreadsheets, Word documents, or other files. This means that I can only access those files programmatically if the program also authenticates as an internal user. Our main document storage is a SharePoint on-prem deployment, so in our case the Microsoft Authentication and SharePoint Online nodes cannot be used.
I have been able to use the Erlwood List Files with Authentication node. This is a step forward, because with Azure Information Protection applied to a document, even the generic List Files/Folders node fails. However, it only helps a little, because I now want to start analysing the content of those documents.
Anyone have any solutions in mind?
From KNIME’s side, I wonder if the applicability of the Microsoft Authentication node could be extended so that it could be used in combination with reader nodes such as Excel Reader or Text Processing IO nodes such as Tika Parser?
What KNIME AP version are you using? With the new file handling system (from KNIME 4.3.0 on) we actually do that, not yet with all nodes, but with many of the basic ones (e.g. CSV Reader, Excel Reader). You can read more about it in our blog: New File Handling Out of Labs and Into Production | KNIME
Now, back to the rest of your question… I am currently investigating internally. But have you already tried with the new file handling system? This could maybe solve your problems.
Yes, I'd like to use the new file handling approach, including the Microsoft Authentication node. The only issue is that, because our SharePoint is not the Online version, I cannot use the SharePoint connector. In other words, I need something in between the MS Authentication and the List Files/Reader/Parser nodes so that I can process files stored locally or in the company network after authenticating as an internal user. Otherwise I am not able to open or even list the files that are protected using Azure Information Protection.
So sorry for the delayed reply. I have seen that one of my colleagues has replied to you via email. I will just add here for completeness (and in case others run into the same issue): you are correct, as of today we do not support SharePoint on-prem, only the online version. There is a feature request for this in our internal ticket system, and I will give it a +1.