Hi @srinibash1980, I’m pleased it worked for you, and thank you for marking the solution, as that will help others find it.
Attached is an updated version that demonstrates processing all the PDF files in a folder. It uses a loop, and on each iteration it adjusts the configuration of the PDF component using flow variables.
For this demo, I used each PDF's file name as the basis for creating a new output folder, so the extracted pages for each file go into their own folder. You can do the same, or you can simply leave the output folder setting on the component as it was before.
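If it helps to see the idea outside of KNIME, here is a rough plain-Java sketch of the same "one output folder per PDF" logic. It is not the code from the attached workflow (there the loop and flow variables do this job), and the class and method names (`PdfOutputFolders`, `listPdfs`, `baseName`, `outputFolderFor`) are hypothetical names made up for illustration. It only uses the standard library and does not split anything; it just lists the PDFs and creates a folder named after each one.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
public class PdfOutputFolders {

    /** Returns true if the file name ends with ".pdf", ignoring case. */
    public static boolean isPdf(Path file) {
        return file.getFileName().toString().toLowerCase(Locale.ROOT).endsWith(".pdf");
    }

    /** Strips a trailing ".pdf" (any case), so "report.pdf" becomes "report". */
    public static String baseName(Path pdf) {
        String name = pdf.getFileName().toString();
        return isPdf(pdf) ? name.substring(0, name.length() - 4) : name;
    }

    /** Lists the PDF files directly inside inputDir (no subfolders), sorted by path. */
    public static List<Path> listPdfs(Path inputDir) throws IOException {
        List<Path> pdfs = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(inputDir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && isPdf(p)) {
                    pdfs.add(p);
                }
            }
        }
        Collections.sort(pdfs);
        return pdfs;
    }

    /** Creates (if it does not exist yet) and returns outputRoot/<PDF base name>. */
    public static Path outputFolderFor(Path pdf, Path outputRoot) throws IOException {
        return Files.createDirectories(outputRoot.resolve(baseName(pdf)));
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults: read from the current folder, write under "output".
        Path inputDir = Paths.get(args.length > 0 ? args[0] : ".");
        Path outputRoot = Paths.get(args.length > 1 ? args[1] : "output");
        for (Path pdf : listPdfs(inputDir)) {
            System.out.println(pdf + " -> " + outputFolderFor(pdf, outputRoot));
        }
    }
}
```

In the workflow itself, the equivalent of `outputFolderFor` is the flow variable that overrides the output folder setting on the component for each loop iteration.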
I hope that gives you the ideas/help you need for your own use case.
PDF Split using java with folder processing.knwf (2.6 MB)
[EDIT - I should add the website reference that was the source of the information/java code used in writing this solution:
I adapted the code from there for use in the Java Snippet.
For this to work, two Java .jar files are required:
- pdfbox-2.0.26.jar
- commons-logging-1.2.jar
These are open source and were downloaded from:
Apache PDFBox | Download
Apache Commons Logging - Download Apache Commons Logging
I created a folder called “java-classes” inside the workflow folder itself.
Then, on the Java Snippet, I added these .jar files as additional libraries using the Add KNIME URL button, which let me add files contained within the workflow folder.
Enjoy!