PYTHON PDF Module

@Brain not sure I follow. What you will have to do is:

  • install Python on your machine (best to use Miniforge)
  • create a conda environment with the necessary packages using the py3_knime_pdf.yml file, which contains the required configuration
  • install the Python script extensions for KNIME
  • tell KNIME where to find conda
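
If you prefer to create the environment by hand, the second step boils down to one conda call. Here is a minimal sketch in Python, assuming conda (e.g. from Miniforge) is on your PATH and py3_knime_pdf.yml sits in the current folder. The function names conda_env_command and create_env are just made up for illustration; they are not part of KNIME or conda.

```python
import shutil
import subprocess


def conda_env_command(yml_file: str, conda_exe: str = "conda") -> list[str]:
    """Build the command that creates a conda environment from a YAML file."""
    return [conda_exe, "env", "create", "--file", yml_file]


def create_env(yml_file: str = "py3_knime_pdf.yml") -> None:
    """Create the environment, failing early if conda cannot be found."""
    conda_exe = shutil.which("conda")  # full path to conda, or None
    if conda_exe is None:
        raise RuntimeError("conda not found on PATH; install Miniforge first")
    # check=True raises CalledProcessError if conda reports an error
    subprocess.run(conda_env_command(yml_file, conda_exe), check=True)
```

Calling create_env() once should be enough; after that you point KNIME to the conda installation in the preferences and select the new environment there.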

If you have conda installed and have told KNIME where it is, the step of creating the environment with the packages can be done using the metanode conda_python_pdf – KNIME Community Hub

My recommendation would be to read this article. I hope it will help you on your way:

Then you are ready to explore the use of the nodes in the workflow. You will have to do some trial and error to get to the results.

Another approach is to use LLMs to extract information from such a statement. This is an example with other data. In this case you will have to work on the statements and, above all, tell the LLM to output clean CSV or JSON files. Even Apple’s engineers seem to struggle with that, so there is no shame in having to try a few times.
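
Since LLMs often wrap their answer in a Markdown code fence or leave out fields, it can help to check the reply before passing it on. A rough sketch of such a check, assuming you asked for a JSON array of rows; the field names date, description and amount are hypothetical and need to be adapted to your statements, and the helper names are my own:

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence marker
FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_json(reply: str):
    """Parse JSON from an LLM reply, tolerating a surrounding code fence."""
    text = reply.strip()
    match = FENCED_BLOCK.search(text)
    if match:
        text = match.group(1).strip()
    return json.loads(text)  # raises json.JSONDecodeError on unclean output


def validate_rows(rows, required=("date", "description", "amount")):
    """Check that the reply is a list of objects with all required fields."""
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of rows")
    for i, row in enumerate(rows):
        if not isinstance(row, dict):
            raise ValueError(f"row {i} is not a JSON object")
        missing = [key for key in required if key not in row]
        if missing:
            raise ValueError(f"row {i} is missing fields: {missing}")
    return rows
```

If either function raises an error, you can send the message back to the LLM and ask it to try again, which fits the "try a few times" approach.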
