Is it possible to compare the structure of two PDF files?
I have monthly reports that are automatically compiled from a database into PDF format. I am using the Python PyPDF2 library to read the data and then compare it directly against the database report, to find any differences or inconsistencies introduced when the information is written to the PDF.
However, I must validate not only the data but also the layout of the PDF, because the PDF generator sometimes produces inconsistencies: for example, data placed outside its corresponding field, or values that overlap. Here is an example of what the correct format looks like and how it looks when corrupted:
Is there a way to convert the PDF to an image and compare it against a standard template, to validate that the boxes, as well as the data, are always in the same position?
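To make the idea concrete, here is a rough sketch of the kind of comparison I have in mind. It assumes each page has already been rendered to a grayscale pixel grid at the same resolution (the rendering step is not shown and would need a separate PDF-to-image tool). The names `Grid` and `diff_bbox` are hypothetical, made up for this illustration.

```python
from typing import List, Optional, Tuple

# Assumption: a rendered page is a list of rows of 0-255 grayscale values.
Grid = List[List[int]]


def diff_bbox(
    reference: Grid, candidate: Grid, tolerance: int = 0
) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) of the area where two pages differ.

    Returns None when no pixel differs by more than `tolerance`.
    The right and bottom edges are exclusive.
    """
    same_shape = len(reference) == len(candidate) and all(
        len(r) == len(c) for r, c in zip(reference, candidate)
    )
    if not same_shape:
        raise ValueError("pages must be rendered at the same size")

    xs, ys = [], []
    for y, (ref_row, cand_row) in enumerate(zip(reference, candidate)):
        for x, (a, b) in enumerate(zip(ref_row, cand_row)):
            if abs(a - b) > tolerance:
                xs.append(x)
                ys.append(y)

    if not xs:
        return None
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Tiny synthetic example: a box border drawn one row too low.
template = [[255] * 6 for _ in range(6)]
shifted = [[255] * 6 for _ in range(6)]
for col in range(1, 5):
    template[2][col] = 0  # border where it should be
    shifted[3][col] = 0   # border in the corrupted page

print(diff_bbox(template, template))  # None
print(diff_bbox(template, shifted))   # (1, 2, 5, 4)
```

Since the data itself changes every month, a full-page comparison like this would presumably only work against a blank template with the data areas masked out, or with the check limited to the box borders. That is part of what I am unsure about.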