I am working on a workflow to process PDF header files for temperature analysis. Attached is the workflow I've started, along with an example PDF file (the forum would not let me upload it as a PDF, so it is uploaded as a JPEG). The workflow has a file reader, a variable loop ("To parse PDF recursively"), and then a Bag of Words step to gather the list of terms from each PDF. However, I'm only looking for one value, the "Máxima Temperatura Registrada", which is 65 deg C in this example. The same field recurs in other files as "Temperature Maxima" or "Bottom Hole Temp".
Most of the examples I've seen produce Bag of Words dumps and term frequencies over the total number of words. How can I pinpoint the search to find one value in the whole PDF? I think I am on the right track for parsing the documents. Since the document is laid out as a chart, do I need to do something different? I appreciate any help.
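To make the question concrete, here is a rough sketch in Python of the kind of targeted lookup I have in mind. It assumes the PDF text has already been extracted into a plain string, and that the number follows its label within a few characters. The label list and the function names (`strip_accents`, `find_max_temperature`) are my own placeholders, not anything taken from the workflow.

```python
import re
import unicodedata

# Assumed label variants, taken from the files I have seen so far.
LABELS = [
    "maxima temperatura registrada",
    "temperatura maxima",
    "temperature maxima",
    "bottom hole temp",
]


def strip_accents(text):
    """Remove accents so that 'Máxima' and 'Maxima' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def label_to_regex(label):
    """Allow any run of whitespace between the words of a label."""
    return r"\s+".join(re.escape(word) for word in label.split())


# A label, then up to 40 non-digit characters, then the first number.
PATTERN = re.compile(
    r"(?:" + "|".join(label_to_regex(l) for l in LABELS) + r")"
    r"[^\d\-]{0,40}?(-?\d+(?:[.,]\d+)?)",
    re.IGNORECASE,
)


def find_max_temperature(text):
    """Return the first labelled temperature as a float, or None."""
    match = PATTERN.search(strip_accents(text))
    if match is None:
        return None
    # Accept a decimal comma as well as a decimal point.
    return float(match.group(1).replace(",", "."))
```

Calling `find_max_temperature` on the text of each PDF would give one number per file (or `None` when no label is found), which is the single value I am after rather than a full word list. The sketch ignores units, so deg C and deg F readings would still need to be told apart.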
Later, I want to gather the temperature points and plot them in a scatter/cluster plot, and I have many files like this.
PDF_Dictionary_Tagger_frequencey.knwf (21.8 KB)