OpenAI and KNIME

Hi,

I was wondering about the OpenAI capabilities of KNIME.

I found some basic workflows that use the OpenAI API to submit your own table data, but I was wondering whether there are example workflows for submitting your own PDF/Word documents to OpenAI and using the results for a chatbot :)?

Hi,
Have you seen this workflow? It uses a PDF as knowledge base. First it creates OpenAI embeddings and a vector store for finding context to user queries, then formulates an answer using an OpenAI chat model.
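To make the retrieval step concrete, here is a minimal sketch in plain Python of the idea described above: embed the chunks, keep them in a store, find the chunks closest to the user query, and put them into the prompt. Everything in it is an assumption for illustration, not the workflow's actual implementation: `embed` is a toy bag-of-words stand-in for OpenAI embeddings, the "vector store" is just a list, and the final call to a chat model is left out.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_store(chunks):
    """The 'vector store' here is just a list of (chunk, vector) pairs."""
    return [(chunk, embed(chunk)) for chunk in chunks]


def retrieve(store, query, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


def build_prompt(query, context_chunks):
    """Assemble the text that would be sent to a chat model (call omitted)."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical chunks, as they might come out of a PDF parser.
chunks = [
    "KNIME workflows are built from nodes connected by ports.",
    "A vector store maps text chunks to embedding vectors.",
    "The cafeteria is closed on public holidays.",
]
store = build_store(chunks)
question = "What does a vector store map?"
top = retrieve(store, question, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

With real embeddings the ranking would reflect meaning rather than shared words, but the flow (embed, store, retrieve, prompt) stays the same.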
Kind regards
Alexander


@AlexanderFillbrunn Thanks!! No, I hadn’t seen it before, will have a look! :)


@Data_consumer I adapted this very workflow to show how to use the new GPT4All nodes to build local vector stores from the PDF with local models:


@mlauber71 Thanks! I will try this out this coming weekend!


@AlexanderFillbrunn
Great example!
How would you handle updating the vector store? E.g. the data has been updated, but the store is way too big to create from scratch again, so we only want to update the relevant vectors with the new info.
Also, I assume the parser only extracts the text, but not the content of images and tables inside PDF files?
(I know these are advanced questions with probably no short, simple answer, but I am curious to hear your thoughts, if any.)
Best regards


Hi,
Yes, the parser only extracts the actual text, not text embedded in images, etc. For that you’d use something like Azure AI Document Intelligence, which does OCR.
For updating vector stores: these nodes are still pretty new and also pretty basic, so we do not have a “Vector Store Updater” yet. But I agree that this could be very useful. You could maybe use a Postgres DB with this plugin, but I have not seen it used in KNIME before.
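Just to illustrate what such an incremental update could look like in principle, here is a rough sketch in plain Python. It is not a KNIME node and not the Postgres plugin; all names (`sync_store`, `content_hash`, `fake_embed`) are made up for the example. The idea is to keep a content hash next to each vector and only re-embed chunks whose hash changed, plus drop vectors for chunks that no longer exist.

```python
import hashlib


def content_hash(text):
    """Fingerprint of a chunk's text, used to detect changes."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sync_store(store, documents, embed):
    """Update `store` in place so that it matches `documents`.

    store:     dict of chunk_id -> {"hash": str, "vector": any}
    documents: dict of chunk_id -> current text
    embed:     callable turning text into a vector (the expensive step)
    Returns (ids that were embedded or re-embedded, ids that were removed).
    """
    embedded, removed = [], []
    for chunk_id, text in documents.items():
        digest = content_hash(text)
        entry = store.get(chunk_id)
        # Embed only new chunks and chunks whose text has changed.
        if entry is None or entry["hash"] != digest:
            store[chunk_id] = {"hash": digest, "vector": embed(text)}
            embedded.append(chunk_id)
    # Drop vectors whose source chunk is gone.
    for chunk_id in list(store):
        if chunk_id not in documents:
            del store[chunk_id]
            removed.append(chunk_id)
    return embedded, removed


calls = []


def fake_embed(text):
    """Hypothetical stand-in for a real embedding call; records each use."""
    calls.append(text)
    return [float(len(text))]


store = {}
docs = {"p1": "first page", "p2": "second page"}
first_run = sync_store(store, docs, fake_embed)

# Later: one chunk edited, one added, one deleted.
docs["p2"] = "second page, revised"
docs["p3"] = "third page"
del docs["p1"]
second_run = sync_store(store, docs, fake_embed)
print(first_run, second_run)
```

This only works if chunk ids stay stable between runs (for example file name plus page or chunk index), which is an assumption about how the documents are split.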
Kind regards,
Alexander


@Daniel_Weikert I have discussed with some colleagues the use of extra packages to extract images and tables from PDFs for use in vector stores, but this is a complicated business, and PDF can be quite a complex format. And then the challenge is to interpret the content in the right place.

One option could be to employ ChatGPT for this, though I am not sure if the KNIME ports can handle the data formats.

