I downloaded the two parts of the workflow:
(1) How to build your FAISS vector store db
(2) How to build your ChatApp that gives you answers from your db
I also downloaded the job posting data hosted on Kaggle that comes with this demo.
In the original article demo, you only added “description” to the “FAISS Vector Store Creator” node.
To try to extend the functionality of this chatbot, I added a few more columns to the “FAISS Vector Store Creator” node: I selected “Company”, “Title”, “Description” and “Location”. Then I re-ran the workflow to get a new model file.
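For context, this is roughly what I am trying to achieve, sketched outside of KNIME in LangChain/FAISS terms (the file name is hypothetical and the column names are my assumption of the Kaggle CSV; this is just to illustrate the idea, not what the node runs internally):

```python
import pandas as pd
from langchain_core.documents import Document
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

df = pd.read_csv("job_postings.csv")  # hypothetical file name for the Kaggle data

# Embed the description text, but keep the other columns as metadata
# so the chatbot can also use them when answering.
docs = [
    Document(
        page_content=str(row["Description"]),
        metadata={
            "Company": row["Company"],
            "Title": row["Title"],
            "Location": row["Location"],
        },
    )
    for _, row in df.iterrows()
]

vector_store = FAISS.from_documents(docs, OpenAIEmbeddings())
vector_store.save_local("job_postings_faiss")
```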
The only changes I made to the workflow 2 demo to get this outcome were setting the model in the “OpenAI Chat Model” selection to “GPT-4” and the Max response length (tokens) to 2000.
The reason I am doing this is that I would like to try this AI chatbot Q&A on a large set of my personal data, and there will be more than one column that I need to include in the FAISS Vector Store Creator.
As you can see in the screen recording of my demo, it looks like the AI chatbot is not referring to or referencing any data in the vector store.
If anyone has successfully created an AI chatbot that can read data from multiple columns in a CSV file for Q&A “fairly accurately”, please let me know the tips and tricks you used and anything to look out for.
Hi @bluecar007, thanks for checking out our material, and it’s nice that you’re experimenting with GenAI in KNIME!
To answer your question:
Good news: in the current configuration, the workflow works exactly as expected. We just wanted to retrieve job descriptions; we were not interested in locations or other details.
Let the agent use other meta information: if you want the chatbot to also be able to answer questions about job location or other details (note that the company name is not a very informative column in this dataset, as it contains only a numeric code, so I would not expect interesting results), you need to:
1. Add the additional columns as metadata in the Vector Store Creator node (as you did already).
2. In the chatbot workflow, let the agent know that there is now more meta information it should use to answer questions. To do that, add an additional Vector Store to Tool node and select “Retrieve sources from documents”, where you have to specify the metadata column with the locations. It is also important to give the tool a meaningful name and description, both hinting at the fact that this tool is specialized in job locations.
3. Concatenate the two tools using the Tool Concatenator node and pass the result to the ChatApp component (see screenshot below with edits in green annotation; a rough code sketch of this two-tool idea follows after this list).
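If it helps to see the same idea outside of KNIME, here is a rough LangChain-style sketch of the two-tool setup. This is not what the KNIME nodes run internally; the paths, tool names and the “Location” metadata key are assumptions, and the exact API can differ between LangChain versions:

```python
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.tools.retriever import create_retriever_tool
from langchain.agents import AgentExecutor, create_openai_tools_agent
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder, PromptTemplate

embeddings = OpenAIEmbeddings()
store = FAISS.load_local(
    "job_postings_faiss", embeddings, allow_dangerous_deserialization=True
)

# Tool 1: plain retrieval over the embedded job descriptions.
description_tool = create_retriever_tool(
    store.as_retriever(search_kwargs={"k": 4}),
    name="job_description_search",
    description="Search job postings by the content of their descriptions.",
)

# Tool 2: same index, but the document formatting surfaces the 'Location' metadata,
# and the tool description tells the agent to use this tool for location questions.
location_tool = create_retriever_tool(
    store.as_retriever(search_kwargs={"k": 4}),
    name="job_location_search",
    description="Find job postings and report the location of each matching posting.",
    document_prompt=PromptTemplate.from_template(
        "{page_content}\nLocation: {Location}"
    ),
)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You answer questions about job postings using the provided tools."),
    ("human", "{input}"),
    MessagesPlaceholder("agent_scratchpad"),
])

llm = ChatOpenAI(model="gpt-4o")  # larger context window, as recommended above
tools = [description_tool, location_tool]
agent = create_openai_tools_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools)

print(executor.invoke({"input": "Which postings are located in Berlin?"})["output"])
```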
Important: note that you might need to select a more powerful model in the OpenAI Chat Model Connector node to allow for a larger context window (e.g., GPT-4o).
At this point, the agent finally knows where to look for information about job locations and can answer your questions on that as well (see screenshot below).
What I have just described is one option to address your question and further customize your chatbot’s answers, making it more transparent how the agent uses tools to look for information.
Another option could be to wrangle the initial knowledge base so that the “description” column directly includes information about the location and other details you’re interested in, and then just use that enriched “description” column to create the vector store. The chatbot workflow would then stay as in the original version, except that in the Vector Store to Tool node you might want to enhance the tool description a bit.
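If you go that route, the preprocessing can be as simple as string concatenation before the Vector Store Creator (in KNIME, for example with a String Manipulation node). Just as an illustration, the same step in pandas, with the file name and column names assumed from the Kaggle CSV:

```python
import pandas as pd

df = pd.read_csv("job_postings.csv")  # hypothetical file name

# Fold the extra details into the text that gets embedded, so a single
# description-based tool can also answer questions about title and location.
df["Description"] = (
    "Title: " + df["Title"].fillna("").astype(str)
    + "\nLocation: " + df["Location"].fillna("").astype(str)
    + "\n" + df["Description"].fillna("").astype(str)
)

df.to_csv("job_postings_enriched.csv", index=False)
```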