How to merge vector stores?

Hi. I’m building an LLMops workflow using the RAG approach.

For RAG, I built a vector store from a single PDF file with the FAISS Vector Store Creator node. To keep expanding it, I plan to keep converting new PDF files into vector stores and adding them to this workflow.

Currently, to keep adding new PDF files to a single vector store, I have to reload the files that were already used to build the vector store, merge them with the new files, and create a new vector store from scratch.
The picture below shows a workflow in which I tried to use Concatenate after converting the vector stores into tool form. However, this method also requires loading the existing vector store, and loading becomes harder as the existing vector store grows.

Instead, is there a way to append only the vectorized data of a new PDF file to the existing vector store, similar to appending to a CSV file with the same name in KNIME?

I believe this approach is the most efficient for continued scalability of LLMops. There is code available in Python, but it would be nice if it could be done with a KNIME node.
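
To illustrate the kind of append I mean, here is a minimal sketch in plain Python. It is a toy JSON-lines store, not FAISS and not a KNIME node; `append_vectors`, `load_store`, `search`, and the hard-coded vectors are hypothetical stand-ins for a real embedding model and index.

```python
import json
import math
import os
import tempfile


def append_vectors(store_path, records):
    """Append (text, vector) pairs to the store file without reading existing entries."""
    with open(store_path, "a", encoding="utf-8") as f:
        for text, vector in records:
            f.write(json.dumps({"text": text, "vector": vector}) + "\n")


def load_store(store_path):
    """Read every entry back; only needed at query time, not when appending."""
    with open(store_path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(store_path, query_vector, k=1):
    """Return the texts of the k entries most similar to the query vector."""
    entries = load_store(store_path)
    ranked = sorted(entries, key=lambda e: cosine(e["vector"], query_vector), reverse=True)
    return [e["text"] for e in ranked[:k]]


# Hypothetical vectors; a real workflow would get these from an embedding model.
store = os.path.join(tempfile.mkdtemp(), "store.jsonl")
append_vectors(store, [("chunk from first.pdf", [1.0, 0.0, 0.0])])
append_vectors(store, [("chunk from second.pdf", [0.0, 1.0, 0.0])])  # old entries untouched

count = len(load_store(store))
top = search(store, [0.1, 0.9, 0.0])
```

With the actual FAISS Python library, new vectors are added to an index with `index.add`, but as far as I can tell the index still has to be read into memory first, so it is not quite the same as appending to a file.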


Hey there,

it looks like KNIME 5.4 adds new functionality that makes updating vector stores simpler:

Simplified GenAI model maintenance for retrieval-augmented generation (RAG)

Managing and updating vector stores in RAG workflows can be complex and time-consuming, requiring manual effort to ensure the knowledge base stays accurate and aligned with evolving data.

KNIME Analytics Platform now comes with a new Vector Store Data Extractor node for updating and migrating vector stores to help you simplify the maintenance of knowledge bases used in RAG workflows, ensuring your data remains up-to-date and workflows remain efficient.
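
As a rough sketch of the idea in plain Python (not KNIME's implementation): assuming the data extracted from a vector store is a table of documents with their embeddings, updating comes down to combining the old rows with the new ones and rebuilding the store without re-embedding anything. The row layout and the `merge_extracted` helper are hypothetical.

```python
def merge_extracted(existing_rows, new_rows):
    """Combine extracted rows, keeping the first occurrence of each document.

    Embeddings are carried over unchanged, so nothing is re-embedded.
    """
    merged = {}
    for row in list(existing_rows) + list(new_rows):
        # setdefault keeps the row already stored for a repeated document
        merged.setdefault(row["document"], row)
    return list(merged.values())


# Hypothetical rows; the column names and values are assumptions for illustration.
existing = [
    {"document": "intro from a.pdf", "embedding": [0.1, 0.2]},
    {"document": "summary from a.pdf", "embedding": [0.3, 0.4]},
]
new = [
    {"document": "summary from a.pdf", "embedding": [0.3, 0.4]},  # duplicate, skipped
    {"document": "intro from b.pdf", "embedding": [0.5, 0.6]},
]

merged_rows = merge_extracted(existing, new)
documents = [row["document"] for row in merged_rows]
```

The merged rows would then be the input for creating the updated vector store.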

There are two example workflows available that I recommend taking a look at:

I think a similar solution was provided here:

Hope that helps you!
