Load embeddings from file into FAISS Vector store creator

Hi everyone,

The FAISS Vector Store Creator node notes the following in its description:

By default, the node embeds the selected documents using the embeddings model but it is also possible to create the vector store from existing embeddings by specifying the corresponding embeddings column in the node dialog.

Unfortunately, I can’t figure out what data type the node requires to read the embeddings. So far I’ve tried to convert the embedding string to a list, but the creator node does not let me select that column.

Is there documentation about the inner workings of the node?


Hey @Ellison

You provide the node with a list of embeddings. You can use the Text Embedder to create them, or extract the embeddings from the vector store to a KNIME table using the Vector Store Data Extractor. Does this answer your question?

All the Best
Linus


Hey Linus, thank you for your assistance.

I already have the embeddings in a simple CSV file created outside of KNIME. I now want to import these embeddings into the FAISS Vector Store Creator. I understood the description to mean that I can use the contents of a table. However, I cannot select the column containing the embeddings when it is formatted as a list.

Hello @Ellison,

Yes, you can use the embeddings in the FAISS Vector Store Creator node. Here is an example showcasing how the embeddings created by the Text Embedder are used in the FAISS Vector Store Creator node.

Hope this helps.

Best,
Keerthan


I hope this image helps to demonstrate what I aim to do. How can I convert the embeddings from string format to the format required for import into a vector store? The workaround right now is to split the string-type vector into separate columns and use a similarity search.
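
For readers unfamiliar with the workaround, a similarity search over already-parsed vectors can be sketched in plain Python as follows. This is only an illustration: cosine similarity is an assumption (the post does not say which measure is used), and `cosine_similarity` and `top_k` are hypothetical helper names, not KNIME nodes.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, vectors, k=3):
    """Return (index, score) pairs of the k vectors most similar to query."""
    scored = [(cosine_similarity(query, v), i) for i, v in enumerate(vectors)]
    scored.sort(reverse=True)  # highest similarity first
    return [(i, score) for score, i in scored[:k]]


# Toy data standing in for the imported embeddings.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
result = top_k([1.0, 0.0], vectors, k=2)
print(result)
```

This compares the query against every vector, which is fine for small tables but is exactly the work a FAISS index is meant to speed up.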

Hello @Ellison,

Can you confirm the data type of your embedding column? The FAISS Vector Store Creator node expects a list of embeddings, as shown in the example above.

Best,
Keerthan

@k10shetty1
The embedding vectors are simple strings, as they come from an external resource. I’m looking for a way to convert these to a proper KNIME embedding, but I don’t know what data type I need to convert to.

Cheers

Hello @Ellison,

Embeddings are numerical vectors that represent the semantic and syntactic meaning of text. In KNIME, you can represent these vectors as lists so that nodes can process them effectively.
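
To make the "string versus list of numbers" distinction concrete, here is a minimal Python sketch. It assumes each embedding arrives as a single comma-separated string, optionally wrapped in square brackets; the actual format of the external file is not stated in this thread, and `parse_embedding` is a hypothetical helper name.

```python
def parse_embedding(text):
    """Turn a string like "[0.12, -0.5, 3]" into a list of floats.

    Assumes comma-separated values, optionally wrapped in square brackets.
    """
    stripped = text.strip().strip("[]")
    # Skip empty pieces so a trailing comma does not raise an error.
    return [float(part) for part in stripped.split(",") if part.strip()]


vector = parse_embedding("[0.12, -0.5, 3]")
print(vector)
```

The result is a list of floating-point numbers rather than a list of strings, which is the distinction that matters for the column type.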

Can you share the embeddings here if it is okay? Someone can try to make it work.

Best,
Keerthan

Sure, I’ve prepared an example workflow.
import_embeddings_example.knwf (298.7 KB)

Hey there,

I think I get what you want to achieve:

  • you have embeddings in some source file and these are imported as strings
  • you are looking to move them to a FAISS vector store without re-embedding them, e.g. using an OpenAI model

I’m afraid I don’t think that is possible with the current KNIME functionality.

KNIME did add some functionality with 5.4 that makes adding to and migrating existing vector stores easier:

But from what I can see in the examples this still requires that data is loaded from an existing FAISS vector store.

In your workflow you will have noticed that there’s a grey input port that is not connected in the FAISS vector store creator, which requires a connection to an OpenAI Embeddings Connector node.

I know that a vector store that was created can be “written” using a model writer node and then later on be read in again using a model reader node.

I am not sure though what the process would be to create a FAISS vector store (outside of KNIME, maybe with Python) and to then convert it to .model…

There are some articles on Medium by @mlauber71 on RAG topics - maybe that can help you, too:


Hello @Ellison,

In the example you shared, the list was a collection of strings. I used a simple lambda function in a Python Script node to convert it to a list of doubles, as below:

df['column'] = df['column'].apply(lambda x: [float(i) for i in x])

Now you can see the column appear in the FAISS Vector Store Creator node.
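
A slightly more defensive version of the same conversion might look like the sketch below. The names `to_float_list` and `same_dimension` are hypothetical, and the commented `knime.scripting.io` lines are an assumption about how the Python Script node is wired up (one input table, one output table); they were not run as part of this example.

```python
def to_float_list(cell):
    """Convert one cell (a sequence of numeric strings) to a list of floats."""
    return [float(item) for item in cell]


def same_dimension(rows):
    """True if every vector has the same length, as a FAISS index requires."""
    return len({len(row) for row in rows}) <= 1


# Toy rows standing in for the string-list column from the workflow.
rows = [["0.1", "0.2"], ["0.3", "0.4"]]
converted = [to_float_list(row) for row in rows]
print(converted, same_dimension(converted))

# Inside a KNIME Python Script node this might be applied as follows
# (assumed setup, not executed here):
#   import knime.scripting.io as knio
#   df = knio.input_tables[0].to_pandas()
#   df["column"] = df["column"].apply(to_float_list)
#   knio.output_tables[0] = knio.Table.from_pandas(df)
```

Checking the dimensions before building the store can help catch truncated or malformed rows early.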


You have to connect an OpenAI Embeddings Connector to use it further, as shown here.

If you want to download the updated workflow, you can find it here.

Best,
Keerthan
