Hello everyone!
I am experimenting with the “FAISS vector store” in conjunction with ChatGPT.
I am not quite clear yet on:
when and how the “agent prompter” accesses the store,
how large the documents for the store may be,
and what the best method of preprocessing is to prepare documents for the store.
My status so far:
If I load a “fictitious term” with a short explanation as a *.txt file into the store, the agent prompter uses this information.
If the same information is appended to a large PDF document that has been preprocessed and divided into lines (about 400) with about 2000 words per line, the agent prompter does not refer to it.
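One possible explanation for this behaviour is dilution: a vector store compares the query against one vector per entry, and a single mention of a term inside a 2000-word entry contributes very little to that vector. The sketch below illustrates the effect with a toy word-count "embedding" and cosine similarity. It is not what FAISS or a real embedding model does internally; the filler text, the definition sentence, and the helper names bow and cosine are made up for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Toy 'embedding': lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical content: a short definition and 2000 unrelated filler words.
definition = "quademadelix is a fictitious term for a small blue device"
filler = " ".join(f"filler{i}" for i in range(2000))

short_chunk = definition                  # the term stored on its own
long_chunk = filler + " " + definition    # the term buried at the end of a long entry

query = bow("what is quademadelix")
score_short = cosine(query, bow(short_chunk))
score_long = cosine(query, bow(long_chunk))

print(f"short chunk: {score_short:.3f}, long chunk: {score_long:.3f}")
```

In this toy setup the short entry scores far higher than the long one for the same query, which would fit the observation that the term is found when stored alone but not when it sits inside a very long entry. Real embedding models behave differently in detail, so treat this only as an intuition.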
Is there any information about the “FAISS vector store” and how best to fill it with content?
Hello Daniel,
This is my workflow and the table (CSV) containing the split document. It is also read in without problems. However, in use case 2 no good answer is given for the fictitious word “quademadelix” (located in the last line).
When I try your long document, I get a rate limit error because I am on the free GPT tier.
What I am missing is text splitting before embedding. Is this handled automatically?
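This thread does not settle whether splitting happens automatically. If it does not, one option is to split the text yourself before it goes into the store. Below is a minimal sketch of a word-based splitter with overlap; the function name split_words and the sizes (200 words per chunk, 40 words of overlap) are assumptions chosen for illustration, not recommended values.

```python
def split_words(text, chunk_size=200, overlap=40):
    """Split text into chunks of at most chunk_size words.

    Consecutive chunks share `overlap` words so that a sentence cut at a
    chunk boundary still appears whole in one of the two chunks.
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


# Hypothetical example: one 2000-word entry ending with the fictitious term.
long_line = " ".join(f"word{i}" for i in range(1999)) + " quademadelix"
chunks = split_words(long_line)

print(len(chunks), "chunks, longest has",
      max(len(c.split()) for c in chunks), "words")
```

Each chunk would then be embedded and stored as its own entry, so a term mentioned once shares its vector with a couple of hundred words instead of two thousand. Smaller chunks also mean smaller individual embedding requests, although the total number of requests goes up.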
br
Just a short note: I am an advocate of LLMs and have similar ideas, but you should be very careful with regard to “internal documents” and non-private LLMs. Make sure that your company is really OK with this!
br