I am currently experiencing a weird problem with the Document Vector node and was wondering whether someone here could help me. I'm trying to convert a long list of terms and documents into document vectors, which worked perfectly in the past (even with the same workflow I use now).
Enclosed you'll find two pictures (Input table and Output table of the Document Vector node sorted by document name).
The problem seems to be that the Document Vector node changes the value of the document cell. To be more precise: the document vector output should have one row for each document. In my case, the document "091-090" was changed to "091-089", so the document vector output has two rows with the same document name.
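A quick way to confirm this symptom outside the node is to count the document names in the exported output table. Below is a minimal Python sketch, assuming the document names have been exported to a plain list. The function name `duplicated_documents` and the sample name "091-088" are hypothetical; only "091-089" and "091-090" come from the description above.

```python
from collections import Counter


def duplicated_documents(doc_names):
    """Return the document names that occur more than once, sorted."""
    counts = Counter(doc_names)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical output column in which "091-090" was replaced by "091-089".
output_docs = ["091-088", "091-089", "091-089"]
print(duplicated_documents(output_docs))
```

An empty result would mean every document name appears in exactly one row, which is what the output is expected to look like.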
This behavior is really strange and looks like a bug at first sight. I have never seen it before. Could you attach a small workflow with data that reproduces the problem? That would help a lot. I am currently on holiday, but I will take a closer look when I am back.
I investigated the problem, and it turned out to be a bug in the Document Vector node. The node does not change the document itself. Instead, it assigns the second-to-last document in the table, rather than the last one, to the last vector row if and only if there is only one term per document in the input bag of words. I am working on a fix.
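To check whether a given input matches the condition described here, one can test if every document in the bag of words has exactly one term. The following is a minimal Python sketch, assuming the bag of words is available as (document, term) pairs and reading "only one term per document" as "every document has exactly one distinct term". `one_term_per_document` is a hypothetical helper and not part of KNIME.

```python
from collections import defaultdict


def one_term_per_document(bag_of_words):
    """Return True if every document has exactly one distinct term.

    bag_of_words: iterable of (document_name, term) pairs (assumed layout).
    """
    terms_by_doc = defaultdict(set)
    for doc, term in bag_of_words:
        terms_by_doc[doc].add(term)
    # An empty input is treated as not matching the condition.
    return bool(terms_by_doc) and all(len(t) == 1 for t in terms_by_doc.values())


# Hypothetical sample data: one term for each document.
rows = [("091-089", "alpha"), ("091-090", "beta")]
print(one_term_per_document(rows))
```

If this returns `True` for the input table, the last row of the vector output is the one to inspect.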