Word Vector Apply

iiiaaa · July 3, 2018, 11:38am

Dear all,
As you can see in the attached workflow, the “Word Vector Apply” node, applied to the same input as the “Doc2Vec Learner”, produces a different vector. How is that possible? Could you help me?

Thanks
Regards

WordVectorApply.zip (1.1 MB)

Kathrin · July 3, 2018, 2:15pm

Hi,

The “Doc2Vec Learner” trains two things: a vector representation for each label and a vector representation for each word. When you apply the “Vocabulary Extractor” node, you get, on the one hand, a dictionary (the vector representation for each word, at the first output port) and, on the other hand, the vector representation for each label.

The “Word Vector Apply” node uses the dictionary you get at the first output port of the “Vocabulary Extractor” node and applies it to each document you feed in. Therefore you get a collection of vectors for each document, each vector representing one word in your document.
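
To make the difference concrete, here is a minimal sketch in plain Python. It is not KNIME code and not the node's actual implementation: the dictionaries, the vector values, and the helper `apply_word_vectors` are all made up for illustration, and skipping unknown words is an assumption of this sketch.

```python
# Toy stand-ins for the two kinds of vectors a Doc2Vec model holds.
# All names and numbers are hypothetical, not taken from the attached workflow.
word_vectors = {
    "good": [0.9, 0.1],
    "bad": [-0.8, 0.2],
    "movie": [0.1, 0.7],
}
label_vectors = {
    "positive": [0.6, 0.4],
    "negative": [-0.5, 0.3],
}


def apply_word_vectors(document, dictionary):
    """Return one vector per word of the document that is in the dictionary.

    Assumption of this sketch: words missing from the dictionary are skipped.
    """
    return [dictionary[w] for w in document.lower().split() if w in dictionary]


doc = "Good movie"

# Word-level lookup: a collection of vectors, one per word in the document.
per_word = apply_word_vectors(doc, word_vectors)
print(per_word)

# Label-level lookup: a single vector for the whole label.
print(label_vectors["positive"])
```

The first result is a list with one vector per word, while the second is a single vector learned for the label, which is why the two outputs do not match even for the same input document.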

I hope that helps to understand it a bit better!
Best,
Kathrin

iiiaaa · July 3, 2018, 3:51pm

Dear Kathrin,
Many thanks for your answer, very helpful. Is there a way to ask the “Word Vector Apply” node to apply the vector representation of each label, instead of the vector representation of each word?

If not, how can I apply the vector representation of each label, already calculated by the “Doc2Vec Learner”, to a new record (Test Set)?

I had a look at the example workflow “08_Sentiment_Classification_using_Word_Vectors”, and there the “Partitioning” is done after the learning (“Doc2Vec Learner”). Does this mean that every time a new record arrives (Test Set), I have to concatenate it to the old historical data (Training Set) and run the learning phase again?

Thanks in advance
Regards