Hi there,
I need a little help or a tip. I'm working with ~100K tweets as documents and created a Word Vector. I already reduced the vocabulary to about 1.5K words, and I ran Distance Matrix Calculate over 5K tweets as a test, which worked pretty well. But for the whole data set (100K tweets), it's very, VERY slow (I also don't have much memory, unfortunately). :)
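To put the scale in numbers, here is a rough back-of-the-envelope sketch in Python. It assumes the result is stored as a dense matrix of 8-byte values, which is my assumption and not something I have checked in the tool; the function name and parameters are made up for illustration.

```python
def dense_distance_matrix_gib(n_docs, bytes_per_value=8, symmetric=False):
    """Estimate the memory needed for a dense pairwise distance matrix, in GiB.

    Assumption: one value of `bytes_per_value` bytes per document pair.
    With symmetric=True only the n*(n-1)/2 pairs below the diagonal are counted.
    """
    if symmetric:
        n_values = n_docs * (n_docs - 1) // 2
    else:
        n_values = n_docs * n_docs
    return n_values * bytes_per_value / 2**30


# Compare the 5K test run with the full 100K data set.
for n in (5_000, 100_000):
    full = dense_distance_matrix_gib(n)
    half = dense_distance_matrix_gib(n, symmetric=True)
    print(f"{n} docs: about {full:.2f} GiB full, {half:.2f} GiB lower triangle")
```

Under that assumption, going from 5K to 100K tweets is 20 times more documents but 400 times more pairs, so the 5K test says little about how the full run behaves.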
Any tips on how to reduce the time taken for the distance matrix calculation?
Thanks!
Gustavo