Filter forum posts based on date

Hi there,

I am collecting posts from a forum to conduct sentiment analysis. The problem is that the process is slow because there are many categories and threads: a single category can contain 100K posts and take many hours to process. Is there a method or node I can use to filter the forum by the date of the last post? I am only interested in posts from this year, but most of the posts are much older. I am using the web-crawling method outlined in the KNIME paper at http://www.knime.org/files/003_-_knime_boston2013_koetter.pdf.

Any suggestions would be greatly appreciated.

Best Regards,

Vincent

Hi Vincent,

I assume the forum pages you crawl are sorted chronologically. If so, why not simply stop the crawl once you reach posts from before the year in question?
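
To illustrate the idea outside of KNIME, here is a minimal Python sketch of such an early-stopping crawl. It assumes the pages arrive strictly newest first and that each page has already been parsed into (date, text) pairs. The `Post` type and the `posts_since` function are hypothetical names chosen for this example, not part of KNIME or of the paper's workflow.

```python
from datetime import date
from typing import Iterable, Iterator, List, Tuple

# Hypothetical structure for a parsed post: (date posted, post text).
Post = Tuple[date, str]


def posts_since(pages: Iterable[List[Post]], cutoff: date) -> Iterator[Post]:
    """Yield posts dated on or after `cutoff`, then stop.

    Assumes `pages` is ordered newest first, both across pages and
    within each page. If `pages` is a lazy iterable (e.g. a generator
    that downloads one page per step), no page after the first
    too-old post is ever requested.
    """
    for page in pages:
        for posted_on, text in page:
            if posted_on < cutoff:
                # Everything from here on is older than the cutoff,
                # so the crawl can end.
                return
            yield posted_on, text
```

If the forum pins sticky threads at the top of a listing, the newest-first assumption no longer holds and those entries would need to be skipped before applying the cutoff.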

Philipp
