Question about combining unstructured data with structured data

Hi Lawson,

Is there a specific reason you want to use this two-step approach? And what are you doing for pre-processing, given that you still have 150k terms afterwards? Some frequency-based filtering might help: as a first step, remove terms that occur in only a few documents, or even in just a single one.
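
In case it is useful, here is a rough sketch of what I mean by frequency-based filtering. It is plain Python and assumes your documents are already tokenized into lists of terms. The function name `filter_rare_terms`, the `min_doc_freq` threshold and the toy documents are made up for illustration, so adapt them to your own pipeline.

```python
from collections import Counter


def filter_rare_terms(tokenized_docs, min_doc_freq=2):
    """Drop terms that appear in fewer than min_doc_freq documents.

    Hypothetical helper for illustration; not taken from any library.
    """
    # Document frequency: count each term at most once per document.
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))

    # Keep only the terms that occur in enough documents.
    vocabulary = {t for t, df in doc_freq.items() if df >= min_doc_freq}

    # Rewrite every document using the reduced vocabulary.
    filtered = [[t for t in tokens if t in vocabulary] for tokens in tokenized_docs]
    return filtered, vocabulary


# Toy data (invented): three tiny "documents".
docs = [
    ["invoice", "payment", "overdue"],
    ["payment", "reminder", "invoice"],
    ["shipment", "delayed", "payment"],
]

filtered_docs, vocab = filter_rare_terms(docs, min_doc_freq=2)
print(sorted(vocab))  # ['invoice', 'payment']
```

With 150k terms, even a low threshold such as 2 or 3 documents typically removes a large share of the vocabulary, since many terms are typos or one-off names. If you happen to work with scikit-learn, the `min_df` parameter of `CountVectorizer` applies the same kind of cut-off.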

As for your second question: do you expect the three features to have an impact? If so, you should also expect that impact to show up in the model, even though there are only three of them.

Cheers,
Roland