Different table structure in learner and predictor

sadegh_cpu · September 2, 2013, 8:29pm

I am facing a problem in my predictor.

The problem is that I have a different table structure for the learner and the predictor.

The learner has more attributes (columns), because some of the information is only generated in between. This example shows what I mean:

We want to predict the number of kids from information about the father and mother.

In our training data, there are also attributes like "date of first infant born" and "height of first infant born", and in general all the info about the family's first child, and we want to use this info for our predictor.

When we want to predict the number of children in a family, we obviously don't have any info about the first child.

In the decision tree predictor, I receive this error because of the problem described above:

Learning column "first child date of birth" not found in input data to be predicted

thor · September 2, 2013, 9:28pm

You need to filter the columns that are not available in the unclassified datasets before you feed the training set into the learner. It doesn't work otherwise.
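A minimal sketch of that filtering step in plain Python, outside KNIME, just to illustrate the idea. The column names (`father_age`, `mother_age`, `first_child_birth_year`, `num_kids`) and the helper functions are hypothetical, loosely based on the example above. In a workflow the same restriction would typically be done with a Column Filter node placed before the learner.

```python
# Hypothetical tables, represented as lists of dicts (one dict per row).
training_rows = [
    {"father_age": 41, "mother_age": 38, "first_child_birth_year": 2001, "num_kids": 3},
    {"father_age": 29, "mother_age": 27, "first_child_birth_year": 2012, "num_kids": 1},
]
rows_to_predict = [
    {"father_age": 35, "mother_age": 33},  # no first-child info available yet
]
TARGET = "num_kids"


def shared_feature_columns(train, predict, target):
    """Return the training columns (except the target) that exist in every row to predict."""
    available = set.intersection(*(set(row) for row in predict))
    return [col for col in train[0] if col != target and col in available]


def restrict(rows, columns):
    """Keep only the given columns in each row."""
    return [{col: row[col] for col in columns} for row in rows]


features = shared_feature_columns(training_rows, rows_to_predict, TARGET)

# The learner sees only the shared features plus the target column.
learner_input = restrict(training_rows, features + [TARGET])
# The predictor sees exactly the same feature columns, without the target.
predictor_input = restrict(rows_to_predict, features)

print(features)
```

With the tables restricted this way, the learner never sees a column that the predictor cannot be given later, so the "Learning column ... not found" situation does not arise.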

sadegh_cpu · September 2, 2013, 9:46pm

Isn't there any way I can use a predictor to predict and generate values for the unclassified dataset and then feed it to the predictor another time? I really need those columns because they contain a very important part of the database. Or maybe I can try a different method.

Aaron_Hart · September 4, 2013, 11:14am

Hi Sadegh,

You can keep the data in a separate branch, but for the learner you will need to restrict your columns.

Aaron

micheljanos · April 6, 2017, 4:28pm

Hi Aaron and Thor,

This doesn't make sense to me.

For example, suppose you train on a corpus of 1000 emails that gives, say, 10000 terms, and you want to classify a new email that has a word that is not in this set of 10000. Then the classifier will not work? Should it not just make the classification with the available words?
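To make the email case concrete, here is a rough sketch in plain Python, assuming a simple bag-of-words representation (the function names and sample emails are made up for illustration, and this is not how any particular KNIME node is implemented). The columns are fixed by the training vocabulary, so a new email always yields the same columns and an unseen word is simply not counted. That is different from the situation earlier in this thread, where whole columns are missing from the table to be predicted.

```python
from collections import Counter


def build_vocabulary(documents):
    """Collect the sorted set of terms seen in the training documents."""
    return sorted({term for doc in documents for term in doc.lower().split()})


def vectorize(document, vocabulary):
    """Count only vocabulary terms; words never seen in training are dropped."""
    counts = Counter(document.lower().split())
    return [counts[term] for term in vocabulary]


training_emails = ["cheap pills now", "meeting moved to monday"]
vocabulary = build_vocabulary(training_emails)

# "flights" does not occur in the training emails, so it has no column.
new_email = "cheap flights monday"
vector = vectorize(new_email, vocabulary)

# The new email still has exactly one value per training column.
print(len(vector) == len(vocabulary))
```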

cheers,

Michel