Hello everyone, I am trying to build a predictive model that identifies which leads in a database will convert into sales. However, I have not been able to get it to work well: the false negatives (FN) and false positives (FP) are very high, and in some cases the classification does not work at all. I tried normalization and an oversampling technique, and I also tried several algorithms, but none of them gave correct results. I would like to ask for help, since I don’t know how to fix this or what else to try…
A few comments:
1. Try using the AutoML verified component. It’s well tested and may help you understand whether you have model issues or whether the problem is the way you’re preparing your data.
2. The meaning of your data isn’t clear. For example, you’re using the Category to Number node to transform the AUX columns but then later normalizing these values, which implies they have some meaning beyond a simple category.
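To illustrate the second point outside of KNIME, here is a minimal Python sketch. The `aux` values are made up, and `integer_encode` / `one_hot_encode` are hypothetical helpers, not KNIME nodes. Integer codes put the categories in an arbitrary order, so normalizing them treats "web" as twice as far from "email" as "phone" is; one-hot columns avoid implying that order.

```python
def integer_encode(values):
    # Map each category to an arbitrary integer (alphabetical order here).
    mapping = {v: i for i, v in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


def one_hot_encode(values):
    # One 0/1 column per category, so no order or distance is implied.
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


# Hypothetical content of one AUX column
aux = ["web", "phone", "email", "web"]

codes, mapping = integer_encode(aux)
one_hot, categories = one_hot_encode(aux)

print(mapping)
print(codes)
print(categories)
print(one_hot)
```

If the AUX values really are ordered (for example a score), treating them as numbers and normalizing is fine; if they are just labels, the one-hot form is usually the safer choice.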
First of all, keep it simple. Don’t introduce too much complexity too soon.
Spend some time analysing your feature set. What is each feature’s (expected) relation, if any, to your target?
What is the meaning of the values? The numbers are now represented as strings: should you treat them as numbers or as strings?
Do some correlation checks, make plots.
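As a rough idea of what such a correlation check could look like in Python, the sketch below ranks numeric features by their Pearson correlation with a 0/1 target. The column names (`calls`, `noise`, `venta`) and the data are invented for illustration, and the helper functions are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    # Plain Pearson correlation; returns 0.0 if either column is constant.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features(rows, target):
    # Sort features by absolute correlation with the target, strongest first.
    features = [name for name in rows[0] if name != target]
    ys = [row[target] for row in rows]
    scores = {f: pearson([row[f] for row in rows], ys) for f in features}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Made-up leads: venta is 1 for a sale, 0 otherwise
rows = [
    {"calls": 1, "noise": 3, "venta": 0},
    {"calls": 2, "noise": 1, "venta": 0},
    {"calls": 5, "noise": 2, "venta": 1},
    {"calls": 6, "noise": 3, "venta": 1},
    {"calls": 1, "noise": 1, "venta": 0},
    {"calls": 7, "noise": 2, "venta": 1},
]

ranked = rank_features(rows, "venta")
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

A feature with a correlation near zero is not necessarily useless (the relation may be non-linear), which is why plots are worth making as well.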
Start with a relatively “simple” algorithm to find out whether there is any predictive power in your data. Set a baseline and improve from there.
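One possible way to set such a baseline, sketched in Python with invented labels: always predict the majority class and record the confusion counts. Any real model then has to beat these numbers. The `confusion` and `precision_recall` helpers are hypothetical, not part of any library.

```python
from collections import Counter


def confusion(y_true, y_pred, positive=1):
    # Count true/false positives and negatives for a binary target.
    pairs = list(zip(y_true, y_pred))
    return {
        "tp": sum(1 for t, p in pairs if t == positive and p == positive),
        "fp": sum(1 for t, p in pairs if t != positive and p == positive),
        "fn": sum(1 for t, p in pairs if t == positive and p != positive),
        "tn": sum(1 for t, p in pairs if t != positive and p != positive),
    }


def precision_recall(c):
    # Guard against division by zero when nothing is predicted positive.
    predicted_pos = c["tp"] + c["fp"]
    actual_pos = c["tp"] + c["fn"]
    precision = c["tp"] / predicted_pos if predicted_pos else 0.0
    recall = c["tp"] / actual_pos if actual_pos else 0.0
    return precision, recall


# Invented labels: 1 = sale, 0 = no sale (2 sales out of 10 leads)
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

majority = Counter(y_true).most_common(1)[0][0]
baseline_pred = [majority] * len(y_true)

baseline = confusion(y_true, baseline_pred)
accuracy = (baseline["tp"] + baseline["tn"]) / len(y_true)
precision, recall = precision_recall(baseline)

print(baseline, accuracy, precision, recall)
```

On unbalanced data this baseline already reaches a high accuracy while finding no sales at all, so compare models on FN/FP, precision and recall rather than on accuracy alone.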
Because your target (venta Y/N) is somewhat unbalanced, the model tends to overfit to the majority class. You can start by building a first model on a 50/50 distribution of your target. This will give an indication of whether a prediction is possible (but beware: it changes the distribution of your data, and therefore affects metrics like precision and recall).
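A minimal Python sketch of the 50/50 idea, assuming a 0/1 target column called `venta` and made-up rows: keep all rows of the smaller class and draw an equally sized random sample from the larger one. `downsample_to_balance` is a hypothetical helper.

```python
import random


def downsample_to_balance(rows, target, seed=0):
    # Keep an equal number of positive and negative rows (random undersampling).
    rng = random.Random(seed)
    pos = [r for r in rows if r[target] == 1]
    neg = [r for r in rows if r[target] == 0]
    n = min(len(pos), len(neg))
    balanced = rng.sample(pos, n) + rng.sample(neg, n)
    rng.shuffle(balanced)
    return balanced


# Made-up leads: every tenth lead is a sale, so 10 sales out of 100
rows = [{"id": i, "venta": 1 if i % 10 == 0 else 0} for i in range(100)]

balanced = downsample_to_balance(rows, "venta")
n_sales = sum(r["venta"] for r in balanced)

print(len(balanced), n_sales)
```

Balance only the training data. If the test set is balanced too, precision and recall will look different from what you would get on the real, unbalanced stream of leads.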