Hi everyone,
Does anyone have an example workflow for building a model and using it for random forest regression?
Take a look at this. Very simple. No attempt at optimization, but shows basic model structure.
The workflow description should have the target as sales price, not sales type.
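For readers who want to see the basic model structure outside of KNIME, here is a minimal sketch in plain scikit-learn of the same idea: fit a random forest regressor on a numeric target such as a sale price. The column names and data below are made up for illustration; the KNIME workflow's Random Forest Learner/Predictor nodes do the equivalent steps graphically.

# Minimal random forest regression sketch (illustrative data, not the workflow's dataset)
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(42)
n = 500
df = pd.DataFrame({
    "lot_area": rng.uniform(2000, 20000, n),
    "year_built": rng.integers(1950, 2010, n),
    "overall_quality": rng.integers(1, 10, n),
})
# synthetic numeric target standing in for "sales price"
df["sale_price"] = (
    5 * df["lot_area"]
    + 800 * (df["year_built"] - 1950)
    + 12000 * df["overall_quality"]
    + rng.normal(0, 10000, n)
)

X = df.drop(columns="sale_price")
y = df["sale_price"]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Learner step: train the forest; Predictor step: score the held-out rows
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)
print("MAE:", mean_absolute_error(y_test, pred))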
@arief_rama if you want to explore some more options you can take a look here:
@mlauber71 I’ve never been able to install vtreat. My Python skills are marginal at best. I’ve tried to find it on Anaconda with no luck. Any other hints? Thanks.
Hi @rfeigel … @mlauber71 …
Thank you … fyi, I’ll learn it first
I tried to pip install vtreat and it says it's installed, but when I try to run your workflow, the vtreat metanode says "No module named 'vtreat'".
(py39) C:\Users\genef>pip install vtreat
Collecting vtreat
Using cached vtreat-1.2.8-py3-none-any.whl (31 kB)
Requirement already satisfied: numpy in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from vtreat) (1.24.3)
Requirement already satisfied: pandas in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from vtreat) (2.0.2)
Requirement already satisfied: scipy in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from vtreat) (1.10.1)
Requirement already satisfied: scikit-learn in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from vtreat) (1.2.2)
Collecting data-algebra>=1.4.1 (from vtreat)
Using cached data_algebra-1.6.7-py3-none-any.whl (120 kB)
Collecting lark (from data-algebra>=1.4.1->vtreat)
Downloading lark-1.1.7-py3-none-any.whl (108 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 108.9/108.9 kB 3.2 MB/s eta 0:00:00
Requirement already satisfied: python-dateutil>=2.8.2 in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from pandas->vtreat) (2.8.2)
Requirement already satisfied: pytz>=2020.1 in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from pandas->vtreat) (2023.3)
Requirement already satisfied: tzdata>=2022.1 in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from pandas->vtreat) (2023.3)
Requirement already satisfied: joblib>=1.1.1 in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from scikit-learn->vtreat) (1.2.0)
Requirement already satisfied: threadpoolctl>=2.0.0 in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from scikit-learn->vtreat) (3.1.0)
Requirement already satisfied: six>=1.5 in c:\users\genef\anaconda3\envs\py39\lib\site-packages (from python-dateutil>=2.8.2->pandas->vtreat) (1.16.0)
Installing collected packages: lark, data-algebra, vtreat
Successfully installed data-algebra-1.6.7 lark-1.1.7 vtreat-1.2.8
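A "No module named 'vtreat'" error despite a successful pip install usually means KNIME is pointed at a different Python environment than the one pip installed into (here, the conda env py39). A quick sanity check, which can be run in any Python console or inside the Python node KNIME is configured to use, is to print which interpreter is active and whether vtreat is importable:

# Which interpreter is this, and can it see vtreat?
# Compare the printed path with the conda environment pip installed into.
import sys
print("Interpreter:", sys.executable)

try:
    import vtreat  # noqa: F401
    print("vtreat import: OK")
except ImportError as err:
    print("vtreat is not visible in this environment:", err)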
@rfeigel I can offer these two articles that might help you get a handle on Python and conda. They might be worth the time. The suggested YAML file has vtreat at the end.
For vtreat itself, see this article:
You will have to make sure Python is set up in KNIME and that the configuration of the Python node is right. You can use the standard Python environment or a dedicated one.
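Once the Python node points at an environment that has vtreat, the preparation step is roughly like the sketch below (assuming the vtreat 1.x NumericOutcomeTreatment API; the data and column names here are invented and are not the workflow's actual dataset):

# Rough sketch of a vtreat numeric-outcome preparation step in plain Python
import numpy as np
import pandas as pd
import vtreat

rng = np.random.default_rng(0)
n = 200
d = pd.DataFrame({
    "neighborhood": rng.choice(["north", "south", "east"], n),  # categorical input
    "living_area": rng.uniform(50, 250, n),                     # numeric input
})
d["sale_price"] = 1000 * d["living_area"] + rng.normal(0, 5000, n)

# Treatment plan for a numeric (regression) outcome
plan = vtreat.NumericOutcomeTreatment(outcome_name="sale_price")

# fit_transform on training data returns a numeric, model-ready frame
# that a random forest (or H2O AutoML) learner can consume directly
d_prepared = plan.fit_transform(d, d["sale_price"])
print(d_prepared.columns.tolist())

# later, apply the same plan to new data with plan.transform(new_data)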
Thanks. I’m trying to absorb this.
Update: I've had few problems manually creating or updating Python environments with Anaconda, but for some reason Anaconda won't install vtreat, and I had no idea how to use a YAML file. I used the conda install script you posted above and it worked perfectly. Thanks for the help. I'm still having problems running your regression workflow, but they're unrelated to the environment with vtreat. When I get time I'll try to resolve them and let you know how it works for me.
Here’s the error I’m getting:
@rfeigel for the H2O nodes you can select the version in the KNIME preferences, depending on the KNIME version you use. It should be 4.6 or 4.7, I would say.
That did the trick. Never occurred to me that the H2O version could be set in preferences. Kudos to you for this workflow. I couldn’t have done it in a lifetime.
What about the predictions of the AutoML Flow? Did it generate good results, and what was predicted? Just curious.
br
And what did you predict?
br