Hi, I am exploring KNIME for personal use, and I want to know if a model I built in KNIME can be rewritten or converted to another format (like Parquet, MLflow, etc.) so that I can store it somewhere other than my local machine (like HDFS).
Many formats and conversions are either directly integrated or can be added through extensions.
Well, you are telling me about data table formats, but I am asking whether my model/workflow can be converted or rewritten to some format so that I can store the model somewhere other than my local machine.
So what is your desired output format?
(I just gave you search examples for your keywords, but you can search for your desired formats on the KNIME Hub/NodePit or in the extensions.)
And where do you want to write it to?
You can use the connection nodes to write or copy files to many different storage sites.
You can store your model configuration in PMML format; see the KNIME Hub for some examples: https://hub.knime.com/search?q=pmml .
If I store my model in HDFS, in which format will it get saved?
I am using KNIME to learn more about end-to-end model lifecycle management. I have built a practice model in KNIME and am trying to deploy it by storing it somewhere else (like HDFS), as I don’t have access to KNIME Server. Please tell me if this is possible in any way.
Use the HDFS Connection node to connect to your HDFS environment.
Create a temp dir or use a temporary local folder.
Use the Model Writer node to write your model to the temp folder.
Use the upload node to upload the resulting model file to HDFS.
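If you ever want to script that last upload step outside of KNIME, one option is HDFS's WebHDFS REST API. The sketch below only builds the request URL and hedges the rest in comments: the host name, port, paths, and user are placeholder assumptions, not anything KNIME produces.

```python
# Sketch: uploading a saved model file to HDFS via the WebHDFS REST API.
# Host, port, HDFS path, and user below are placeholder assumptions.

def webhdfs_create_url(host: str, port: int, hdfs_path: str, user: str) -> str:
    """Build the WebHDFS URL for a CREATE (file upload) request."""
    return (f"http://{host}:{port}/webhdfs/v1{hdfs_path}"
            f"?op=CREATE&user.name={user}&overwrite=true")

url = webhdfs_create_url("namenode.example.com", 9870,
                         "/models/my_model.zip", "knime")
print(url)

# The actual upload is a PUT request against this URL; WebHDFS first
# replies with a redirect to a datanode, which then accepts the file
# body. Omitted here because it needs a running cluster, e.g.:
#
# import urllib.request
# req = urllib.request.Request(url, method="PUT",
#                              data=open("my_model.zip", "rb").read())
# urllib.request.urlopen(req)
```

The two-step redirect (NameNode, then DataNode) is part of the WebHDFS protocol, which is why the upload body is only sent on the second request.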
An example of how to use H2O models in a big data environment is here.
It should be possible to use the stored MOJO model containers in other environments (think PySpark), although I have not tested this widely (I prefer to use KNIME workflows like the one in the example).
Concerning a production pipeline of models in a big data environment, you might be interested in this example:
Can I use an AWS server for deploying my KNIME model?
Maybe this blog post helps:
Welcome to the KNIME Community!
I don’t have access to KNIME Server. Can I use an AWS server for deploying a KNIME workflow?
Why would you want to run that on AWS (vs. a local server)? You can run KNIME Server on AWS. If you don’t have KNIME Server and want to “deploy” a workflow, the only thing you can do is run it in batch mode with scheduled execution (Windows Task Scheduler or a cron job), but with no user input.
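The batch-mode-plus-cron route can be sketched as a crontab fragment. The installation path and workflow directory are placeholder assumptions; the batch application ID is KNIME's documented one.

```shell
# Crontab config sketch: run a KNIME workflow headlessly every night at 02:00.
# /opt/knime and the workflow path are placeholders for your own setup.
0 2 * * * /opt/knime/knime -nosplash -consoleLog -reset \
    -application org.knime.product.KNIME_BATCH_APPLICATION \
    -workflowDir="/home/user/knime-workspace/my_workflow"
```

Note that each new workflow gets its own crontab line, which is exactly the per-model maintenance burden mentioned below.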
The question is whether you want to deploy a workflow or a model. If you don’t want to deploy with KNIME, use one of the non-KNIME-specific model libraries, say from the H2O integration, the Python integration (scikit-learn), or the XGBoost integration. (If you write the model to disk with the Model Writer node, you will find the core XGBoost model in the zip file, which you can load into a pure Python-based deployment of the model, say a web app.)
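The "core model inside the zip" idea can be sketched with the Python standard library. The member name `model.bin` below is an assumption purely for illustration; inspect your own Model Writer output with `namelist()` to find the real artifact name.

```python
import io
import zipfile

# Sketch: pulling the core model artifact out of a Model Writer zip so a
# pure-Python deployment (e.g. a web app) can load it. The member name
# "model.bin" is an assumption; list the zip's contents first.

def extract_model(zip_bytes: bytes, member: str = "model.bin") -> bytes:
    """Return the raw bytes of one member from a model zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        print("zip contains:", zf.namelist())  # inspect before extracting
        return zf.read(member)

# Stand-in for the file the Model Writer node would produce:
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("model.bin", b"\x00serialized-booster\x00")

core = extract_model(buf.getvalue())
```

In a real deployment you would then hand those bytes to the model library itself, e.g. write them to disk and load them with XGBoost's `Booster` in your web app.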
Having said that, these other options will quickly reach the cost of KNIME Server anyway: managing cron jobs for each new model, writing a custom web app for each new model, etc.
Simply put, it’s not simple to deploy models, and it will have a significant cost attached, either directly monetary (KNIME Server) or in personnel cost (programming and maintaining web apps, etc.). If you want to do it right, there is no cheap solution. I actually think that deployment is often where the hype ends and the reality sets in.