Issues Using Hugging Face Models with KNIME AI Extension

Hello @benfin,

As the error states, models above 10 GB are not supported by the Inference API.
Since this affects most useful LLMs, and we used some of them through the API in the past, I assume this is a newer limitation.
This is pretty understandable because larger models require more resources and somebody has to pay for those.
The new Inference Providers are one way to access the necessary resources. There is a monthly free allowance, but it is very limited, and you will have to pay for any consumption beyond it.
At the moment we don’t have dedicated support for HF Inference Providers, but since they expose an OpenAI-compatible API, you should be able to use the OpenAI nodes to connect to them.
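Outside of KNIME, you can sanity-check the same OpenAI-compatible route from a short script. This is a minimal sketch, assuming the Inference Providers router base URL `https://router.huggingface.co/v1` and an example model ID; the token and model are placeholders, and the request is only built here, not sent:

```python
import json
import urllib.request

# Assumed base URL of Hugging Face's OpenAI-compatible router (check the
# Inference Providers docs for the current value).
BASE_URL = "https://router.huggingface.co/v1"
HF_TOKEN = "hf_..."  # placeholder: your Hugging Face access token

# Standard OpenAI chat-completions payload; the model ID is an example.
payload = {
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 64,
}

request = urllib.request.Request(
    url=f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(request) would send it, given a valid token.
```

In KNIME, the equivalent is to point the OpenAI Authenticator / Connector nodes at that base URL and use your HF token as the API key.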

Another way to access a model is through a dedicated HF Inference Endpoint, which can be deployed for many models directly from the model page.
Make sure to select Hugging Face Text Generation Inference as the container type.
Then you can use the HF TGI nodes to connect to the model.
However, this is also a paid feature, as you pay for the resources of the container serving the model.
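For reference, this is roughly what the HF TGI nodes do under the hood: they call the endpoint's `/generate` route with an `inputs`/`parameters` payload. A minimal sketch, where the endpoint URL is a placeholder for the unique URL shown on your Inference Endpoint's page and the request is only built, not sent:

```python
import json
import urllib.request

# Placeholder: copy the real URL from your deployed Inference Endpoint.
ENDPOINT_URL = "https://my-endpoint.example.huggingface.cloud"
HF_TOKEN = "hf_..."  # placeholder: your Hugging Face access token

# TGI's /generate route takes the prompt under "inputs" and generation
# settings under "parameters".
payload = {
    "inputs": "What is KNIME?",
    "parameters": {"max_new_tokens": 64, "temperature": 0.7},
}

request = urllib.request.Request(
    url=f"{ENDPOINT_URL}/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {HF_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# urllib.request.urlopen(request) would send it, given a running endpoint.
```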

Best regards,
Adrian
